haikuwebkit/LayoutTests/js/invalid-utf8-in-syntax-erro...

11 lines
382 B
Plaintext

JS parser incorrectly handles invalid utf8 in error messages.
https://bugs.webkit.org/show_bug.cgi?id=158128

Reviewed by Saam Barati.

Source/JavaScriptCore:

The bug here was caused by us using PrintStream's toString method to produce the error message for a parse error, even though toString may produce a null string when the input contains invalid utf8, which is what caused the error in the first place. So when we try to create an error message containing the invalid character code, we set m_errorMessage to the null string. Because a null string signals "no error", we don't stop parsing, and everything goes downhill from there.

Now we use the new toStringWithLatin1Fallback so that we can always produce an error message, even if it contains invalid unicode. We also add an additional fallback so that we can guarantee an error message is set even if we're given a null string. There's a debug mode assertion to prevent anyone accidentally attempting to clear the message via setErrorMessage.

* parser/Parser.cpp:
(JSC::Parser<LexerType>::logError):
* parser/Parser.h:
(JSC::Parser::setErrorMessage):

Source/WTF:

Add a new toStringWithLatin1Fallback that simply uses String::fromUTF8WithLatin1Fallback, so we can avoid the standard String::fromUTF8 null return.

* wtf/StringPrintStream.cpp:
(WTF::StringPrintStream::toStringWithLatin1Fallback):
* wtf/StringPrintStream.h:

LayoutTests:

Add a testcase.

* js/invalid-utf8-in-syntax-error-expected.txt: Added.
* js/script-tests/invalid-utf8-in-syntax-error.js: Added.

Canonical link: https://commits.webkit.org/176410@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@201624 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2016-06-02 23:07:48 +00:00
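The failure mode described in this commit message can be sketched outside WebKit. The following is a minimal Python illustration, not the JavaScriptCore implementation: `from_utf8_strict`, `from_utf8_with_latin1_fallback`, `ParserSketch`, and the placeholder message text are hypothetical names chosen to mirror the roles of String::fromUTF8, String::fromUTF8WithLatin1Fallback, the parser, and the additional fallback, under the assumption that a null string corresponds to Python's `None`.

```python
from typing import Optional


def from_utf8_strict(data: bytes) -> Optional[str]:
    """Hypothetical strict decoder: returns None (the "null string") on invalid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


def from_utf8_with_latin1_fallback(data: bytes) -> str:
    """Hypothetical fallback decoder: reinterpret the bytes as Latin-1 if UTF-8 fails.

    Latin-1 maps every byte to a code point, so this never returns None.
    """
    decoded = from_utf8_strict(data)
    return decoded if decoded is not None else data.decode("latin-1")


class ParserSketch:
    """Toy parser state where a missing message means "no error"."""

    def __init__(self) -> None:
        self.error_message: Optional[str] = None

    def set_error_message(self, message: Optional[str]) -> None:
        # Additional fallback: never let an empty or missing message
        # silently clear the error state. The text is a placeholder.
        self.error_message = message if message else "Unparseable script"

    def log_error(self, raw_message: bytes) -> None:
        self.set_error_message(from_utf8_with_latin1_fallback(raw_message))

    def has_error(self) -> bool:
        return self.error_message is not None


parser = ParserSketch()
parser.log_error(b'Unexpected string literal "\xff"')
print(parser.has_error(), repr(parser.error_message))
```

With the strict decoder alone, the byte 0xFF would yield `None` and the sketch's parser would look error-free; the Latin-1 fallback keeps the message, and the guard in `set_error_message` covers the case where a caller still passes nothing.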
Ensures that we correctly propagate the error message for lexer errors containing invalid utf8 code sequences
On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE".
WebKit has too much of its own UTF-8 code and should rely more on ICU's UTF-8 support
https://bugs.webkit.org/show_bug.cgi?id=195535

Patch by Darin Adler <darin@apple.com> on 2019-05-01
Reviewed by Alexey Proskuryakov.

LayoutTests/imported/w3c:

* web-platform-tests/encoding/textdecoder-utf16-surrogates-expected.txt: Updated expected results to have the Unicode replacement character in cases where the text contains unpaired surrogates. The tests are still doing the same operations, and still getting the same results, but the text output no longer includes illegal UTF-8.

Source/JavaScriptCore:

* API/JSClassRef.cpp: Removed unneeded include of UTF8Conversion.h.
* API/JSStringRef.cpp:
(JSStringCreateWithUTF8CString): Updated for changes to convertUTF8ToUTF16.
(JSStringGetUTF8CString): Updated for changes to convertLatin1ToUTF8. Removed unneeded "true" to get the strict version of convertUTF16ToUTF8, since that is the default. Also updated for changes to ConversionResult.
* runtime/JSGlobalObjectFunctions.cpp:
(JSC::decode): Stop using UTF8SequenceLength, and instead use U8_COUNT_TRAIL_BYTES and U8_MAX_LENGTH. Instead of decodeUTF8Sequence, use U8_NEXT. Also use U_IS_BMP, U_IS_SUPPLEMENTARY, U16_LEAD, U16_TRAIL, and U_IS_SURROGATE instead of our own equivalents, since these macros from ICU are correct and efficient.
* wasm/WasmParser.h:
(JSC::Wasm::Parser<SuccessType>::consumeUTF8String): Updated for changes to convertUTF8ToUTF16.

Source/WebCore:

* platform/SharedBuffer.cpp:
(WebCore::utf8Buffer): Removed unnecessary "strict" argument to convertUTF16ToUTF8 since that is the default behavior. Also updated for changes to return values.
* xml/XSLTProcessorLibxslt.cpp:
(WebCore::writeToStringBuilder): Removed unnecessary use of StringBuffer for a temporary buffer for characters. Rewrote to use U8_NEXT and U16_APPEND directly.
* xml/parser/XMLDocumentParserLibxml2.cpp:
(WebCore::convertUTF16EntityToUTF8): Updated for changes to ConversionResult.

Source/WebKit:

* Shared/API/APIString.h: Removed unneeded includes and also switched to #pragma once.
* Shared/API/c/WKString.cpp: Moved include of UTF8Conversion.h here.
(WKStringGetUTF8CStringImpl): Updated for changes to return values.

Source/WTF:

* wtf/text/AtomicString.cpp:
(WTF::AtomicString::fromUTF8Internal): Added code to compute string length when the end is nullptr; this behavior used to be implemented inside the calculateStringHashAndLengthFromUTF8MaskingTop8Bits function.
* wtf/text/AtomicStringImpl.cpp:
(WTF::HashAndUTF8CharactersTranslator::translate): Updated for change to convertUTF8ToUTF16.
* wtf/text/AtomicStringImpl.h: Took the WTF_EXPORT_PRIVATE off of the AtomicStringImpl::addUTF8 function. This is used only inside a non-inlined function in the AtomicString class and its behavior changed subtly in this patch; it's helpful to document that it's not exported.
* wtf/text/StringImpl.cpp:
(WTF::StringImpl::utf8Impl): Don't pass "true" for strictness to convertUTF16ToUTF8 since strict is the default. Also updated for changes to ConversionResult.
(WTF::StringImpl::utf8ForCharacters): Updated for change to convertLatin1ToUTF8.
(WTF::StringImpl::tryGetUtf8ForRange const): Ditto.
* wtf/text/StringView.cpp: Removed unneeded include of UTF8Conversion.h.
* wtf/text/WTFString.cpp:
(WTF::String::fromUTF8): Updated for change to convertUTF8ToUTF16.
* wtf/unicode/UTF8Conversion.cpp:
(WTF::Unicode::inlineUTF8SequenceLengthNonASCII): Deleted.
(WTF::Unicode::inlineUTF8SequenceLength): Deleted.
(WTF::Unicode::UTF8SequenceLength): Deleted.
(WTF::Unicode::decodeUTF8Sequence): Deleted.
(WTF::Unicode::convertLatin1ToUTF8): Use U8_APPEND, enabling us to remove almost everything in the function. Also changed return value to be a boolean to indicate success since there is only one possible failure (target exhausted). There is room for further simplification, since most callers have lengths rather than end pointers for the source buffer, and all but one caller supplies a buffer size known to be sufficient, so those don't need a return value, nor do they need to pass an end of buffer pointer.
(WTF::Unicode::convertUTF16ToUTF8): Use U_IS_LEAD, U_IS_TRAIL, U16_GET_SUPPLEMENTARY, U_IS_SURROGATE, and U8_APPEND. Also changed behavior for non-strict mode so that unpaired surrogates will be turned into the replacement character instead of invalid UTF-8 sequences, because U8_APPEND won't create an invalid UTF-8 sequence, and because we don't need to do that for any good reason at any call site.
(WTF::Unicode::isLegalUTF8): Deleted.
(WTF::Unicode::readUTF8Sequence): Deleted.
(WTF::Unicode::convertUTF8ToUTF16): Use U8_NEXT instead of inlineUTF8SequenceLength, isLegalUTF8, and readUTF8Sequence. Use U16_APPEND instead of lots of code that does the same thing. There is room for further simplification since most callers don't need the "all ASCII" feature and could probably pass the arguments in a more natural way.
(WTF::Unicode::calculateStringHashAndLengthFromUTF8MaskingTop8Bits): Use U8_NEXT instead of isLegalUTF8, readUTF8Sequence, and various error handling checks for things that are handled by U8_NEXT. Also removed support for passing nullptr for end to specify a null-terminated string.
(WTF::Unicode::equalUTF16WithUTF8): Ditto.
* wtf/unicode/UTF8Conversion.h: Removed UTF8SequenceLength and decodeUTF8Sequence. Changed the ConversionResult to match WebKit coding style, with an eye toward perhaps removing it in the future. Changed the convertUTF8ToUTF16 return value to a boolean and removed the "strict" argument since no caller was passing false. Changed the convertLatin1ToUTF8 return value to a boolean. Tweaked comments.

LayoutTests:

* css3/escape-dom-api-expected.txt:
* fast/text/dangling-surrogates-expected.txt:
* js/dom/webidl-type-mapping-expected.txt:
* js/invalid-utf8-in-syntax-error-expected.txt: Updated expected results to have the Unicode replacement character in cases where the text contains unpaired surrogates. The tests are still doing the same operations, and still getting the same results, but the text output no longer includes illegal UTF-8.
* js/invalid-utf8-in-syntax-error.html: Added. Before adding this, the test was run, but unlike the rest of the tests in this directory, was only run as part of run-javascriptcore-tests. There are two reasons for adding this. One is to be consistent with the rest of the tests here and run a second time as part of the broader WebKit tests. The second is that we can now use "--reset-results" to generate new expected results, something that run-webkit-tests has but run-javascriptcore-tests does not have.

Canonical link: https://commits.webkit.org/211641@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@244828 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2019-05-01 17:33:03 +00:00
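The non-strict convertUTF16ToUTF8 behavior described in this commit message (unpaired surrogates become the replacement character rather than invalid UTF-8) can be illustrated with a short Python sketch. This is an assumption-laden illustration and not the WTF code: `utf16_to_utf8_lenient` is a hypothetical name, and the surrogate checks are written out by hand in place of the ICU macros the commit mentions.

```python
from typing import Sequence

REPLACEMENT_CHARACTER = 0xFFFD


def utf16_to_utf8_lenient(units: Sequence[int]) -> bytes:
    """Convert UTF-16 code units to UTF-8, replacing unpaired surrogates with U+FFFD."""
    code_points = []
    i = 0
    while i < len(units):
        unit = units[i]
        is_lead = 0xD800 <= unit <= 0xDBFF
        has_trail = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if is_lead and has_trail:
            # Valid pair: combine lead and trail into one supplementary code point.
            code_points.append(0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00))
            i += 2
        elif 0xD800 <= unit <= 0xDFFF:
            # Unpaired surrogate: emit the replacement character instead of
            # an illegal UTF-8 sequence.
            code_points.append(REPLACEMENT_CHARACTER)
            i += 1
        else:
            code_points.append(unit)
            i += 1
    return "".join(chr(cp) for cp in code_points).encode("utf-8")


print(utf16_to_utf8_lenient([0x61, 0xD800, 0x62]))
```

The output is always well-formed UTF-8, which is why expected-results files touched by this change show the replacement character where they previously contained illegal byte sequences.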
PASS ({f("<22>")}) threw exception SyntaxError: Unexpected string literal "<22>". Expected a parameter pattern or a ')' in parameter list..
PASS successfullyParsed is true
TEST COMPLETE