haikuwebkit/LayoutTests/js/unicode-escape-sequences-ex...

[ES6] Implement Unicode code point escapes https://bugs.webkit.org/show_bug.cgi?id=144377 Reviewed by Antti Koivisto. Source/JavaScriptCore: * parser/Lexer.cpp: Moved the UnicodeHexValue class in here from the header. Made it a non-member class so it doesn't need to be part of a template. Made it use UChar32 instead of int for the value to make it clearer what goes into this class. (JSC::ParsedUnicodeEscapeValue::isIncomplete): Added. Replaces the old type() function. (JSC::Lexer<CharacterType>::parseUnicodeEscape): Renamed from parseFourDigitUnicodeHex and added support for code point escapes. (JSC::isLatin1): Added an overload for UChar32. (JSC::isIdentStart): Changed this to take UChar32; no caller tries to call it with a UChar, so no need to overload for that type for now. (JSC::isNonLatin1IdentPart): Changed argument type to UChar32 for clarity. Also added FIXME about a subtle ES6 change that we might want to make later. (JSC::isIdentPart): Changed this to take UChar32; no caller tries to call it with a UChar, so no need to overload for that type for now. (JSC::isIdentPartIncludingEscapeTemplate): Made this a template so that we don't need to repeat the code twice. Added code to handle code point escapes. (JSC::isIdentPartIncludingEscape): Call the template instead of having the code in line. (JSC::Lexer<CharacterType>::recordUnicodeCodePoint): Added. (JSC::Lexer<CharacterType>::parseIdentifierSlowCase): Made small tweaks and updated to call parseUnicodeEscape instead of parseFourDigitUnicodeHex. (JSC::Lexer<CharacterType>::parseComplexEscape): Call parseUnicodeEscape instead of parseFourDigitUnicodeHex. Move the code to handle "\u" before the code that handles the escapes, since the code point escape code now consumes characters while parsing rather than peeking ahead. Test case covers this: Symptom would be that "\u{" would evaluate to "u" instead of giving a syntax error. * parser/Lexer.h: Updated for above changes. 
* runtime/StringConstructor.cpp: (JSC::stringFromCodePoint): Use ICU's UCHAR_MAX_VALUE instead of writing out 0x10FFFF; clearer this way. Source/WebCore: Test: js/unicode-escape-sequences.html * css/CSSParser.cpp: (WebCore::CSSParser::parseEscape): Use ICU's UCHAR_MAX_VALUE instead of writing out 0x10FFFF; clearer this way. Also use our replacementCharacter instead of writing out 0xFFFD. * html/parser/HTMLEntityParser.cpp: (WebCore::isAlphaNumeric): Deleted. (WebCore::HTMLEntityParser::legalEntityFor): Use ICU's UCHAR_MAX_VALUE and U_IS_SURROGATE instead of writing the code out. Didn't use U_IS_UNICODE_CHAR because that also includes U_IS_UNICODE_NONCHAR and thus would change behavior, but maye it's something we want to do in the future. (WebCore::HTMLEntityParser::consumeNamedEntity): Use isASCIIAlphanumeric instead of a the function in this file that does the same thing less efficiently. * html/parser/InputStreamPreprocessor.h: (WebCore::InputStreamPreprocessor::processNextInputCharacter): Use replacementCharacter from CharacterNames.h instead of writing out 0xFFFd. * xml/parser/CharacterReferenceParserInlines.h: (WebCore::consumeCharacterReference): Use ICU's UCHAR_MAX_VALUE instead of defining our own local highestValidCharacter constant. LayoutTests: * js/script-tests/unicode-escape-sequences.js: Added. * js/unicode-escape-sequences-expected.txt: Added. * js/unicode-escape-sequences.html: Added. Generated with make-script-test-wrappers. Canonical link: https://commits.webkit.org/162396@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@183552 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2015-04-29 16:33:12 +00:00
Test of Unicode escape sequences in string literals and identifiers, especially code point escape sequences.
On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE".
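The results below compare the output of a `codeUnits` helper against comma-separated, four-digit uppercase hex values, one per UTF-16 code unit. The helper itself is defined in the test script and is not shown here, so the following is only a minimal sketch of what such a helper might look like, inferred from the expected strings. The `surrogatePair` function is a hypothetical addition that illustrates the standard UTF-16 arithmetic behind results such as "D834,DF06".

```javascript
// Hypothetical reconstruction: the real codeUnits() lives in
// js/script-tests/unicode-escape-sequences.js and may differ in detail.
function codeUnits(string) {
    const units = [];
    for (let i = 0; i < string.length; ++i) {
        // charCodeAt() returns a single UTF-16 code unit (0..0xFFFF) per index,
        // so a code point above U+FFFF contributes two entries.
        const hex = string.charCodeAt(i).toString(16).toUpperCase();
        units.push(hex.padStart(4, "0"));
    }
    return units.join(",");
}

// Illustrative only: standard UTF-16 encoding of a code point above U+FFFF.
function surrogatePair(codePoint) {
    const offset = codePoint - 0x10000;     // 20-bit value
    const high = 0xD800 + (offset >> 10);   // top 10 bits
    const low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
    return [high, low];
}

console.log(codeUnits("\u{41}"));     // "0041"
console.log(codeUnits("\u{1D306}"));  // "D834,DF06"
console.log(surrogatePair(0x1D306).map((u) => u.toString(16).toUpperCase()).join(","));
```

Under this sketch, a lone surrogate escape such as "\u{D800}" in a string literal yields the single unit "D800", which matches the expectations listed below.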
PASS codeUnits("\u{0}") is "0000"
PASS codeUnits("\u{41}") is "0041"
PASS codeUnits("\u{D800}") is "D800"
PASS codeUnits("\u{d800}") is "D800"
PASS codeUnits("\u{DC00}") is "DC00"
PASS codeUnits("\u{dc00}") is "DC00"
PASS codeUnits("\u{FFFF}") is "FFFF"
PASS codeUnits("\u{ffff}") is "FFFF"
PASS codeUnits("\u{10000}") is "D800,DC00"
PASS codeUnits("\u{10001}") is "D800,DC01"
PASS codeUnits("\u{102C0}") is "D800,DEC0"
PASS codeUnits("\u{102c0}") is "D800,DEC0"
PASS codeUnits("\u{1D306}") is "D834,DF06"
PASS codeUnits("\u{1d306}") is "D834,DF06"
PASS codeUnits("\u{10FFFE}") is "DBFF,DFFE"
PASS codeUnits("\u{10fffe}") is "DBFF,DFFE"
PASS codeUnits("\u{10FFFF}") is "DBFF,DFFF"
PASS codeUnits("\u{10ffff}") is "DBFF,DFFF"
PASS codeUnits("\u{00000000000000000000000010FFFF}") is "DBFF,DFFF"
PASS codeUnits("\u{00000000000000000000000010ffff}") is "DBFF,DFFF"
PASS codeUnits("\u") threw exception SyntaxError: \u can only be followed by a Unicode character sequence.
PASS codeUnits("\ux") threw exception SyntaxError: \u can only be followed by a Unicode character sequence.
PASS codeUnits("\u{") threw exception SyntaxError: \u can only be followed by a Unicode character sequence.
PASS codeUnits("\u{}") threw exception SyntaxError: \u can only be followed by a Unicode character sequence.
PASS codeUnits("\u{G}") threw exception SyntaxError: \u can only be followed by a Unicode character sequence.
PASS codeUnits("\u{1G}") threw exception SyntaxError: \u can only be followed by a Unicode character sequence.
PASS codeUnits("\u{110000}") threw exception SyntaxError: \u can only be followed by a Unicode character sequence.
PASS codeUnits("\u{1000000}") threw exception SyntaxError: \u can only be followed by a Unicode character sequence.
PASS codeUnits("\u{100000000000000000000000}") threw exception SyntaxError: \u can only be followed by a Unicode character sequence.
PASS codeUnits(function \u{41}(){}.name) is "0041"
PASS codeUnits(function \u{102C0}(){}.name) is "D800,DEC0"
PASS codeUnits(function \u{102c0}(){}.name) is "D800,DEC0"
JavaScript identifier grammar supports unescaped astral symbols, but JSC doesn’t https://bugs.webkit.org/show_bug.cgi?id=208998 Reviewed by Michael Saboff. JSTests: * stress/unicode-identifiers-with-surrogate-pairs.js: Added. (let.c.of.chars.eval.foo): (throwsSyntaxError): (let.c.of.continueChars.throwsSyntaxError.foo): Source/JavaScriptCore: This patch fixes a bug in the parser that allows for surrogate pairs when parsing identifiers. It also makes a few other changes to the parser: 1) When looking for keywords we just need to check that subsequent character cannot be a identifier part or an escape start. 2) The only time we call parseIdentifierSlowCase is when we hit an escape start or a surrogate pair so we can optimize that to just copy everything up slow character into our buffer. 3) We shouldn't allow for asking if a UChar is an identifier start/part. * KeywordLookupGenerator.py: (Trie.printSubTreeAsC): (Trie.printAsC): * parser/Lexer.cpp: (JSC::isNonLatin1IdentStart): (JSC::isIdentStart): (JSC::isSingleCharacterIdentStart): (JSC::cannotBeIdentStart): (JSC::isIdentPart): (JSC::isSingleCharacterIdentPart): (JSC::cannotBeIdentPartOrEscapeStart): (JSC::Lexer<LChar>::currentCodePoint const): (JSC::Lexer<UChar>::currentCodePoint const): (JSC::Lexer<LChar>::parseIdentifier): (JSC::Lexer<UChar>::parseIdentifier): (JSC::Lexer<CharacterType>::parseIdentifierSlowCase): (JSC::Lexer<T>::lexWithoutClearingLineTerminator): (JSC::Lexer<T>::scanRegExp): (JSC::isIdentPartIncludingEscapeTemplate): Deleted. (JSC::isIdentPartIncludingEscape): Deleted. * parser/Lexer.h: (JSC::Lexer::setOffsetFromSourcePtr): Deleted. * parser/Parser.cpp: (JSC::Parser<LexerType>::printUnexpectedTokenText): * parser/ParserTokens.h: Source/WTF: * wtf/text/WTFString.cpp: (WTF::String::fromCodePoint): * wtf/text/WTFString.h: LayoutTests: Fix broken test that asserted a non-ID_START codepoint was a start codepoint and an ID_START codepoint was not a valid codepoint... 
* js/script-tests/unicode-escape-sequences.js: * js/unicode-escape-sequences-expected.txt: Canonical link: https://commits.webkit.org/222071@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@258531 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2020-03-17 00:12:17 +00:00
PASS codeUnits(function \u{10000}(){}.name) is "D800,DC00"
PASS codeUnits(function \u{10001}(){}.name) is "D800,DC01"
PASS codeUnits(function \u(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u'.
PASS codeUnits(function \u{0}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{0}'.
PASS codeUnits(function \u{D800}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{D800}'.
PASS codeUnits(function \u{d800}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{d800}'.
PASS codeUnits(function \u{DC00}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{DC00}'.
PASS codeUnits(function \u{dc00}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{dc00}'.
PASS codeUnits(function \u{FFFF}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{FFFF}'.
PASS codeUnits(function \u{ffff}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{ffff}'.
PASS codeUnits(function \u{10FFFE}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{10FFFE}'.
PASS codeUnits(function \u{10fffe}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{10fffe}'.
PASS codeUnits(function \u{10FFFF}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{10FFFF}'.
PASS codeUnits(function \u{10ffff}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{10ffff}'.
PASS codeUnits(function \u{00000000000000000000000010FFFF}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{00000000000000000000000010FFFF}'.
PASS codeUnits(function \u{00000000000000000000000010ffff}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{00000000000000000000000010ffff}'.
PASS codeUnits(function \u{1D306}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{1D306}'.
PASS codeUnits(function \u{1d306}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{1d306}'.
PASS codeUnits(function \ux(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u'.
PASS codeUnits(function \u{(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{'.
PASS codeUnits(function \u{}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{'.
PASS codeUnits(function \u{G}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{'.
PASS codeUnits(function \u{1G}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{1'.
Lift template escape sequence restrictions in tagged templates https://bugs.webkit.org/show_bug.cgi?id=166871 Reviewed by Saam Barati. JSTests: Update the error messages and add new tests. * ChakraCore/test/es6/unicode_6_identifier_Blue524737.baseline-jsc: * stress/lift-template-literal.js: Added. (dump): (testTag.return.tag): (testTag): * stress/template-literal-syntax.js: Source/JavaScriptCore: This patch implements stage 3 Lifting Template Literal Restriction[1]. Prior to this patch, template literal becomes syntax error if it contains invalid escape sequences. But it is too restricted; Template literal can have cooked and raw representations and only cooked representation can escape sequences. So even if invalid escape sequences are included, the raw representation can be valid. Lifting Template Literal Restriction relaxes the above restriction. When invalid escape sequence is included, if target template literals are used as tagged templates, we make the result of the template including the invalid escape sequence `undefined` instead of making it SyntaxError immediately. It allows us to accept the templates including invalid escape sequences in the raw representations in tagged templates. On the other hand, the raw representation is only used in tagged templates. So if invalid escape sequences are included in the usual template literals, we just make it SyntaxError as before. [1]: https://github.com/tc39/proposal-template-literal-revision * bytecompiler/BytecodeGenerator.cpp: (JSC::BytecodeGenerator::emitGetTemplateObject): * bytecompiler/NodesCodegen.cpp: (JSC::TemplateStringNode::emitBytecode): (JSC::TemplateLiteralNode::emitBytecode): * parser/ASTBuilder.h: (JSC::ASTBuilder::createTemplateString): * parser/Lexer.cpp: (JSC::Lexer<CharacterType>::parseUnicodeEscape): (JSC::Lexer<T>::parseTemplateLiteral): (JSC::Lexer<T>::lex): (JSC::Lexer<T>::scanTemplateString): (JSC::Lexer<T>::scanTrailingTemplateString): Deleted. 
* parser/Lexer.h: * parser/NodeConstructors.h: (JSC::TemplateStringNode::TemplateStringNode): * parser/Nodes.h: (JSC::TemplateStringNode::cooked): (JSC::TemplateStringNode::raw): * parser/Parser.cpp: (JSC::Parser<LexerType>::parseAssignmentElement): (JSC::Parser<LexerType>::parseTemplateString): (JSC::Parser<LexerType>::parseTemplateLiteral): (JSC::Parser<LexerType>::parsePrimaryExpression): (JSC::Parser<LexerType>::parseMemberExpression): * parser/ParserTokens.h: * parser/SyntaxChecker.h: (JSC::SyntaxChecker::createTemplateString): * runtime/TemplateRegistry.cpp: (JSC::TemplateRegistry::getTemplateObject): * runtime/TemplateRegistryKey.h: (JSC::TemplateRegistryKey::cookedStrings): (JSC::TemplateRegistryKey::create): (JSC::TemplateRegistryKey::TemplateRegistryKey): * runtime/TemplateRegistryKeyTable.cpp: (JSC::TemplateRegistryKeyTable::createKey): * runtime/TemplateRegistryKeyTable.h: LayoutTests: Update the error messages. * inspector/runtime/parse-expected.txt: * js/unicode-escape-sequences-expected.txt: Canonical link: https://commits.webkit.org/184568@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@211319 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-01-28 03:09:12 +00:00
PASS codeUnits(function \u{110000}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{110000'.
PASS codeUnits(function \u{1000000}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{1000000'.
PASS codeUnits(function \u{100000000000000000000000}(){}.name) threw exception SyntaxError: Invalid unicode escape in identifier: '\u{100000000000000000000000'.
PASS codeUnits(function x\u{41}(){}.name.substring(1)) is "0041"
PASS codeUnits(function x\u{10000}(){}.name.substring(1)) is "D800,DC00"
PASS codeUnits(function x\u{10001}(){}.name.substring(1)) is "D800,DC01"
PASS codeUnits(function x\u{102C0}(){}.name.substring(1)) is "D800,DEC0"
PASS codeUnits(function x\u{102c0}(){}.name.substring(1)) is "D800,DEC0"
PASS codeUnits(function x\u(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u'.
PASS codeUnits(function x\u{0}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{0}'.
PASS codeUnits(function x\u{D800}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{D800}'.
PASS codeUnits(function x\u{d800}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{d800}'.
PASS codeUnits(function x\u{DC00}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{DC00}'.
PASS codeUnits(function x\u{dc00}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{dc00}'.
PASS codeUnits(function x\u{FFFF}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{FFFF}'.
PASS codeUnits(function x\u{ffff}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{ffff}'.
PASS codeUnits(function x\u{1D306}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{1D306}'.
PASS codeUnits(function x\u{1d306}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{1d306}'.
PASS codeUnits(function x\u{10FFFE}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{10FFFE}'.
PASS codeUnits(function x\u{10fffe}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{10fffe}'.
PASS codeUnits(function x\u{10FFFF}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{10FFFF}'.
PASS codeUnits(function x\u{10ffff}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{10ffff}'.
PASS codeUnits(function x\u{00000000000000000000000010FFFF}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{00000000000000000000000010FFFF}'.
PASS codeUnits(function x\u{00000000000000000000000010ffff}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{00000000000000000000000010ffff}'.
PASS codeUnits(function x\ux(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u'.
PASS codeUnits(function x\u{(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{'.
PASS codeUnits(function x\u{}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{'.
PASS codeUnits(function x\u{G}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{'.
PASS codeUnits(function x\u{1G}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{1'.
Lift template escape sequence restrictions in tagged templates https://bugs.webkit.org/show_bug.cgi?id=166871 Reviewed by Saam Barati. JSTests: Update the error messages and add new tests. * ChakraCore/test/es6/unicode_6_identifier_Blue524737.baseline-jsc: * stress/lift-template-literal.js: Added. (dump): (testTag.return.tag): (testTag): * stress/template-literal-syntax.js: Source/JavaScriptCore: This patch implements the stage 3 proposal Lifting Template Literal Restriction[1]. Prior to this patch, a template literal was a syntax error if it contained invalid escape sequences. That is too restrictive: a template literal has both cooked and raw representations, and only the cooked representation interprets escape sequences. So even if invalid escape sequences are included, the raw representation can still be valid. Lifting Template Literal Restriction relaxes this restriction. When a template literal containing an invalid escape sequence is used as a tagged template, the cooked value of the template element containing the invalid escape sequence becomes `undefined` instead of the literal being rejected with a SyntaxError immediately. This lets tagged templates accept templates whose raw representations include invalid escape sequences. On the other hand, the raw representation is only used in tagged templates, so if invalid escape sequences appear in an ordinary (untagged) template literal, it remains a SyntaxError as before. [1]: https://github.com/tc39/proposal-template-literal-revision * bytecompiler/BytecodeGenerator.cpp: (JSC::BytecodeGenerator::emitGetTemplateObject): * bytecompiler/NodesCodegen.cpp: (JSC::TemplateStringNode::emitBytecode): (JSC::TemplateLiteralNode::emitBytecode): * parser/ASTBuilder.h: (JSC::ASTBuilder::createTemplateString): * parser/Lexer.cpp: (JSC::Lexer<CharacterType>::parseUnicodeEscape): (JSC::Lexer<T>::parseTemplateLiteral): (JSC::Lexer<T>::lex): (JSC::Lexer<T>::scanTemplateString): (JSC::Lexer<T>::scanTrailingTemplateString): Deleted. 
* parser/Lexer.h: * parser/NodeConstructors.h: (JSC::TemplateStringNode::TemplateStringNode): * parser/Nodes.h: (JSC::TemplateStringNode::cooked): (JSC::TemplateStringNode::raw): * parser/Parser.cpp: (JSC::Parser<LexerType>::parseAssignmentElement): (JSC::Parser<LexerType>::parseTemplateString): (JSC::Parser<LexerType>::parseTemplateLiteral): (JSC::Parser<LexerType>::parsePrimaryExpression): (JSC::Parser<LexerType>::parseMemberExpression): * parser/ParserTokens.h: * parser/SyntaxChecker.h: (JSC::SyntaxChecker::createTemplateString): * runtime/TemplateRegistry.cpp: (JSC::TemplateRegistry::getTemplateObject): * runtime/TemplateRegistryKey.h: (JSC::TemplateRegistryKey::cookedStrings): (JSC::TemplateRegistryKey::create): (JSC::TemplateRegistryKey::TemplateRegistryKey): * runtime/TemplateRegistryKeyTable.cpp: (JSC::TemplateRegistryKeyTable::createKey): * runtime/TemplateRegistryKeyTable.h: LayoutTests: Update the error messages. * inspector/runtime/parse-expected.txt: * js/unicode-escape-sequences-expected.txt: Canonical link: https://commits.webkit.org/184568@main git-svn-id: https://svn.webkit.org/repository/webkit/trunk@211319 268f45cc-cd09-0410-ab3c-d52691b4dbfc
2017-01-28 03:09:12 +00:00
PASS codeUnits(function x\u{110000}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{110000'.
PASS codeUnits(function x\u{1000000}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{1000000'.
PASS codeUnits(function x\u{100000000000000000000000}(){}.name.substring(1)) threw exception SyntaxError: Invalid unicode escape in identifier: 'x\u{100000000000000000000000'.
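The "Lift template escape sequence restrictions" commit message above distinguishes the cooked and raw representations of a template literal. The sketch below illustrates that distinction in standard JavaScript, assuming an engine that implements the template literal revision. The names `tag`, `outOfRange`, `incomplete`, and `untaggedIsSyntaxError` are hypothetical and do not come from the WebKit tests.

```javascript
// Hypothetical tag function: returns the cooked and raw forms of the
// first (and here only) template element.
function tag(strings) {
  return { cooked: strings[0], raw: strings.raw[0] };
}

// \u{110000} is above the maximum code point 0x10FFFF, so the escape is
// invalid. In a tagged template the cooked value is undefined and the raw
// text is kept verbatim.
const outOfRange = tag`\u{110000}`;

// \u followed by non-hex characters is also an invalid escape.
const incomplete = tag`\unicode`;

// An untagged template literal with the same escape is still rejected at
// parse time. new Function is used so the SyntaxError can be caught.
function untaggedIsSyntaxError(templateBody) {
  try {
    new Function("return `" + templateBody + "`;");
    return false;
  } catch (e) {
    return e instanceof SyntaxError;
  }
}

console.log(outOfRange.cooked, outOfRange.raw);
console.log(incomplete.cooked, incomplete.raw);
console.log(untaggedIsSyntaxError("\\u{110000}"));
```

The tag function receives both forms, so it can fall back to `strings.raw` when a cooked entry is `undefined`.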
PASS successfullyParsed is true
TEST COMPLETE
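
The surrogate-pair values in the PASS lines above (for example "D800,DC00" for `\u{10000}`) follow from the UTF-16 encoding of code points above U+FFFF. A minimal sketch, assuming a `codeUnits` helper that prints each UTF-16 code unit as four uppercase hex digits; the real helper lives in js/script-tests/unicode-escape-sequences.js and is not shown here, so this version is a hypothetical stand-in, as is `surrogatePair`.

```javascript
// Hypothetical stand-in for the test's codeUnits helper: lists the UTF-16
// code units of a string as comma-separated, four-digit uppercase hex.
function codeUnits(string) {
  const result = [];
  for (let i = 0; i < string.length; ++i) {
    const unit = string.charCodeAt(i);
    result.push(unit.toString(16).toUpperCase().padStart(4, "0"));
  }
  return result.join(",");
}

// Computes the surrogate pair by hand, to show where the values come from.
// Only meaningful for code points in the range 0x10000..0x10FFFF.
function surrogatePair(codePoint) {
  const offset = codePoint - 0x10000;
  const lead = 0xD800 + (offset >> 10);    // high 10 bits of the offset
  const trail = 0xDC00 + (offset & 0x3FF); // low 10 bits of the offset
  return String.fromCharCode(lead, trail);
}

console.log(codeUnits("\u{41}"));                // 0041
console.log(codeUnits("\u{10000}"));             // D800,DC00
console.log(codeUnits("\u{102C0}"));             // D800,DEC0
console.log(codeUnits(surrogatePair(0x102C0)));  // D800,DEC0
```

Code point escapes in string literals accept any value up to 0x10FFFF, whereas the identifier tests above also require the escaped code point to be a valid identifier character, which is why `x\u{1D306}` is rejected there.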