Identifiers in ECMAScript

Which characters, or more generally what, is allowed within a name (an "identifier") in ECMAScript 6?

Are there rules relating identifiers and keywords?

  • 2

    Out of curiosity: why do you want to compare with ES3 and not ES5.1? And, for the record, the current version today is already ES7.

  • 3

    Look, there is a very thorough post about this here

1 answer

12

Characters allowed

What are the characters allowed within an identifier in ES6?

To begin with, the name of an identifier (IdentifierName, which gives rise to Identifier) is specified in this grammar:

IdentifierName ::
   * IdentifierStart
   * IdentifierName IdentifierPart
IdentifierStart ::
   * UnicodeIDStart
   * $
   * _
   * \ UnicodeEscapeSequence
IdentifierPart ::
   * UnicodeIDContinue
   * $
   * _
   * \ UnicodeEscapeSequence
   * <ZWNJ>
   * <ZWJ>
UnicodeIDStart ::
   * any character in the “ID_Start” category
UnicodeIDContinue ::
   * any character in the “ID_Continue” category

In short, an IdentifierName is a name that begins with one IdentifierStart and continues with zero or more IdentifierParts. The grammar expresses this recursively, but it can be read that way.

The basic characters that are always allowed are '$' and '_'. The lexical element UnicodeEscapeSequence is also allowed, in the forms \uHHHH and \u{hex}.

In ES3, IdentifierStart and IdentifierPart each accepted several Unicode general categories. In ES6, IdentifierStart accepts every character in ID_Start, while IdentifierPart accepts every character in ID_Continue plus two additional characters: U+200C (ZWNJ) and U+200D (ZWJ).
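As a small illustration of the two escape forms, the sketch below declares two variables whose names contain escapes. The names abc and def are arbitrary examples, not taken from the specification. The escape only spells a character, so the resulting identifier is the same as if it had been typed directly.

```javascript
// \u0061 is "a", so this declares a variable named abc.
var \u0061bc = 1;

// \u{65} is "e", so this declares a variable named def.
var d\u{65}f = 2;

// The escaped and unescaped spellings refer to the same bindings.
console.log(abc, def); // prints "1 2"
```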
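The ES6 rule can be approximated with a regular expression. This is only a sketch under a few assumptions: it relies on Unicode property escapes (\p{...}), which arrived in ES2018 rather than ES6; it uses whatever Unicode version the engine ships; and it ignores \u escapes and reserved words. The name isIdentifierName is hypothetical.

```javascript
// First character: ID_Start, "$" or "_".
// Following characters: ID_Continue, "$", "_", ZWNJ (U+200C) or ZWJ (U+200D).
const IDENTIFIER_NAME =
  /^[\p{ID_Start}$_][\p{ID_Continue}$_\u200C\u200D]*$/u;

// Hypothetical helper: true when text has the shape of an IdentifierName.
// It does not handle \uHHHH escapes and does not reject keywords.
function isIdentifierName(text) {
  return IDENTIFIER_NAME.test(text);
}

console.log(isIdentifierName("caf\u00E9")); // true
console.log(isIdentifierName("$_a1"));      // true
console.log(isIdentifierName("1abc"));      // false, digits cannot start a name
console.log(isIdentifierName("a\u200Db"));  // true, ZWJ is allowed after the start
```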

Tip

See Unicode Utilities: UnicodeSet. This site evaluates a pattern and displays the set of characters that match it.

This pattern...

[:age=5.1:]&[[:ID_Start:]]

yields the ID_Start characters as of Unicode 5.1. To include a general category by its abbreviation, such as Pc, use something like [:gc=Pc:]

Warnings about the cited utility

  • Some characters may not be displayed in the result at the top, perhaps because of the browser. To make sure that every character is shown, turn on the Escape option so that characters outside ASCII appear in escaped form. Characters outside the Basic Multilingual Plane will appear as \UHHHHHHHH, while the rest will appear either literally or in the form \uHHHH. Adapt the result as you wish.

  • So far the utility has sorted the resulting sets in ascending order correctly for me, at least for sets related to identifiers. With categories such as Zs (space separators) it did not sort correctly, so you may need to sort the result manually or with a tool. Such a tool is not hard to write.
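A sorting tool of that kind could look roughly like the sketch below. The function names are hypothetical, and unlike the utility it escapes every character, including ASCII ones, using \uHHHH for the Basic Multilingual Plane and \UHHHHHHHH beyond it.

```javascript
// Hypothetical helper: format one code point as \uHHHH or \UHHHHHHHH.
function escapeCodePoint(cp) {
  const hex = cp.toString(16).toUpperCase();
  return cp > 0xFFFF
    ? "\\U" + hex.padStart(8, "0")
    : "\\u" + hex.padStart(4, "0");
}

// Hypothetical helper: unique code points of text, in ascending order.
// Array.from iterates by code point, so surrogate pairs stay together.
function sortedCodePoints(text) {
  const points = Array.from(text, ch => ch.codePointAt(0));
  return [...new Set(points)].sort((a, b) => a - b);
}

const escaped = sortedCodePoints("z\u{1D49C}a$a").map(escapeCodePoint);
console.log(escaped.join(" ")); // \u0024 \u0061 \u007A \U0001D49C
```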

Some rules

Rule of UnicodeEscapeSequence

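  • A UnicodeEscapeSequence must produce a character that is permitted at the position where it appears: a character valid for IdentifierStart at the beginning of the name, or one valid for IdentifierPart anywhere else. The rule is needed simply because an escape could otherwise denote any character.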
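The rule can be observed by asking the engine to compile small snippets. This is a minimal sketch: parses is a hypothetical helper that uses the Function constructor to compile, but never run, a piece of source text and reports whether a SyntaxError occurred.

```javascript
// Hypothetical helper: true if the engine accepts source as a function body.
function parses(source) {
  try {
    new Function(source); // compiled only, never called
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected
  }
}

// \u0031 is the digit "1": fine as IdentifierPart, not as IdentifierStart.
console.log(parses("var a\\u0031 = 0;")); // true
console.log(parses("var \\u0031a = 0;")); // false

// \u002D is "-", which is in neither ID_Start nor ID_Continue.
console.log(parses("var a\\u002Db = 0;")); // false
```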

Reserved identifiers

As we all know, keywords (ReservedWord) may only appear in their proper context and cannot be used as identifiers. Some of them are reserved for future use (FutureReservedWord).

In strict mode code, additional words are treated as FutureReservedWord. See section § 11.6.2.2 Future Reserved Words.

The short section § 12.1.1 Static Semantics: Early Errors also lists some further restrictions on identifiers.

In particular, some interpreters reject UnicodeEscapeSequences inside keywords (for example, instanceo\u{66}) and raise an exception. Some implementations (probably the ActionScript 3 compiler used by mxmlc, for example) may let keywords written with Unicode escapes pass, but that means they were poorly implemented.
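A quick way to see the strict mode difference is to compile the same declaration with and without a "use strict" directive. This is a sketch under the assumption of an ES6-conforming engine; parses is a hypothetical helper, and interface is one of the words reserved only in strict mode.

```javascript
// Hypothetical helper: true if the engine accepts source as a function body.
// Bodies compiled by the Function constructor are non-strict unless they
// start with their own "use strict" directive.
function parses(source) {
  try {
    new Function(source); // compiled only, never called
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Outside strict mode, "interface" is an ordinary identifier.
console.log(parses("var interface = 1;")); // true

// In strict mode it is a FutureReservedWord.
console.log(parses('"use strict"; var interface = 1;')); // false
```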
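The behavior of a given engine can be probed in the same spirit. The sketch below assumes an ES6-conforming engine, where a keyword spelled with an escape is a syntax error; parses is a hypothetical helper, and other implementations may answer differently.

```javascript
// Hypothetical helper: true if the engine accepts source as a function body.
function parses(source) {
  try {
    new Function(source); // compiled only, never called
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// The plain keyword is accepted.
console.log(parses("({}) instanceof Object;"));

// The same keyword with its last letter escaped (\u{66} is "f").
console.log(parses("({}) instanceo\\u{66} Object;"));

// An escape does not turn a reserved word into a usable identifier either:
// v\u0061r still spells "var".
console.log(parses("var v\\u0061r = 1;"));
```

On a conforming engine the first call should report true and the other two false.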
