TIL Automatic Semicolon Insertion

POSTED ON: Aug 18, 2021

{ 1
2 } 3

// is transformed by ASI into:

{ 1
;2 ;} 3;

JavaScript’s “automatic semicolon insertion” rule is the odd one. Where other languages assume most newlines are meaningful and only a few should be ignored in multi-line statements, JS assumes the opposite. It treats all of your newlines as meaningless whitespace unless it encounters a parse error. If it does, it goes back and tries turning the previous newline into a semicolon to get something grammatically valid.

This design note would turn into a design diatribe if I went into complete detail about how that even works, much less all the various ways that JavaScript’s “solution” is a bad idea. It’s a mess. JavaScript is the only language I know where many style guides demand explicit semicolons after every statement even though the language theoretically lets you elide them.

Source: Crafting Interpreters by Robert Nystrom
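To make the "newlines are whitespace unless there is a parse error" idea concrete, here is a small sketch. The variable names are made up for illustration. It shows one line break that gets no semicolon, because the next line is a legal continuation, and one that does.

```javascript
// Sketch: the parser only inserts a semicolon at a newline when
// continuing onto the next line would be a syntax error.

const values = [10, 20, 30]

// "[" is a legal continuation of `values`, so no semicolon is inserted:
// the next two lines parse as `const picked = values[2]`.
const picked = values
[2]

// `const` cannot continue the expression `1`, so here a semicolon
// is inserted at the line break and two statements result.
const first = 1
const second = 2

console.log(picked)          // 30
console.log(first + second)  // 3
```

Lines that begin with `[`, `(`, or a template literal are the usual places where this bites, since they can all continue the expression on the previous line.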

The return value expression must start on the same line as the return keyword in order to avoid semicolon insertion.
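A minimal sketch of that `return` pitfall, using two hypothetical helper functions that differ only in where the opening brace sits:

```javascript
// Hypothetical helpers, named only for this illustration.
function brokenReturn() {
  return      // a semicolon is inserted here, giving `return;`
  {
    ok: true  // parsed as a block containing the label `ok`; never runs
  }
}

function workingReturn() {
  return {    // the expression starts on the same line as `return`
    ok: true
  }
}

console.log(brokenReturn())   // undefined
console.log(workingReturn())  // { ok: true }
```

The first version is still valid JavaScript, so nothing warns you at runtime. The function just quietly returns `undefined`.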

There are three basic rules of semicolon insertion:

1. When, as the source text is parsed from left to right, a token (called the offending token) is encountered that is not allowed by any production of the grammar, then a semicolon is automatically inserted before the offending token if one or more of the following conditions is true:

   - The offending token is separated from the previous token by at least one LineTerminator.

   - The offending token is }.

   - The previous token is ) and the inserted semicolon would then be parsed as the terminating semicolon of a do-while statement (14.7.2).

2. When, as the source text is parsed from left to right, the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single instance of the goal nonterminal, then a semicolon is automatically inserted at the end of the input stream.

3. When, as the source text is parsed from left to right, a token is encountered that is allowed by some production of the grammar, but the production is a restricted production and the token would be the first token for a terminal or nonterminal immediately following the annotation “[no LineTerminator here]” within the restricted production (and therefore such a token is called a restricted token), and the restricted token is separated from the previous token by at least one LineTerminator, then a semicolon is automatically inserted before the restricted token.
Source: the ECMAScript specification
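The rules quoted above are dense, so here is a rough sketch of three of the cases in action. The variable and function names are invented for the example, and the comments reflect my reading of the rules rather than anything the spec states in these words.

```javascript
// Restricted production: postfix `++` must sit on the same line as its
// operand, so the `++` below binds to `b` (as a prefix), not to `a`.
let a = 1
let b = 5
a
++
b
console.log(a, b)  // 1 6

// Offending token is "}": no line break is needed before a closing brace.
function getOne() { return 1 }
console.log(getOne())  // 1

// do-while case: a semicolon is inserted after the ")" even though the
// next statement follows on the same line.
let n = 0
do n++; while (n < 3) console.log(n)  // 3
```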

See also the MDN documentation and the related Stack Overflow question:
https://stackoverflow.com/questions/3641519/why-do-results-vary-based-on-curly-brace-placement

Tagged: javascript
