The Mechanics Underlying the Specification
Automatic semicolon insertion (ASI for short) is with us in ES5 and will stay with us in ES6, so it pays to understand how the mechanism operates. This article maps out what a parser does when we include semicolons and when we leave them out. The rules for ASI are described in section 7.9 of the ES5.1 standard, and we will examine the rules as laid out in that version, since ES6 is still in draft phase.
The three basic rules for ASI are most accurately stated in section 7.9.1:
1. When, as the program is parsed from left to right, a token (called the offending token) is encountered that is not allowed by any production of the grammar, then a semicolon is automatically inserted before the offending token if one or more of the following conditions is true:
   - The offending token is separated from the previous token by at least one LineTerminator.
   - The offending token is }.
2. When, as the program is parsed from left to right, the end of the input stream of tokens is encountered and the parser is unable to parse the input token stream as a single complete ECMAScript Program, then a semicolon is automatically inserted at the end of the input stream.
3. When, as the program is parsed from left to right, a token is encountered that is allowed by some production of the grammar, but the production is a restricted production and the token would be the first token for a terminal or nonterminal immediately following the annotation “[no LineTerminator here]” within the restricted production (and therefore such a token is called a restricted token), and the restricted token is separated from the previous token by at least one LineTerminator, then a semicolon is automatically inserted before the restricted token.
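Rules 1 and 2 can be seen at work in a few lines of semicolon-free code. The snippet below is a minimal sketch rather than anything taken from the standard: the names one, two, getOne, sum and evaluated are made up for illustration, and the comments describe what a parser following 7.9.1 would be expected to do.

```javascript
// Rule 1, first condition: `var` cannot continue the expression `1`,
// and it sits on a new line, so a semicolon is inserted before it.
var one = 1
var two = 2

// Rule 1, second condition: `}` cannot continue `return one`,
// so a semicolon is inserted before the closing brace.
function getOne() { return one }

// No insertion: `+` is allowed after `one`, so it is not an offending
// token and both lines are parsed as one statement, `var sum = one + two`.
var sum = one
+ two

// Rule 2: the evaluated source text ends before its statement is closed,
// so a semicolon is inserted at the end of the input.
var evaluated = eval("1 + 2")
```

Note the third case: a line break alone does not trigger insertion. The next token also has to be one the grammar cannot accept at that point.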
Let's look at the steps an ES-compliant parser might take when encountering the following code:
    var one = 1
    console.log(one)
We take as given that the variable statement has already been parsed up to the literal 1. The next token, console, is the offending token of Rule 1: no production of the grammar allows it to follow 1 directly. The parser therefore checks the first condition of Rule 1, whether the offending token is separated from the previous token by a line break [1]. It is, so the parser inserts a semicolon before console (that is, after the preceding token, 1) and continues. Now let's look at how simple it becomes when a semicolon is included:
    var one = 1;
    console.log(one);
The parser simply consumes one more token, the semicolon, which closes the statement exactly as the grammar expects, and carries on. The following more complex example should further our understanding of the very different paths the parser may take:
    return;
    a + b;

    return
    a + b
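Running the semicolon-free form, where a line break follows return, makes the consequence visible. This is a small sketch: the function names sumBroken and sumIntended are hypothetical, and the wrappers exist only so the return statements have a function to live in.

```javascript
// `return` is a restricted production: no line terminator may appear
// between `return` and its expression (Rule 3).
function sumBroken(a, b) {
  return
  a + b // unreachable: a semicolon was inserted right after `return`
}

// Keeping the expression on the same line as `return` avoids the insertion.
function sumIntended(a, b) {
  return a + b
}

console.log(sumBroken(1, 2))   // undefined
console.log(sumIntended(1, 2)) // 3
```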
Not only might an error in logic go unnoticed (when a line break follows return, a semicolon is inserted directly after it, so the function returns undefined and a + b is never evaluated), but leaving out the semicolon clearly isn't the shortest path. At a minimum, a parser following the specification has to evaluate three boolean conditions and then call its semicolon-insertion routine, which itself adds complexity. In the most extreme case (shown above), it needs three more boolean checks, for a total of six, before finally inserting the semicolon. Additionally, there's an overriding condition for all the rules:
...a semicolon is never inserted automatically if the semicolon would then be parsed as an empty statement or if that semicolon would become one of the two semicolons in the header of a for statement (see 12.6.3).
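Both exceptions can be checked directly. The sketch below hands source text to the Function constructor, which throws a SyntaxError when the text does not parse. The helper name parses is hypothetical, and the two rejected inputs are adapted from the examples in section 7.9.2 of the standard.

```javascript
// Hypothetical helper: reports whether `source` parses as a function body.
// The Function constructor only parses here; the body is never called.
function parses(source) {
  try {
    new Function(source)
    return true
  } catch (err) {
    if (err instanceof SyntaxError) return false
    throw err
  }
}

// A line break inside a `for` header does not earn a semicolon.
console.log(parses("for (a; b\n) {}"))         // false
console.log(parses("for (a; b; ) {}"))         // true

// The inserted semicolon would be an empty statement, so none is inserted.
console.log(parses("if (a > b)\nelse c = d"))  // false
console.log(parses("if (a > b);\nelse c = d")) // true
```

In both rejected cases the line break is present and the next token is an offending one, so Rule 1 would otherwise apply. The overriding condition is what turns them into syntax errors.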
I hope this provides a solid reference on the underlying machinery behind ASI.
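[1] Listed as the Line Feed character, <LF>, in Table 3 of section 7.3. The other line terminators in that table (carriage return, line separator and paragraph separator) count as line breaks as well.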