Inside ECMAScript: JavaScript Standard Gets an Extra Stage
It’s been nearly a decade since JavaScript started getting new updates to the language specification — a process that had stalled for many years, leading to a plethora of frameworks and libraries and some very useful experimentation, but also frustration about the future direction of the language. ECMAScript 2015 put an end to that uncertainty by delivering not just the largest-ever update to the language, but also a reliable process for annual updates that have brought a succession of improvements, large and small.
Getting that process in place took a Herculean effort from the TC39 committee that handles ECMAScript standardization. The work ranged from converting the standards documentation from a single Word document that had been edited year after year into a modern repository (complete with a custom HTML dialect) to deciding on a multistage process that would ensure new language features were thoroughly designed, examined, tested and implemented. The goal was to ensure everyone in the language and browser community was confident that new features would work, improve the language and not introduce incompatibilities for developers to have to clean up.
Stage by Stage for Quality and Compatibility
Suggestions for useful new features in JavaScript start as Stage 0 proposals to explore the idea and think about the problems those features could solve. They can only move to Stage 1, as a proposal that the committee intends to examine, once they have both a repo that clearly explains why they’re useful and what the potential issues are, and a “champion” who will drive the work required for the feature.
Stage 2 takes that initial design further and represents an initial draft of the specification. Moving a proposal to Stage 2 means the committee expects the feature to become part of the language — but it’s not guaranteed, because working through the details, writing the spec and delivering an initial implementation like a polyfill might reveal problems that can’t be easily dealt with.
Originally, Stage 3 represented a candidate proposal that was almost complete, but needed the kind of feedback you can only get from the real-world experience of implementing it in a browser, a server-side runtime, or tools like Babel and TypeScript where developers can try it out.
That effectively meant delivering a complete set of tests for the test262 ECMAScript Conformance Test Suite so that at least two compatible implementations could complete acceptance testing. While getting to Stage 3 didn’t formally require test262 tests, Chrome won’t implement a feature without them. So reaching Stage 3 usually, but not always, meant that a feature could not only be implemented but also validated.
Reaching Stage 4 means the committee agrees that all the work on the specification has been done and it’s been approved by all the ECMAScript editors and is ready to be included in the overall language specification.
“Stage 4 is an acknowledgment that we did the work during Stage 3,” Ecma vice president Daniel Ehrenberg told The New Stack. The whole process is designed to allow “gradual consensus-building and gradual development” rather than waiting until the proposal is complete enough to be approved and shipped, he explained, describing the stages as “incremental milestones, intermediate milestones that recognize progress while also getting things done.”
Making Tests an Explicit Milestone
While the process has worked well, some large proposals have ended up needing more work than expected because the design has needed to be reworked — in some cases so significantly that the proposal has gone back from Stage 3 to Stage 2. (This happened with the Import Attributes proposal that’s part of the module harmony suite of proposals.) It’s not a bad thing: It means the JavaScript community is taking the time to get the feature right and make sure it’s useful to developers.
But with the original process, dropping back to Stage 2 also meant that any tests already written had to be rewritten before the proposal could return to Stage 3. Writing tests is a substantial amount of work to have to do twice, which made moving between stages more painful than it’s intended to be.
“There was this awkwardness as we got to Stage 3, which […] particularly for large proposals, it takes a lot of effort to write the tests,” explained TC39 co-chair Rob Palmer.
“Stage 3 is the point at which we say, ‘We believe the design is complete, we believe the specification is complete and in order to learn more, we must implement [the feature]; it’s time to do that.’ Implementers would really love to have all the tests ready at that point. But if you’re working on a specification and you’re trying to resolve the last design point and you’re trying to achieve Stage 3, even though the implementers would love to have the tests, sometimes, because of the effort, you’re not ready to make that big investment. Because if you get design feedback, if the design is not stable, a lot of those tests might be invalidated, and then that’s wasted effort.”
Stage 3 is intended to act as a signal that a proposal is ready to be implemented — not because it’s finished but because it’s mature enough to justify more intense experimentation, and because the design of the proposal is as final as it can be without any feedback from implementations and tests. But combining experimentation, implementation and validation in the same phase of the process can end up slowing down progress, leaving a proposal stalled while tests are rewritten, even when decisions to change the design have been agreed upon. Just figuring out which tests need to be rewritten to account for design changes is a lot of work.
The repeat work falls not just on the people behind the feature proposal, but also on the lesser-known test262 committee, Palmer noted. “The testing effort can be huge, and a lot of that is done by people who are not always being paid to do this work. Some people maintain test262 because they want to do what’s right for JavaScript and improve the quality [of language features].”
To separate the testing phase from implementation — especially in browsers and JavaScript engines — at the end of 2023, TC39 added a new stage between Stage 2 and Stage 3: Stage 2.7. A proposal that reaches Stage 2.7 is approved “in principle” but needs validating: by developing a full test suite and prototypes and by getting enough experience to show that it can be implemented. At Stage 2.7, the text for the feature specification is complete and the TC39 committee won’t ask for any changes except for what comes up through testing, implementing and using the feature.
The numbering was carefully chosen to indicate that a proposal is almost, but not quite, at Stage 3. (There was a suggestion to name it after the mathematical constant e, which has a value of 2.7-something. While charming, that might also have been confusing and hard to explain, but it may have helped the committee pick 2.7 instead of some other intermediate number.)
“2.7 actually has pretty much the same requirements that we used to have for Stage 3,” Palmer told us. “That means Stage 3 is now actually a stronger stage because it means everything that Stage 3 used to mean, but it also means ‘and we have a test suite’ … so when the implementers begin their implementation, they’ve got something to work with. That’s a good signal to them that this proposal is really ready to get going.”
Stage 3 now becomes about gaining implementation experience and discovering any web compatibility or integration issues with the feature. There won’t be more changes to the feature spec unless there’s a problem.
That’s an important point, because when a proposal had to change its design and rewrite tests before getting to Stage 3, it would sometimes lead to a more wide-ranging discussion than expected, including raising points that had already been decided on, just because the project needed the committee to approve changes to the tests.
Tests Don’t Block Progress
Making testing a more explicit requirement for Stage 3 isn’t as straightforward a change as it seems, because building a new language feature isn’t a purely sequential process where you come up with a design, then write the specification once the design is right, and write tests once the specification is finished — and only then start implementing.
One reason writing tests is so much work is that it is often the best way of thinking through all the consequences of how the feature is designed and how it needs to be specified. You don’t need an implementation to consider the options and implications. But once you’ve written the tests, you do need an implementation to run them against so that you can validate whether the tests themselves are written correctly. That means a proposal will probably go through a few of the steps more than once.
As well as the early implementations in polyfills created by the people proposing the feature, some implementers like Igalia and transpilers like Babel and TypeScript don’t usually wait for Stage 3. While the browsers and JavaScript engines need to wait until there is a test suite because the implementation bar is higher for them, some tests can be written against prototype implementations like polyfills and transpiled code. Searching through the JavaScript code used on websites, adding use counters, or putting an early implementation behind a feature flag in canary builds of web browsers can all help with catching web compatibility problems earlier on (like the clashes that twice forced a rewrite of the Map.groupBy and Object.groupBy methods introduced in ECMAScript 2024).
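For readers who haven’t used the grouping methods mentioned above, here is a minimal usage sketch. The groupBy wrapper and its fallback are hypothetical helpers written for this illustration (so the snippet also runs on runtimes without Object.groupBy), and the sample data simply restates stages reported in this article.

```javascript
// Hypothetical helper: use the built-in Object.groupBy where the runtime has it,
// otherwise fall back to a small hand-written equivalent.
function groupBy(items, keyFn) {
  if (typeof Object.groupBy === "function") {
    return Object.groupBy(items, keyFn);
  }
  const groups = Object.create(null); // the built-in also returns a null-prototype object
  let index = 0;
  for (const item of items) {
    const key = keyFn(item, index++);
    if (!(key in groups)) groups[key] = [];
    groups[key].push(item);
  }
  return groups;
}

// Sample data: proposal stages as described in this article.
const proposals = [
  { name: "Math.sumPrecise", stage: "Stage 2.7" },
  { name: "RegExp.escape", stage: "Stage 3" },
  { name: "ShadowRealm", stage: "Stage 2.7" },
  { name: "Deferred import", stage: "Stage 2.7" },
];

const byStage = groupBy(proposals, (proposal) => proposal.stage);

console.log(byStage["Stage 2.7"].map((p) => p.name));
// [ 'Math.sumPrecise', 'ShadowRealm', 'Deferred import' ]
console.log(byStage["Stage 3"].map((p) => p.name));
// [ 'RegExp.escape' ]
```

Because these are static methods on Object and Map rather than new methods on Array.prototype, calling them does not depend on whatever a page’s own libraries may have added to arrays.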
Similarly, having the test262 test suite doesn’t mean a proposal will automatically advance to Stage 3. The committee might decide it needs more information beyond the tests (like that web compatibility data) to show that there’s enough experience to prove the feature can actually be implemented.
Introducing Stage 2.7 won’t necessarily mean that fewer proposals ever go back a stage. That will still happen if the proposal’s champions need to take a step back and reconsider earlier decisions. It does make Stage 3 less risky for implementers because they always have the test suite, but it can’t remove all the risk of web compatibility issues.
What Stage 2.7 does is reduce the amount of redundant, wasted work in producing a complete test suite that then has to be changed. A proposal that’s confident it won’t need any changes could create the test suite and jump straight from Stage 2 to Stage 3. In fact, the proposal to add a built-in mechanism for converting binary data in a Uint8Array to and from Base64 recently moved straight to Stage 3, even though the tests needed a very small update to match the committee’s final choice of design.
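To make the Base64 proposal concrete, here is a hedged sketch of the round trip it targets. The method names Uint8Array.prototype.toBase64 and Uint8Array.fromBase64 are the ones the proposal uses, but runtimes may not ship them yet, so the toBase64 and fromBase64 wrappers below (hypothetical helpers for this illustration) fall back to the long-standing btoa and atob globals.

```javascript
// Hypothetical wrapper: use the proposed built-in if present, else fall back to btoa.
function toBase64(bytes) {
  if (typeof bytes.toBase64 === "function") {
    return bytes.toBase64();
  }
  let binary = "";
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return btoa(binary);
}

// Hypothetical wrapper: use the proposed built-in if present, else fall back to atob.
function fromBase64(text) {
  if (typeof Uint8Array.fromBase64 === "function") {
    return Uint8Array.fromBase64(text);
  }
  return Uint8Array.from(atob(text), (ch) => ch.charCodeAt(0));
}

const bytes = new Uint8Array([72, 101, 108, 108, 111]); // the ASCII bytes of "Hello"
const encoded = toBase64(bytes);
console.log(encoded); // SGVsbG8=

const decoded = fromBase64(encoded);
console.log(Array.from(decoded)); // [ 72, 101, 108, 108, 111 ]
```

The fallback branches show the kind of string juggling developers write today, which is the boilerplate a built-in conversion is meant to replace.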
Stage 2.7 in Practice
When the TC39 committee added Stage 2.7, they looked at all the existing Stage 3 proposals. While a couple of projects hadn’t checked in all their tests, all of them were close enough to being ready that none needed demoting. (It’s possible, though, that the long-delayed Temporal proposal to replace the JavaScript Date object might move back to Stage 2.7 while it’s being cut down slightly in order to avoid bloating browsers on devices with limited resources, like smartwatches.)
Since it was introduced, several proposals have already advanced to Stage 2.7, like the deferred import proposal (another part of the large module harmony suite) and the Math.sumPrecise method for summing a list of values in JavaScript without the usual floating point errors. Math.sumPrecise is a straightforward proposal that has been moving quickly through the TC39 process, now has a set of tests, and is ready to move to Stage 3 once the committee agrees that the tests have enough coverage. The long-awaited support for escaping a string inside a regular expression, RegExp.escape, reached Stage 2.7 and has already moved on to Stage 3 with the arrival of its test suite.
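The “usual floating point errors” are easy to reproduce. The sketch below compares a naive loop with Neumaier’s compensated summation; the compensated version is an illustrative stand-in chosen for this example, not the algorithm the proposal specifies, and naiveSum and compensatedSum are hypothetical names. Math.sumPrecise itself is only called if the runtime provides it.

```javascript
// Naive left-to-right summation: small values can vanish next to large ones.
function naiveSum(values) {
  let total = 0;
  for (const v of values) total += v;
  return total;
}

// Neumaier's compensated summation (illustrative stand-in, NOT the proposal's
// algorithm): it accumulates the low-order bits lost at each addition.
function compensatedSum(values) {
  let sum = 0;
  let compensation = 0;
  for (const v of values) {
    const t = sum + v;
    if (Math.abs(sum) >= Math.abs(v)) {
      compensation += (sum - t) + v; // low-order bits of v were lost
    } else {
      compensation += (v - t) + sum; // low-order bits of sum were lost
    }
    sum = t;
  }
  return sum + compensation;
}

const values = [1e20, 0.1, -1e20];

console.log(naiveSum(values));       // 0
console.log(compensatedSum(values)); // 0.1

// Use the built-in only where it exists; availability varies by runtime.
if (typeof Math.sumPrecise === "function") {
  console.log(Math.sumPrecise(values));
}
```

In the naive loop, 0.1 is too small to register when added to 1e20, so it is gone by the time the large values cancel out.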
In contrast, the “microwaits” proposal, now called Atomics.pause, is hard to write useful tests for. It recently moved to Stage 2.7, with most of the discussion covering what notes to put in the specification to guide the JavaScript engines implementing it, and noting that the eventual tests will basically only check that the API exists.
Not all proposals will move from Stage 2.7 to 3 as quickly. ShadowRealm, a long-standing proposal that’s been through a lot of revisions over the 15 years it’s been under consideration, had already been moved back to Stage 2 at the TC39 meeting before Stage 2.7 was introduced. This feature will provide isolated execution environments for running third-party scripts and plug-ins where they can’t access or modify the state of the main environment. (It’s worth noting that this is more about virtualization and avoiding potential conflicts between different pieces of code that might use, say, the same variable or function names, than a security boundary or a sandbox.)
To advance from Stage 2, the project needed to come up with a list of the browser and JavaScript runtime APIs that need to be exposed to ShadowRealm to make it work, along with a set of tests to ensure correct behavior in implementations, and a commitment from at least two implementers that the API list and the tests are the details they need to do their implementations.
Those tests aren’t the full test suite that ShadowRealm will need in order to reach Stage 3, which makes ShadowRealm a good example of why Stage 2.7 is so useful. The proposal was able to advance to Stage 2.7 in February 2024 with the list of APIs, but it still needs explicit support from at least two implementations — and to reach Stage 3, it will need to include WPT tests (the Web Platform Tests cross-browser test suite) for browser integration.
This is a complex proposal that has implications for WHATWG and WinterCG as well as TC39. Reaching Stage 2.7 is an indication that the design for ShadowRealm is effectively complete, but it won’t be ready to implement and ship until it reaches Stage 3 because there are still integration issues to be worked out. And that’s exactly the kind of intermediate progress Stage 2.7 is designed to mark.