
We rendered a million web pages to learn how the web breaks

As part of our JavaScript Errors and Performance In The Wild study, we scripted a web browser to render the root page of the top 1 million domains on the web. At the same time, we kept track of any error caught by window.onerror. Usually we track errors for our customers; this time we tracked errors for the entire web. This produces something that to our knowledge has not existed before: An overview of how the web breaks in practice on live websites. Analyzing this data shows that the bulk of the errors come from the same kinds of problems. This gives insight into how we developers should build the future of web technology: Fixing a small set of problems could reduce the number of software errors on the web tenfold.
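For the curious, the shape of such a crawl is simple. Here is a minimal sketch; we're not claiming this is the study's actual harness, and the choice of Puppeteer is our assumption:

```ts
import puppeteer from "puppeteer";

// Render each domain's root page in a headless browser and record every
// uncaught error. Puppeteer's 'pageerror' event fires for uncaught
// exceptions, i.e. the same errors window.onerror would see.
async function crawl(domains: string[]): Promise<Record<string, string[]>> {
  const browser = await puppeteer.launch();
  const errors: Record<string, string[]> = {};
  for (const domain of domains) {
    const page = await browser.newPage();
    errors[domain] = [];
    page.on("pageerror", (err) => errors[domain].push(String(err)));
    try {
      await page.goto(`http://${domain}`, { waitUntil: "networkidle2", timeout: 30_000 });
    } catch {
      // Unreachable or very slow sites are skipped.
    }
    await page.close();
  }
  await browser.close();
  return errors;
}
```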

The most common errors

Top 10 error types by number of sites throwing the error at least once

It turns out the distribution of errors thrown on the web is highly Zipfian: A few error types make up most of the errors thrown. ReferenceError, TypeError and SyntaxError make up 85% of all unhandled errors. As the web developer Tolstoy put it: Working websites are all different, all broken websites break in the same way.

Of course, there are many ways to produce these error types. The specific string in the error message tells us more about what actually happened. Looking at the most common error messages gives a certain sense of familiarity. As a web developer, you've likely encountered some of these before.

Top 10 error messages by number of sites throwing the error at least once

Starting from these statistics on the specific error messages, we debugged random samples of each error to get a qualitative understanding of what went wrong in each case. This yielded some surprising findings. It turns out that for both ReferenceError and SyntaxError, there is a single common root cause that produces most of them: Failures in resource loading. For TypeErrors there is a similar finding: most of them essentially come from the same kind of problem. We describe our findings for each error type in separate deep-dive articles.
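To make the root cause concrete, here is a minimal sketch of the failure mode; the CDN URL and the analytics global are hypothetical:

```ts
// A third-party library is loaded from a CDN. If the request fails
// (network error, ad blocker, dead CDN), no error is thrown here:
const script = document.createElement("script");
script.src = "https://cdn.example.com/analytics.js"; // hypothetical URL
document.head.appendChild(script);

// Other code assumes the library's global exists. The declare makes
// TypeScript trust that it will; the runtime makes no such promise.
declare const analytics: { track(event: string): void };

document.querySelector("button")?.addEventListener("click", () => {
  // If the load failed, this throws the most common error in our data:
  // "ReferenceError: analytics is not defined". And if the server answered
  // the script request with an HTML error page instead of JavaScript, the
  // parser has already thrown "SyntaxError: Unexpected token '<'".
  analytics.track("signup");
});
```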

Predicting error count by library use

We initially ran a logistic regression classifier to try to predict the presence of errors on a web site from the presence of JavaScript libraries. This was based on the hypothesis that the presence of certain code would be predictive of errors. As stated above, deeper analysis showed that most of the errors were in fact caused by the absence of code, so the predictive ability of the classifier was low. Nevertheless, we've included the regression coefficients learned by the classifier below, showing which patterns the classifier chose to rely on. Indeed, the piece of code most strongly associated with few errors is webpack, whose purpose is making sure that the needed scripts are delivered to the browser. Another somewhat self-serving conclusion is that JavaScript error tracking products are associated with a low error count. (Even if our own product is not yet big enough to be included in our own study!)
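As an aside, the model itself is simple enough to sketch in a few lines. This is an illustration of the technique, not our analysis code, and the features and data below are hypothetical:

```ts
// Logistic regression by stochastic gradient descent over binary features
// ("is library X present on this site?") predicting a binary label
// ("did the site throw at least one error?").
type Sample = { features: number[]; hasError: 0 | 1 };

const sigmoid = (z: number): number => 1 / (1 + Math.exp(-z));

function train(data: Sample[], dims: number, epochs = 500, lr = 0.1): number[] {
  const w: number[] = new Array(dims + 1).fill(0); // last entry is the bias
  for (let epoch = 0; epoch < epochs; epoch++) {
    for (const { features, hasError } of data) {
      const z = features.reduce((sum, x, i) => sum + w[i] * x, w[dims]);
      const err = sigmoid(z) - hasError; // gradient of the log-loss wrt z
      features.forEach((x, i) => { w[i] -= lr * err * x; });
      w[dims] -= lr * err;
    }
  }
  return w; // negative coefficients are associated with fewer error-throwing sites
}

// Hypothetical features: [usesWebpack, usesErrorTracker, usesLegacyWidget]
const coefficients = train(
  [
    { features: [1, 1, 0], hasError: 0 },
    { features: [0, 0, 1], hasError: 1 },
    // ...one row per site
  ],
  3,
);
console.log(coefficients);
```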

Error resiliency on the web

12% of sites in our sample had one or more unhandled errors. This is really a stunning number. Each of these errors indicates that some line of execution was aborted due to an unexpected situation, and likely means that some functionality is broken as a result. The number is also a testament to the error resiliency of the web: Whatever problems these errors indicate, they must be small enough that no one has bothered fixing them.

The data shows that most of the errors come from missing code, data or document elements at runtime. In some sense these errors are made possible by the late-binding nature of the web: Types are determined at runtime (late), as opposed to at compile time (early). Determining types at runtime makes loading libraries at runtime easy and natural. It also makes an entire class of errors possible: Errors coming from missing libraries and changing API surfaces. Of course, late binding is not the only choice: Many languages are based on types being known at compile time. Had we collectively decided to build the web using Java Applets, the error landscape would have been quite different. How would it have looked?
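Before answering, it is worth making "late binding" concrete. Dynamic import() is perhaps its purest expression; a minimal sketch, with a hypothetical module URL:

```ts
// Late binding in action: the module is resolved only when this executes.
// If the network fails, the URL is wrong, or the library has changed its
// exports, we find out at runtime and not a moment earlier.
const url = "https://cdn.example.com/charts.js"; // hypothetical module
try {
  const charts = await import(url);
  charts.render(document.body); // assumes this export still exists
} catch (err) {
  console.error("Chart library unavailable, continuing without charts:", err);
}
```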

How programming language research can create a less error-prone web

In a language where the type system gives strict guarantees about the shape of types, any runtime dynamicity around the loading of libraries becomes harder to do, especially if these libraries are allowed to evolve their API surface. This applies not only to code linked from the network, but also to the browser runtime itself. Looking back to the time of Java Applets: if you didn't have the right Java runtime installed, the applet would refuse to run until you had downloaded and installed the appropriate JRE. On the web, you can view a page with an old browser, and perhaps expect a progressive breakdown correlated with the age difference between the browser and the site. It is certainly possible to write a web page that works correctly in both current and ancient browsers. In this view, late binding may be a crucial building block for an evolving web.

In 2006, Alan Kay and the Viewpoints Research Institute set out on an extremely ambitious project to reinvent computing, building a GUI operating system from bare metal in 20K lines of code. Although the project seems to have run out of funding before fully reaching its goal, the final report describes a system built around a principle of late-bound references and dynamicity, written in a language (KScript) that resembles JavaScript.

Alan Kay miming an ecological, distributed system without tightly interlocked interfaces (source).

Clearly, the last word has not been said on this. The guarantees given by static typing allow the compiler to prove the absence of a certain class of errors, and this is something many programmers will not give up happily. TypeScript is in an interesting place here, straddling the worlds of dynamic and static typing. This does come at a cost: What the compiler believes the types to be at compile time may not be what they actually are at runtime, but that may be a fine trade-off.
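The cost is easy to demonstrate. In this sketch (the endpoint and the User shape are hypothetical), the compiler is satisfied while the runtime is not:

```ts
interface User {
  name: string;
  address: { city: string };
}

// The cast makes the compiler believe in a shape it cannot verify:
// res.json() returns whatever the server actually sent.
const res = await fetch("/api/user"); // hypothetical endpoint
const user = (await res.json()) as User;

// Type-checks cleanly, but if the response had no `address` field, this
// throws "TypeError: Cannot read properties of undefined (reading 'city')".
console.log(user.address.city);
```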

A consequence of the late-binding nature of the web is that we get late breaking. If a web page expects a browser feature to be present that isn't there, the code will not break until the use of that feature is attempted. For the web, this seems better than the all-or-nothing nature of the Java Applet model, where nothing runs until the right runtime has been downloaded. A similar situation played out in the early 2000s with XHTML. With XHTML, a document was supposed to be valid XML, and invalid markup would cause the page to not display at all. At the time, this behavior was advocated by many, perhaps because invalid HTML was seen as a cause of rendering inconsistencies between browsers. It took about a decade to arrive at the better idea of standardizing how invalid markup is handled, which was incorporated into HTML5. Clearly, in the end HTML5 won over XHTML, and JavaScript apps won over Java Applets.
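Late breaking also has a workable remedy inside the current model: feature detection, where code probes for a capability before using it and degrades gradually rather than all at once. A minimal sketch, using the common lazy-image pattern:

```ts
// Images marked with data-src are loaded lazily if the browser supports
// IntersectionObserver, and eagerly otherwise. An old browser loses the
// optimization, not the page.
const images = document.querySelectorAll<HTMLImageElement>("img[data-src]");

if ("IntersectionObserver" in window) {
  const observer = new IntersectionObserver((entries) => {
    for (const entry of entries) {
      if (!entry.isIntersecting) continue;
      const img = entry.target as HTMLImageElement;
      img.src = img.dataset.src!; // start loading only when visible
      observer.unobserve(img);
    }
  });
  images.forEach((img) => observer.observe(img));
} else {
  // No observer available: fall back to loading everything up front.
  images.forEach((img) => { img.src = img.dataset.src!; });
}
```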

The dark horse in the current environment is WebAssembly. There are a number of interesting projects built around the idea of compiling statically typed languages to wasm (see for example Blazor). It is possible that we're at the brink of an era where a sizable portion of front-end development is done without JavaScript. However, it is hard to believe that these approaches can succeed if they remain technological silos that cannot interact with anything outside their own runtime. If any lesson can be learned from history, it is that a good solution for dynamicity and the possibility of late-bound references will be necessary.

Allowing for runtime dynamicity while retaining some of the safety provided by statically typed languages may be the key to a less error-prone web. As the data shows, when the web breaks, it breaks because of untrue assumptions in the code causing errors at runtime: The document is not as expected, types are not as expected, libraries and data have failed to load over the network. A fruitful avenue of programming language research may be type systems that bake in these assumptions, allowing the compiler to prove that the assumptions have been checked before they are relied on. If this can be achieved in an ergonomic way, it may allow for code that is capable of operating in a dynamic execution environment, while eliminating most of the errors that plague the web today.
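TypeScript's type narrowing already hints at what this could look like: the compiler refuses to let a value be used until a runtime check has established its shape. A small sketch, with a hypothetical chartLib global:

```ts
interface ChartLib {
  render(el: HTMLElement): void;
}

// A user-defined type guard: an ordinary runtime check that the compiler
// afterwards accepts as proof about the type.
function isChartLib(x: unknown): x is ChartLib {
  return typeof x === "object" && x !== null &&
         typeof (x as ChartLib).render === "function";
}

const lib: unknown = (window as any).chartLib; // hypothetical global from a <script> tag

// lib.render(document.body);  // rejected by the compiler: assumption unchecked
if (isChartLib(lib)) {
  lib.render(document.body);   // allowed: the check provably ran first
} else {
  console.warn("Chart library missing; skipping charts.");
}
```

Generalizing that kind of compiler-checked assumption to documents, libraries, and data arriving over the network is exactly the direction our data suggests would pay off.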