ES2019 features coming to JavaScript (starring us!)

Shape Security has been contributing actively to TC39 and other standards bodies for the past 4 years but this year is special for us. A significant portion of the features coming to JavaScript as part of the 2019 update are from Shape Security engineers! Shape contributes to standards bodies to ensure new features are added while taking into account evolving security implications. Anything Shape contributes outside of this broad goal is because we believe the web platform is the greatest platform ever made and we want to help it grow even better.

Thanks to everyone who contributes to TC39 and thank you Michael Ficarra (@smooshMap), Tim Disney (@disnet), and Kevin Gibbons (@bakkoting) for representing Shape.

TL;DR

The 2019 update includes quality-of-life updates to JavaScript natives, and standardizes undefined or inconsistent behavior.

Buffed: String, Array, Object, Symbol, JSON

Nerfed: Try/Catch

Adjusted: Array.prototype.sort, Function.prototype.toString

Native API additions

Array.prototype.flat & .flatMap

> [ [1], [1, 2], [1, [2, [3] ] ] ].flat();
< [1, 1, 2, 1, [2, [3]]]

> [ [1], [1, 2], [1, [2, [3] ] ] ].flat(3);
< [1, 1, 2, 1, 2, 3]

The Array prototype methods flat and flatMap got unexpected attention this year, not because of their implementation, but because Shape Security engineer Michael Ficarra opened a gag pull request renaming the original method flatten to smoosh thus starting SmooshGate. Michael opened the pull request as a joke after long TC39 meetings on the topic and it ended up giving the average developer great insight into how TC39 works and under how big of a microscope proposals are placed under. When considering new features to add to JavaScript, the TC39 committee has to take two decades of existing websites and applications into account to ensure no new feature unexpectedly breaks them.

After FireFox shipped flatten in the nightly releases, users found that websites using the MooTools framework were breaking. MooTools had added flatten to the Array prototype ten years ago and now any site using MooTools risks breaking if the method changes. Since MooTools usage has declined in favor of more modern frameworks, many sites using the library are sites which are no longer actively maintained — they will not be updated even if MooTools released an updated version. SmooshGate ended up surfacing serious discussions as to what degree existing websites affect future and present innovation.

The committee concluded backwards compatibility was of higher importance and renamed the method flatten to flat. It’s a long, complicated story with an anticlimactic ending but that could be said of all specification work.

Drama aside, flat operates on an array and “flattens” nested arrays within to a configurable depth. flatMap operates similarly to the map method by applying a function to each element in the list and then calling flat() on the resulting list.

Object.fromEntries

let obj = { a: 1, b: 2 };
let entries = Object.entries(obj);
let newObj = Object.fromEntries(entries);

Object.fromEntries is a complement to the Object.entries method which allows a developer to more succinctly translate objects from one another. Object.entries takes a regular JavaScript object and returns a list of [key, value] pairs, Object.fromEntries enables the reverse.

String.prototype.trimStart & .trimEnd

> '   hello world   '.trimStart()
< "hello world   "

> '   hello world   '.trimEnd()
< "   hello world"

Major JavaScript engines had implementations of String.prototype.trimLeft() and String.prototype.trimRight() but the methods lacked a true definition in the spec. This proposal standardizes the names as trimStart and trimEnd, aligning terminology with padStart and padEnd, and aliases trimLeft and trimRight to the respective function.

Symbol.prototype.description

> let mySymbol = Symbol('my description');
< undefined

> mySymbol.description
< 'my description'

Symbol.prototype.description is an accessor for the unexposed description property. Before this addition, the only way to access the description passed into the constructor was by converting the Symbol to a string via toString() and there was no intuitive way to differentiate between Symbol() and Symbol(‘’).

Spec & Language Cleanup

Try/Catch optional binding

try {
  throw new Error();
} catch {
  console.log('I have no error')
}

Until this proposal, omitting the binding on catch resulted in an error when parsing the JavaScript source text. This resulted in developers putting in a dummy binding despite them being unnecessary and unused. This is another quality-of-life addition allowing developers to be more intentional when they ignore errors, improving the developer experience and reducing cognitive overhead for future maintainers.

Make ECMAScript a proper superset of JSON

JSON.parse describes JSON as a subset of JavaScript despite valid JSON including Unicode line separators and paragraph separators not being valid JavaScript. This proposal modifies the ECMAScript specification to allow those characters in string literals. The majority of developers will never encounter this usage but it reduces edge case handling for developers dealing with the go-between and generation of JavaScript and JSON. Now you can insert any valid JSON into a JavaScript program without accounting for edge cases in a preprocessing stage.

Well-formed JSON.stringify

> JSON.stringify('\uD834\uDF06')
< "\"𝌆\""

> JSON.stringify('\uDF06\uD834')
< "\"\\udf06\\ud834\""

This proposal rectifies inconsistencies in description and behavior for JSON.stringify. The ECMAScript spec describes JSON.stringify as returning a UTF-16 encoded JSON format string but can return values that are invalid UTF-16 and are unrepresentable in UTF-8 (specifically surrogates in the Unicode range U+D800U+DFFF). The accepted resolution is, when encoding lone surrogates, to return the code point as a Unicode escape sequence.

Stable Array.prototype.sort()

This is a change to the spec reflecting the behavior standardized by practice in major JavaScript engines. Array.prototype.sort is now required to be stable — values comparing as equal stay in their original order.

Revised Function.prototype.toString()

The proposal to revise Function.prototype.toString has been a work-in-progress for over 3 years and was another proposed and championed by Michael Ficarra due to problems and inconsistencies with the existing spec. This revision clarifies and standardizes what source text toString() should return or generate for functions defined in all the different forms. For functions created from parsed ECMAScript source, toString() will preserve the whole source text including whitespace, comments, everything.

Onward and upward

ES2015 was a big step for JavaScript with massive new changes and, because of the problems associated with a large change set, the TC39 members agreed it is more sustainable to produce smaller, yearly updates. Most of the features above are already implemented in major JavaScript engines and can be used today.

If you are interested in reading more TC39 proposals, including dozens which are in early stages, the committee makes its work available publicly on Github.com. Take a look at some of the more interesting proposals like the pipeline operator and optional chaining.

Contributing to the Future

The mission of the Web Application Security Working Group, part of the Security Activity, is to develop security and policy mechanisms to improve the security of Web Applications, and enable secure cross-origin communication.

If you work with websites, you are probably already familiar with the web features that the WebAppSec WG has recently introduced. But not many web developers (or even web security specialists) feel comfortable reading or referencing the specifications for these features.

I typically hear one of two things when I mention WebAppSec WG web features:

  1. Specifications are hard to read. I’d rather read up on this topic at [some other website].
  2. This feature does not really cover my use-case. I’ll just find a workaround.

Specifications are not always the best source if you are looking to get an introduction to a feature, but once you are familiar with it, the specification should be the first place you go when looking for a reference or clarification. And if you feel the language of the specification can be better or that more examples are needed, go ahead and propose the change!

To cover the second complaint, I’d like to detail out our experience contributing a feature to a WebAppSec WG specification. I hope to clarify the process and debunk the myth that your opinion is going to be unheard or that you can’t, as an individual, make a meaningful contribution to a web feature used by millions.

Background

Content Security Policy is a WebAppSec WG standard that allows a website author to, among other things, declare the list of endpoints with which a web page is expecting to communicate. Another great WebAppSec WG standard, SRI, allows a website author to ensure that the resources received by their web page (like scripts, images, and stylesheets) have the expected content. Together, these web features significantly reduce the risk that an attacker can substitute web page resources for their own malicious resources.

I helped standardise and implement require-sri-for, a new CSP directive that mandates SRI integrity metadata to be present before requesting any subresource of a given type.

Currently, SRI works with resources referenced by script and link HTML elements. Also, the Request interface of the Fetch API allows to specify the expected integrity metadata. For example, Content-Security-Policy: require-sri-for script style; extends the expectations and forbids pulling in any resources of a given type without integrity metadata.

Contributing to a WebAppSec WG Specification

Unlike some other working groups, there is no formal process on how to start contributing to W3C’s Web Application Security Working Group specifications, and it might look scary. It is actually not, and usually flows in the following order:

  1. A feature idea forms in somebody’s head enough to be expressed as a paragraph of text.
  2. A paragraph or two is proposed either in WebAppSec’s public mailing list or in the Github Issues section of the relevant specification. Ideally, examples, algorithms and corner-cases are included.
  3. After some discussion, which can sometimes take quite a while, the proposal starts to be formalised as a patch to the specification.
  4. The specification strategy is debated, wording details are finalised, and the patch lands in the specification.
  5. Browser vendors implement the feature.
  6. Websites start using the feature.

Anyone can participate in any phase of feature development, and I’ll go over the require-sri-fordevelopment timeline to highlight major phases and show how we participated in the process.

Implementing require-sri-for

  1. Development started in an issue on the WebAppSec GitHub repo opened back in April 2014 by Devdatta Akhawe. He wonders how one might describe a desire to require SRI metadata, e.g. the integrity hash, be present for all subresources of a given type.
  2. Much time passes. SRI is supported by Chrome and Firefox. Github is among first big websites to use it in the wild.
  3. A year later, Github folks raise the same question that Dev did on the public-webappsec mailing list, with an addition of having an actual use case: they intended to have an integrity attribute on every script they load, but days later after deplyoing the SRI feature, determined that they had missed one script file.
  4. Neil from Github Security starts writing up a paragraph in the CSP specification to cover a feature that would enforce integrity attribute on scripts and styles. Lots of discussion irons out the details that were not covered in the earlier email thread.
  5. I pick up the PR and move it to the SRI specification GitHub repo. 65 comments later, it lands in the SRI spec v2.
  6. Frederik Braun patches Firefox Nightly with require-sri-for implementation.
  7. I submit a PR to Chromium with basic implementation. 85 comments later, it evolves into a new Chrome platform feature and lands with plans to be released in Chrome 54.

Resources

Specification development is happening on Github, and there are many great specifications that you should be looking at:

We Are Hiring!

If you reached here, there is a chance that Shape has a career that looks interesting to you.

Browser Compatibility Testing with BrowserStack and JSBin

There are some great services out there that go a long way to outline browser compatibility of JavaScript, HTML, and CSS features. Can I Use? is one of the most thorough, and kangax’s ES5 and ES6 compatibility table are extremely valuable and regularly updated. The Mozilla Developer Network also has a browser support matrix for almost every feature that is documented, but it is a public wiki and a lot of the information comes from the previously mentioned sources. Outside of those resources, the reliability starts dropping quickly.

What if the feature you’re looking for isn’t popular enough to be well documented? Maybe you’re looking for confirmation of some combination of esoteric behaviors that isn’t documented anywhere?

BrowserStack & JSBin

BrowserStack has a service that allows you to take screenshots across a massive number of browsers. Services like these are often used to quickly verify that sites look the same or similar across a wide array of browsers and devices.

While that may not sound wildly exciting for our use case, we can easily leverage a service like jsbin to whip up a test case that exposes the pass or failure in a visually verifiable way.

Using a background color on the body (red for unsupported, green for supported) we can default an html body to the ‘unsupported’ class and, upon a successful test for our feature, simply change the class of the body to ‘supported’. For example, we recently wanted to test the browser support for document.currentScript, a feature that MDN says is not well supported but according to actual, real world chrome and IE/spartan developers, is widely supported and has been for years!

Well we had to know, so we put together a jsbin to quickly test this feature.


<html>
<head>
  <meta charset="utf-8">
  <title>JS Bintitle>
  <style>
    .unsupported{
      background-color:red;
    }
    .supported{
      background-color:green;
    }
  style>
head>
<body class=unsupported>
  <script id=foo>script>
  <script id=target>
    if (document.currentScript && document.currentScript.id === 'target')
      document.body.className = 'supported';
  script>
  <script id=bar>script>
body>
html>

After the bin is saved and public, we select a wide range of browser we wanted to test in browserstack and pass it the output view of the JSBin snippet. In this case we wanted to test way back to IE6, a range of ios and android browsers, opera, safari, and a random sampling of chrome and firefox.

browserstackss

After a couple minutes, we get our results back from BrowserStack and we can immediately see what the support matrix looks like.

browserstackresults

This binary pass/fail approach may not scale too well but you could get creative and produce large text, a number, or even an image that is easily parsable at varying resolutions, like a QR code, to convey test results. The technique is certainly valuable for anyone that doesn’t have an in-house cluster of devices, computers, and browsers at their disposal simply for browser testing. Credit goes to Michael Ficarra (@jspedant) for the original idea.

Oh, and by the way, document.currentScript is definitely not widely supported.

Detecting PhantomJS Based Visitors

These days, many web security incidents involve automation. Web-scraping, password reuse, and click-fraud attacks are perpetrated by adversaries trying to mimic real users, and thus will attempt to look like they are coming from a browser. As a website owner, you want to ensure you serve humans, and as a web service provider you want programmatic access to your content to go through your API instead of being scraped through your heavier and less stable web interface.

Assuming that you have basic checks for cURL-like visitors, the next reasonable step is to ensure that visitors are using real, UI-driven browsers — and not headless browsers like PhantomJS and SlimerJS.

In this article, we’re going to demonstrate some techniques for identifying visits by PhantomJS. We decided to focus on PhantomJS because it is the most popular headless browser environment, but many of the concepts that we’ll cover are applicable to SlimerJS and other tools.

NOTE: The techniques presented in this article are applicable to both PhantomJS 1.x and 2.x, unless explicitly mentioned. First up: is it possible to detect PhantomJS without even responding to it?

HTTP stack

As you may be aware, PhantomJS is built on the Qt framework. The way Qt implements the HTTP stack makes it stick out from other modern browsers.

First, let’s take a look at Chrome, which sends out the following headers:

GET / HTTP/1.1
Host: localhost:1337
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8,ru;q=0.6

In PhantomJS, however, the same HTTP request looks like this:

GET / HTTP/1.1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X) AppleWebKit/534.34 (KHTML, like Gecko) PhantomJS/1.9.8 Safari/534.34
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Connection: Keep-Alive
Accept-Encoding: gzip
Accept-Language: en-US,*
Host: localhost:1337

You’ll notice the PhantomJS headers are distinct from Chrome (and, as it turns out, all other modern browsers) in a few subtle ways:

  • The Host header appears last
  • The Connection header value is mixed case
  • The only Accept-Encoding value is gzip
  • The User-Agent contains “PhantomJS”

Checking for these HTTP header aberrations on the server, it should be possible to identify a PhantomJS browser.

But, is it safe to believe these values? If an adversary uses a proxy to rewrite headers in front of the headless browser, they could modify those headers to appear like a normal modern browser instead.

Looks like tackling this problem purely on the server is not a silver bullet. So let’s take a look at what can be done on the client, using PhantomJS’s JavaScript environment.

Client-side User-Agent Check

We may not be able to trust the User-Agent value as delivered via HTTP, but what about on the client?

if (/PhantomJS/.test(window.navigator.userAgent)) {
    console.log("PhantomJS environment detected.");
}

Unfortunately, it is similarly trivial to change user-agent header and navigator.userAgent values in PhantomJS, so this might not be enough.

Plugins

navigator.plugins contains an array of plugins that are present within the browser. Typical plugin values include Flash, ActiveX, support for Java applets, and the “Default Browser Helper”, which is a plugin that indicates whether this browser is the default browser in OS X. In our research, most fresh installs of common browsers include at least one default plugin — even on mobile.

This is unlike PhantomJS, which doesn’t implement any plugins, nor does it provide a way to add one (using the PhantomJS API).

The following check might then be useful:

if (!(navigator.plugins instanceof PluginArray) || navigator.plugins.length == 0) {
    console.log("PhantomJS environment detected.");
} else {
    console.log("PhantomJS environment not detected.");
}

On the other hand, it’s fairly trivial to spoof this plugin array by modifying the PhantomJS JavaScript environment before the page is loaded.

It’s also not difficult to imagine a custom build of PhantomJS with real, implemented plugins. This is easier than it sounds because the Qt framework on which PhantomJS is built provides a native API for implementing plugins.

Timing

Another point of interest is how PhantomJS suppresses JavaScript dialogs:

var start = Date.now();
alert('Press OK');
var elapse = Date.now() - start;
if (elapse < 15) {
    console.log("PhantomJS environment detected. #1");
} else {
    console.log("PhantomJS environment not detected.");
}

After measuring several times, it appears that if the alert dialog is suppressed within 15 milliseconds, the browser is probably not being controlled by a human. But using this approach means bothering real users with an alert they’ll manually have to close.

Global Properties

PhantomJS 1.x exposes two properties on the global object:

if (window.callPhantom || window._phantom) {
  console.log("PhantomJS environment detected.");
} else {
  console.log("PhantomJS environment not detected.");
}

However, these properties are part of an experimental feature and may change in the future.

Lack of JavaScript Engine Features

PhantomJS 1.x and 2.x currently use out-of-date WebKit engines, which means there are browser features that exist in newer browsers that do not exist in PhantomJS. This extends to the JavaScript engine — whereby some native properties and methods are different or absent in PhantomJS.

One such method is Function.prototype.bind, which is missing in PhantomJS 1.x and older. The following example checks whether bind is present, and that it has not been spoofed in the executing environment.

(function () {
  if (!Function.prototype.bind) {
    console.log("PhantomJS environment detected. #1");
    return;
  }
  if (Function.prototype.bind.toString().replace(/bind/g, 'Error') != Error.toString()) {
    console.log("PhantomJS environment detected. #2");
    return;
  }
  if (Function.prototype.toString.toString().replace(/toString/g, 'Error') != Error.toString()) {
    console.log("PhantomJS environment detected. #3");
    return;
  }
  console.log("PhantomJS environment not detected.");
})();

This code is a little too tricky to explain in detail here, but you can find out more from our presentation.

Stack Traces

Errors thrown by JavaScript code evaluated by PhantomJS via the evaluate command contain a uniquely identifiable stack trace, from which we can identify the headless browser.

For example, suppose that PhantomJS calls evaluate on the following code:

var err;
try {
  null[0]();
} catch (e) {
  err = e;
}
if (indexOfString(err.stack, 'phantomjs') > -1) {
  console.log("PhantomJS environment detected.");
} else {
  console.log("PhantomJS environment is not detected.");
}

Note that this example uses a custom indexOfString() function, left as an exercise for the reader, since the native String.prototype.indexOf can be spoofed by PhantomJS to always return a negative result.

Now, how do you get a PhantomJS script to evaluate this code? One technique is to override some frequently used DOM API functions that are likely to be called. For example, the code below overrides document.querySelectorAll to inspect the browser’s stack trace:

var html = document.querySelectorAll('html');
var oldQSA = document.querySelectorAll;
Document.prototype.querySelectorAll = Element.prototype.querySelectorAll = function () {
  var err;
  try {
    null[0]();
  } catch (e) {
    err = e;
  }
  if (indexOfString(err.stack, 'phantomjs') > -1) {
    return html;
  } else {
    return oldQSA.apply(this, arguments);
  }
};

Summary

In this article we’ve looked at 7 different techniques for identifying PhantomJS, both on the server and by executing code in PhantomJS’s client JavaScript environment. By combining the detection results with a strong feedback mechanism — for example, rendering a dynamic page inert, or invalidating the current session cookie — you can introduce a solid hurdle for PhantomJS visitors. Always keep in mind however, that these techniques are not infallible, and a sophisticated adversary will get through eventually.

To learn more, we recommend watching this recording of our presentation from AppSec USA 2014 (slides). We’ve also put together a GitHub repository of example implementations — and possible circumventions — of the techniques presented here.

Thanks for reading, and happy hunting.

Contributors