Pokémon Go API – A Closer Look at Automated Attacks

Tens of millions of people are out exploring the new world of Pokémon Go. It turns out that many of those users are not people at all, but automated agents, or bots. Game-playing bots are not a new phenomenon, but Pokémon Go offers some new use cases for bots. These bots have started interfering with everyone’s fun by overwhelming Pokémon Go servers with automated traffic. Pokémon Go is a perfect case study in how automated attacks and defenses work on mobile APIs. At Shape we deal with these types of attacks every day, so we thought we would take a closer look at what happened with the Pokémon Go API attacks.

Pokémon Go API Attack

Niantic recently published a blog post detailing the problems bots were creating by generating automated traffic, which even hindered the game's Latin America launch. The chart included in the post shows a significant drop in spatial query traffic after Niantic rolled out countermeasures against automation at 1pm PT on 08/03. The automated traffic appears to have been roughly twice that of real human players. No wonder Pokémon Go servers were heavily overloaded in recent weeks.

Figure 1. Spatial query traffic dropped more than 50% after Niantic started to block scrapers. Source: Niantic blog post

Getting to Know The Pokémon Bots

There are two types of Pokémon bots. The first type of bot automates regular gameplay and is a common offender on other gaming apps, automating activities such as walking around and catching Pokémon. Examples of such bots include MyGoBot and PokemonGo-Bot. But Pokémon Go has inspired the development of a new type of bot, called a Tracker or Mapper, which provides the location of Pokémon. These bots power Pokémon Go mapping services such as Pokevision and Go Radar.

How a Pokémon Go Bot Works

A mobile API bot is a program that mimics communication between a mobile app and its backend servers—in this case servers from Niantic. The bot simply tells the servers what actions are taken and consumes the server’s response.

Figure 2 shows a screenshot of a Pokémon Go map which marks nearby Pokémon within a 3-footstep range of a given location. To achieve this, bot makers usually follow these steps:

  1. Reverse-engineer the communication protocol between the mobile app and the backend server. The bot maker plays the game, captures the communications between the app and its server, and deciphers the protocol format.
  2. Write a program that makes a series of “legitimate” requests to the backend servers. In this case, getting the locations of nearby Pokémon is a single request with a target GPS coordinate, with no need to actually walk to the physical location (see the sketch after this list). The challenge for the bot is to evade server-side detection and look like a real human.
  3. Provide related features such as integration with Google Maps, or include the bot’s own mapping functionality for the user.
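
To make step 2 concrete, here is a deliberately simplified, hypothetical sketch of such a scanner request. The endpoint, payload shape, and field names are placeholders invented for illustration (the real protocol is binary ProtoBuf over SSL with a signed envelope, as described below); the point is only that “walking” to a location is nothing more than data in a request.

//es6
// Hypothetical sketch: the endpoint and JSON payload are invented for
// illustration; the real protocol is binary ProtoBuf over SSL.
const FAKE_RPC_ENDPOINT = "https://game-servers.example.com/rpc"; // placeholder

async function scanLocation(session, latitude, longitude) {
  // The bot never physically moves; it simply claims a GPS coordinate.
  const response = await fetch(FAKE_RPC_ENDPOINT, {
    method: "POST",
    headers: { Authorization: `Bearer ${session.token}` },
    body: JSON.stringify({ request: "GET_MAP_OBJECTS", latitude, longitude })
  });
  return response.json(); // list of nearby Pokémon for that coordinate
}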

Figure 2. Screenshot of a Pokémon Go map

Mobile App Cracks and Defenses

Using the Pokémon Go app as an example, let’s examine how a mobile app can be cracked by reverse engineering to reveal its secrets. Since attackers mainly exploited Pokémon Go’s Android app, we’ll focus on Android cracks and defenses.

Reverse-Engineering the Protocol

The Pokémon Go app and the backend servers communicate using ProtoBuf over SSL. ProtoBuf defines the format of the data transferred over the network. For example, here is an excerpt of the ProtoBuf definition for player stats:

message PlayerStats {
  int32 level = 1;
  int64 experience = 2;
  int64 prev_level_xp = 3;
  int64 next_level_xp = 4;
  float km_walked = 5;
  int32 pokemons_encountered = 6;
  int32 unique_pokedex_entries = 7;
  ……
}

Pokémon Go’s protocol was reverse-engineered and published online as POGOProtos within only two weeks. How did this happen so quickly? Initially, Niantic didn’t use certificate pinning.

Certificate pinning is a common defense against Man-in-the-Middle (MITM) attacks. In short, a mobile app trusts only server certificates that are embedded in the app itself. Without certificate pinning, an attacker can easily set up a proxy such as mitmproxy or Fiddler and install a certificate crafted by the attacker on her phone. She can then configure the phone to route traffic through the proxy and sniff the traffic between the Pokémon Go app and the Niantic servers. There is even a Pokémon Go-specific proxy tool that facilitates this, called pokemon-go-mitm.
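
As an aside, here is what certificate pinning can look like in a minimal Node.js sketch: the client compares the server certificate’s SHA-256 fingerprint against a value baked in at build time. This illustrates the general technique, not Niantic’s implementation (theirs lives in native Android code, as shown later); the pinned value and host are placeholders.

//es6
const https = require("https");
const tls = require("tls");

// Placeholder: in a real client this is the known-good SHA-256
// fingerprint of the server's certificate, embedded at build time.
const PINNED_FINGERPRINT = "AA:BB:CC:...";

const req = https.request({
  host: "example.com",
  checkServerIdentity(host, cert) {
    // First run the default hostname verification...
    const err = tls.checkServerIdentity(host, cert);
    if (err) return err;
    // ...then reject any certificate that doesn't match the pin, even if
    // a MITM proxy's root CA has been installed on the device.
    if (cert.fingerprint256 !== PINNED_FINGERPRINT) {
      return new Error("Certificate pin mismatch");
    }
  }
}, res => console.log(res.statusCode));
req.end();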

On July 31, Niantic made a big change to both its servers and the Pokémon Go app: version 0.31.0 was released with certificate pinning protection. Unfortunately, the cat was already out of the bag, and the communication protocol was publicly available on GitHub. In addition, implementing certificate pinning correctly is not always easy. In later sections, we will cover some techniques commonly used by attackers to bypass certificate pinning.

APK Static Analysis

The Android application package (APK) is the package file format used by Android to install mobile apps. Android apps are primarily written in Java; the Java code is compiled into the dex format and packaged into an APK file. In addition, Android apps may call shared libraries written in native code (via the Android NDK).

Dex files are easily disassembled into the SMALI language using tools such as Baksmali. Tools such as dex2jar and jd-gui can then further decompile the dex file into Java code, which is easy to read. Using these techniques, attackers decompiled the Pokémon Go Android app (versions 0.29.0 and 0.31.0) into Java code. The example code shown below, from the com.nianticlabs.nia.network.NianticTrustManager class, implements certificate pinning.

public void checkServerTrusted(X509Certificate[] chain, String authType) throws CertificateException {
  synchronized (this.callbackLock) {
    nativeCheckServerTrusted(chain, authType);
  }
}

When application source code is exposed, reverse engineering becomes a no-brainer. Pokemon Go Xposed used fewer than 100 lines of Java code to fool the Pokémon Go app into believing the certificate from mitmproxy was the authentic certificate from Niantic.

How did Pokemon Go Xposed achieve this? Quite easily: the tool hooks the call to the checkServerTrusted function shown in the snippet above. The hook changes the function’s first parameter, chain, to the value of Niantic’s own certificate chain. As a result, no matter what unauthorized certificate the proxy presents, the Pokémon Go app is tricked into trusting it.
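
Pokemon Go Xposed does this with the Java-level Xposed hooking API, but the mechanism translates directly into JavaScript terms: wrap the target function and rewrite its first argument before the original runs. A conceptual sketch (all names here are our stand-ins, not the tool’s actual code):

//es6
// Conceptual analogy of the Xposed hook; names are illustrative stand-ins.
const pinnedNianticChain = ["<Niantic certificate chain>"]; // placeholder

const trustManager = {
  checkServerTrusted(chain, authType) {
    if (chain !== pinnedNianticChain) throw new Error("untrusted certificate");
  }
};

// Hook the method: force its first argument to the pinned chain.
function hookFirstArgument(obj, methodName, replacement) {
  const original = obj[methodName];
  obj[methodName] = function(...args) {
    args[0] = replacement;             // swap the proxy's chain for Niantic's
    return original.apply(this, args); // the pinning check now always passes
  };
}

hookFirstArgument(trustManager, "checkServerTrusted", pinnedNianticChain);
trustManager.checkServerTrusted(["<MITM proxy cert>"], "RSA"); // no error thrown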

There are many tools that can make disassembly and static analysis more difficult for attackers. ProGuard and DexGuard apply obfuscation to Java code and dex files; obfuscation makes the code difficult to read, even in decompiled form. Another approach is to use an Android packer to encrypt the original classes.dex file of the app; the encrypted dex file is decrypted in memory at runtime, making static analysis extremely hard, if not impossible, for most attackers. Using a native library is yet another way to significantly increase the difficulty of reverse-engineering an app.

Reverse-Engineering the Native Library

The most interesting cat-and-mouse game between the pokemongodev hackers and Niantic centered on a field named “Unknown6”, part of the signature sent in the map request used to get nearby Pokémon at a location. “Unknown6” was one of the unidentified fields in the reverse-engineered ProtoBuf definition. Initially, it didn’t matter what value Unknown6 carried; Niantic’s servers simply accepted it. Starting at 1pm PT on 08/03, however, all Pokémon Go bots suddenly could not find any Pokémon, which resulted in the significant query drop shown in Figure 1.

The hackers then noticed the importance of the “Unknown6” field and initially suspected it was some kind of digest or HMAC validating the integrity of the request. The mystery triggered tremendous interest in the pokemongodev community, and an “Unknown6” team quickly formed to crack the field. The Discord channel went private due to the wide interest from coders and non-programmers alike, but a live update channel kept everybody apprised of the cracking effort’s progress. After 3 days and 5 hours, on the afternoon of 08/06, the Unknown6 team claimed victory, releasing an updated Pokémon Go API that was once again able to retrieve nearby Pokémon.

While the technical writeup of the hack has yet to be released, many relevant tools and technologies were mentioned on the forums and in the live updates. IDA Pro from Hex-Rays is a professional tool that can disassemble the ARM code of a native library, and the Hex-Rays decompiler can turn binary code into a C-style representation. Such tools also let attackers perform dynamic analysis, debugging the mobile app and its libraries at run time. Of course, even with such powerful tools, reverse-engineering a binary program remains extremely challenging: even without intentional obfuscation, disassembled or decompiled code is hard to understand, and the code size is often huge. As an illustration of the complex and unpredictable work involved, the live update channel and a subsequent interview described how the encryption function behind “Unknown6” was identified within hours, yet the team spent extensive additional time analyzing another field named “Unknown22”, which turned out to be unrelated to Unknown6.

As a result, obfuscation still has many practical benefits for protecting native libraries, and a high level of obfuscation in a binary can increase the difficulty of reverse-engineering by orders of magnitude. However, as the many successful cracks of serial codes for Windows and Windows applications illustrate, motivated crackers are often successful in the end.

Server Side Protection

Server-side defenses work in a completely different way than client-side defenses. Here are some of the techniques used in the context of protecting Pokémon Go’s mobile API.

Rate Limiting

Rate limiting is a common approach to stopping, or at least slowing down, automated traffic. In the early days, Pokémon scanners could send tens of requests per second, scanning tens of map cells and finding every Pokémon in the area.

On 07/31, Niantic added rate limiting protections: if one account sent multiple map requests within a roughly 5-second window, Niantic’s servers would accept only the first request and drop the rest. Attackers reacted to the rate limits by:

  1. adding a delay (~5 seconds) between map requests from their scanning programs
  2. using multiple accounts and multiple threads to sidestep the per-account limit
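
A per-account throttle like the one described above can be sketched in a few lines. This is our illustration of the technique, not Niantic’s code; the 5-second window is the one observed by bot makers:

//es6
// Minimal sketch of a per-account rate limiter with a ~5 s window.
const WINDOW_MS = 5000;
const lastMapRequest = new Map(); // accountId -> time of last accepted request

function acceptMapRequest(accountId, now = Date.now()) {
  const last = lastMapRequest.get(accountId);
  if (last !== undefined && now - last < WINDOW_MS) {
    return false; // drop: another map request arrived within the window
  }
  lastMapRequest.set(accountId, now);
  return true; // accept and serve this request
}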

In the case of Pokémon Go, rate limiting just opened another battleground for automated attacks: automated account creation. Defenders quickly learn that while rate limiting is a fine basic technique for controlling overly aggressive scrapers and novice attackers, it does not prevent advanced adversaries from getting automated requests through.

IP Blocking

Blocking IPs is a traditional technique used by standard network firewalls or Web Application Firewalls (WAFs) to drop requests from suspicious IPs. There are many databases that track IP reputation, and firewalls and WAFs can retrieve such intelligence periodically.

In general, IP-based blocking is risky and ineffective. Blindly blocking an IP with a large volume of traffic may end up blocking the NAT gateway of a university or corporation. Meanwhile, many Pokémon bots and scanners use residential dynamic IP addresses. Because these IPs are shared among an ISP’s customers, banning one for a long time may block legitimate players.

Hosting services such as Amazon Web Services (AWS) and Digital Ocean are also sources of virtual machines and fresh IPs for attackers. When attackers use stolen credit cards, they can even obtain these resources for free. However, legitimate users rarely browse the web or play games from a hosting provider’s network, so blocking IPs from hosting services is a relatively safe defense and is commonly used on the server side. According to this forum post, Niantic may have decided to ban IPs from AWS.
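
Implementation-wise, blocking hosting providers amounts to checking each client address against published datacenter ranges. A minimal sketch follows; the CIDR ranges below are placeholders, not real AWS or Digital Ocean blocks, and production systems would pull ranges from provider feeds or IP-reputation databases:

//es6
// Sketch: drop requests whose source IP falls in a known datacenter range.
const DATACENTER_CIDRS = [["54.0.0.0", 8], ["104.16.0.0", 12]]; // placeholders

const ipToInt = ip =>
  ip.split(".").reduce((n, octet) => (n << 8) + Number(octet), 0) >>> 0;

function isDatacenterIP(ip) {
  const addr = ipToInt(ip);
  return DATACENTER_CIDRS.some(([base, prefixLength]) => {
    const mask = prefixLength === 0 ? 0 : (~0 << (32 - prefixLength)) >>> 0;
    return (addr & mask) === (ipToInt(base) & mask);
  });
}

isDatacenterIP("54.1.2.3");   // true  -> drop or challenge the request
isDatacenterIP("98.210.5.7"); // false -> likely residential; allow it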

Behavior Analysis

Behavior analysis is usually the last line of defense against advanced attackers who are able to bypass other defenses. Bots behave very differently from humans: a real person cannot play the game 24×7 or catch 5 Pokémon in one second. While behavioral analysis sounds like a promising approach, building an accurate detection system that can handle a data volume as huge as Pokémon Go’s is not an easy task.

Niantic initially implemented only a soft ban on cheaters who used GPS spoofing to “teleport” (i.e., suddenly move at an impossibly fast speed). It was probably a “soft” ban because of false positives: families share accounts, and GPS readings can be inaccurate, so some legitimate use cases look like bots.
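
One concrete behavioral signal is travel speed: two GPS fixes that are far apart in space but close in time imply teleportation. A minimal sketch of such a check (the 140 km/h threshold is our arbitrary placeholder, not a known Niantic value):

//es6
// Sketch: flag "teleporting" accounts from consecutive GPS fixes.
const EARTH_RADIUS_KM = 6371;
const toRad = deg => deg * Math.PI / 180;

// Haversine great-circle distance between two {lat, lon} points, in km.
function distanceKm(a, b) {
  const dLat = toRad(b.lat - a.lat), dLon = toRad(b.lon - a.lon);
  const h = Math.pow(Math.sin(dLat / 2), 2) +
            Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) *
            Math.pow(Math.sin(dLon / 2), 2);
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

const MAX_KMH = 140; // placeholder threshold; tune for GPS jitter, planes, etc.

function looksLikeTeleport(prevFix, nextFix) {
  const hours = (nextFix.time - prevFix.time) / 3600000; // ms -> hours
  return hours > 0 && distanceKm(prevFix, nextFix) / hours > MAX_KMH;
}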

Around August 12, 2016, Niantic posted a note on its website outlining that violations of its terms of service may result in a permanent ban on a Pokémon Go account. Multiple ban rules targeting bots were also disclosed unofficially; for example, the over-catch rule reportedly bans accounts that catch over a thousand Pokémon in a single day. In addition, Niantic encourages legitimate players to report cheaters and inappropriate players.

In our experience, behavioral modeling-based detection can be extremely effective but is often technically or economically infeasible to build in-house. As Niantic commented in their blog post, “dealing with this issue also has opportunity cost. Developers have to spend time controlling this problem vs. building new features.” The bigger issue is that building technology to defend against dedicated, technically savvy adversaries, armed with botnets and other tools designed to bypass regular defenses, requires many highly specialized skillsets and a tremendous development effort.

The Game Isn’t Over

As long as Pokémon Go continues to be loved by players, the game between bot makers and Niantic will also continue. Defending against automated traffic is a challenge not only for gaming but for all industries. Similar attack and defense activity is taking place across banking, airline, and retail apps, where the stakes are orders of magnitude higher than losing a few Snorlaxes. Bots and related attack tools are no fun and games for companies when their customers and users cannot access services because of unwanted automated traffic.

Announcing Bandolier

Today Shape Security is releasing Bandolier, a Java library that bundles JavaScript written with ES2015 module syntax.

Bandolier takes JavaScript code like this:

import { b } from './foo.js'
console.log(42 + b);

where the foo module is defined as:

// foo.js
export var b = 100;

and produces a single script without ES2015 module syntax that can run in a JavaScript environment that does not yet support import/export:

(function(global) {
  "use strict";

  function require(file, parentModule) {
    // eliding the definition of require
    // ...
  }

  require.define("1", function(module, exports, __dirname, __filename) {
    var __resolver = require("2", module);
    var b = __resolver["b"];
    console.log(42 + b);
  });
  require.define("2", function(module, exports, __dirname, __filename) {
    var b = 100;
    exports["b"] = b;
  });
  return require("1");
}.call(this, this));

Bandolier is a good example of a non-trivial project built using the Shift AST; Bandolier essentially takes a bunch of Module ASTs that contain import and export declarations and appropriately merges them into a single Script AST.

Bandolier works by first parsing the given JavaScript file into a Module AST using the Shift Java parser. It then transforms the AST by resolving each import declaration’s module specifier (e.g. converting import foo from "some/module" to import foo from "/full/path/to/some/module"). Once all the imports are resolved, each imported module is recursively loaded and stored in memory.

Finally, the bundled script is created by generating the module loading boilerplate (the function wrapper and the require function) and then each loaded module is transformed by changing import declarations to require calls and export declarations to updates to the exports object.

One particularly useful feature of Bandolier is that both the resolving and loading phases are pluggable. Bandolier comes with a few choices built in, including:

  • a FileSystemResolver that just normalizes relative paths
  • a NodeResolver that follows the node require.resolve algorithm
  • a FileLoader for loading resources from the file system
  • a ClassResourceLoader for loading resources inside a JAR.

Writing your own custom loader or resolver is as simple as implementing the IResolver and IResourceLoader interfaces.

Note that Bandolier is not a full transpiler like babel; it only transforms import and export statements. That said, the Shift parser fully supports ES2015 so you can, for example, use ES2015 classes and the bundled output will work in any JavaScript environment that supports classes (e.g. recent versions of node).

Also note that Bandolier only bundles ES2015 modules, so if you need to do something more complex, like bundling CommonJS modules, you will probably be happier with something like browserify, CommonJS Everywhere, or webpack.

What sets Bandolier apart from similar projects, and why we built it at Shape, is that it allows you to easily integrate JavaScript bundling into a Java application. We use it to dynamically generate and bundle our JavaScript resources on the fly inside a Java server. So, if you have similar needs (or are just interested in how to use the Shift AST), check out the project on GitHub.

Salvation is Coming (to CSP)

CSP (Content Security Policy) is a W3C candidate recommendation for a policy language that can be used to declare content restrictions for web resources, commonly delivered through the Content-Security-Policy header. Serving a CSP policy helps to prevent exploitation of cross-site scripting (XSS) and other related vulnerabilities. CSP has wide browser support according to caniuse.com.

(See caniuse.com’s support tables for Content Security Policy 1.0 and Content Security Policy Level 2.)

There’s no downside to starting to use CSP today. Older browsers that do not recognise the header, or future additions to the specification, will safely ignore them, retaining the current website behaviour. Policies that use deprecated features will also continue to work, as the standard is being developed in a backward-compatible way. Unfortunately, our scan of the Alexa top 50K websites for CSP headers aligns with other reports: only major web properties like Twitter, Dropbox, and GitHub have adopted CSP. Smaller properties have not been as quick to do so, despite the relatively little effort needed for a potentially significant security benefit. We would be happy to see CSP adoption grow among smaller websites.

Writing correct content security policies is not always straightforward, and mistakes make it into production. Browsers will not always tell you that you’ve made a typo in your policy, which can provide a false sense of security.
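
For example, consider this policy (illustrative, with a deliberate typo):

Content-Security-Policy: default-src *; scrpit-src 'self'

The author intended script-src 'self', but browsers silently ignore the unknown scrpit-src directive, so script loading falls back to default-src * and scripts are allowed from anywhere: exactly the false sense of security described above.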

Announcing Salvation

Today, Shape Security is releasing Salvation, a FOSS general purpose Java library for working with Content Security Policy. Salvation can help with:

  • parsing CSP policies into an easy-to-use representation
  • answering questions about what a CSP policy allows or restricts
  • warning about nonsensical CSP policies and deprecated or nonstandard features
  • safely creating, manipulating, and merging CSP policies
  • rendering and optimising CSP policies

We created Salvation with the goal of being the easiest and most reliable standalone tool available for managing CSP policies. Using this library, you will not have to worry about tricky cases you might encounter when manipulating CSP policies. Working on this project helped us to identify several bugs in both the CSP specification and its implementation in browsers.

Try It Out In Your Browser

We have also released cspvalidator.org, which exposes a subset of Salvation’s features through a web interface. You can validate and inspect policies found on a public web page or given through text input. Additionally, you can try merging CSP policies using one of the two following strategies:

  • Intersection combines policies in such a way that the result behaves similarly to how browsers enforce each of the original policies individually. To better understand how it works, try intersecting default-src a b with default-src; script-src *; style-src c.
  • Union combines policies in a way that is useful when crafting a policy: start with a restrictive policy and then allow each additional resource that is needed. See how union merging is not simply concatenation by merging script-src * with script-src a in the validator.

Contribute

You can check out the source code for Salvation on GitHub or start using it today by adding a dependency from Maven Central. We welcome contributions to this open source project.

Two-Phase Parsing in the Shift JavaScript Parser

Today, we merged the two-phase-parsing branch of the Shift Parser. This branch was an experiment with a significant change to the parser’s architecture. To understand it, we first need to understand ECMAScript early errors and how they are typically handled.

Early Errors

The ECMAScript language is defined using a formal grammar, specifically a context-free grammar. A context-free grammar consists of a number of productions, each associating a symbol called a nonterminal with a (possibly empty) sequence of nonterminal and terminal symbols. Section 5.1 of the ECMAScript 6 specification explains the meaning of its grammars in great detail, but I will summarise it below.

There are two ECMAScript grammars: the lexical grammar, given in section 11, and the syntactic grammar, given in sections 12 through 15. The terminal symbols of the lexical grammar are Unicode code points (the text of your program), while the terminal symbols of the syntactic grammar are tokens: sequences of Unicode code points in the language defined by the lexical grammar.

When a program is parsed as a Script or Module (the two goal symbols of the syntactic grammar), it is first converted to a stream of tokens by repeatedly applying the lexical grammar to the remainder of the program. The token stream only represents an ECMAScript program if the token stream can be parsed by a single application of the syntactic grammar with no tokens left over.

An early error is an additional condition that must hold when a grammar production is matched. From section 5.3 of the ECMAScript 6 specification,

A conforming implementation must, prior to the first evaluation of a Script, validate all of the early error rules of the productions used to parse that Script. If any of the early error rules are violated the Script is invalid and cannot be evaluated.

Let’s take a look at an early error definition from the specification.

12.5.1 Static Semantics: Early Errors

UnaryExpression :
  ++ UnaryExpression
  -- UnaryExpression

It is an early Reference Error if IsValidSimpleAssignmentTarget of UnaryExpression is false.

This early error prevents constructions like ++0 because the named static semantic rule IsValidSimpleAssignmentTarget is true only for identifiers (a), static member access (a.b), computed member access (a[b]), and any of the previous productions enclosed in parentheses ((a)). Notably absent is Literal, the production that matches 0.
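
You can see this early error in action with the Shift parser, using its default parse export (the same entry point the reducer walkthrough later in this collection uses):

//es6
import parse from "shift-parser";

parse("++a"); // fine: `a` is a valid simple assignment target

try {
  parse("++0"); // matches the grammar, but violates the early error
} catch (e) {
  console.log(e.message); // the early Reference Error is reported
}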

The final version of the ECMAScript 6 specification has over one hundred early errors. I know this because I had to write tests for every single one of them. It took a while.

Typical Early Error Handling

Not all early errors are as simple as the one above, where the existence of the early error is apparent as soon as the production is matched. For example, an object literal must not have more than one __proto__ data property, and a label cannot be contained within another label with the same name (without crossing function boundaries). As you can imagine, tracking all of the information required to make these decisions can require a lot of complexity. Typical parsers like esprima and acorn use the following strategies:

Lookup Tables

Esprima has an object called state with a field named labelSet. Label names are added to this object when encountered, but if the label already exists, an early error is thrown. Because this state is unique only per parser instance, a stack of labelSet objects must be maintained (in this case, they are using the call stack) so that a fresh labelSet may be generated as each function is entered.
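
A simplified sketch of that strategy (our illustration, not esprima’s actual code):

//es6
// Sketch of the lookup-table strategy: one labelSet per function, stacked.
const labelSetStack = [Object.create(null)];

function enterFunction() { labelSetStack.push(Object.create(null)); }
function exitFunction() { labelSetStack.pop(); }

function declareLabel(name) {
  const labelSet = labelSetStack[labelSetStack.length - 1];
  if (labelSet[name]) {
    throw new SyntaxError(`Label '${name}' has already been declared`);
  }
  labelSet[name] = true;
}

declareLabel("loop");
enterFunction();
declareLabel("loop"); // fine: a fresh labelSet inside the nested function
exitFunction();
declareLabel("loop"); // SyntaxError: duplicate label in the same function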

Wrapped Return Values

When parsing function parameters, some early errors must be collected and saved for later, in case it is determined that the parameters are in strict mode and violate a strict-mode-only early error. Remember that a function with a "use strict" directive in its body is strict mode code – including its parameters. Because of this, the function that parses a parameter must wrap all of its return values in a wrapper object that contains both the parameter node and any early errors that would be produced in a strict mode context.

Out Parameters

To avoid wrapper objects in some esprima parsing functions, an object like { value: false } will be passed to the function as an argument. The parsing function will then be able to mutate its parameter as well as return a value, essentially returning more than one value. This strategy is used when the same object can be re-used multiple times, such as the indicator that a data property with the name __proto__ has been parsed by the object property parsing function (the mutation hasProto.value = true is idempotent).
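
A simplified sketch of the out-parameter strategy (again our illustration, not esprima’s code):

//es6
// Sketch: the caller allocates the flag object once and passes it down.
function parseObjectProperty(tokens, hasProto) {
  const property = tokens.shift(); // stand-in for real property parsing
  if (property === "__proto__") {
    if (hasProto.value) {
      // a second __proto__ data property is an early error
      throw new SyntaxError("Duplicate __proto__ property in object literal");
    }
    hasProto.value = true; // the idempotent mutation described above
  }
  return property;
}

const hasProto = { value: false }; // re-used across the object's properties
parseObjectProperty(["__proto__"], hasProto); // ok; hasProto.value is now true
parseObjectProperty(["__proto__"], hasProto); // throws SyntaxError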

Two-Phase Parsing

We initially implemented our Shift parser the same way as these others, but it ended up just as unruly and difficult to maintain. So we decided to remove all of the early error handling from the parser and add an optional second pass that detects the early errors. This means that the first pass produces an AST for any input that passes the syntactic grammar. There were some risks with this approach: we needed to preserve enough information in the AST to determine whether there should be an early error, and we needed to make sure the AST could represent all grammatically correct programs that would otherwise be invalid due to early errors.

The first step was to separate the failure tests that failed due to early errors into their own test suite. We then ensured that we had failure tests for each and every early error in the specification. We then used our new Shift reducer 2.0 to reduce the AST in a single pass into a list of early errors. The reduction state object needed fewer states than we had initially thought it would. It tracks label names, break/continue statements, labeled break/continue statements, new.target expressions, names in binding position, lexical declarations, var declarations, var declarations in for-of heads, function declarations, exported names, exported bindings, super calls, and super member accesses. All early errors relate to one or more of those pieces of information plus some context.

There weren’t many difficulties with this approach. There were a few cases that we labeled as “early grammar errors”: grammatically correct productions that our AST could not represent, such as 0 = 0 (because AssignmentExpression requires a Binding on its left-hand side) and (...a) (because ArrowExpression must have a body). Additionally, we labeled some early errors as “early tokenisation errors”, including a\u0000, an identifier with a Unicode escape sequence whose value is disallowed by an early error.

What We’ve Gained

So was all this trouble worth it? We certainly think it was! First off, the parser is much cleaner, easier to read, and easier to maintain. But if we wanted to, we wouldn’t even have to maintain the parser: now that there are no special grammar exceptions, we could replace the entire hand-written parser with one generated by a parser generator. That should be even more easily maintainable and more performant.

Speaking of performance, the parser with early errors disabled is over twice as fast as it was before. In fact, it is faster than any other parser we tested. For our business use, we will almost always want to run in this “loose” mode, since we don’t have to worry about being very strict with the inputs we accept. Now one can choose between extremely correct and extremely fast parsing (and even in fast mode, you will never fail to parse a valid program).

Error reporting has also improved with the two-phase parser, since the separate early error checker will collect and report all problems due to early errors in a syntactically valid program instead of just reporting the first one. Finally, this opens up the possibility for the first phase to be extensible, allowing one to add support for JSX, macros, etc. through plugins instead of forks!

Summary

We took a pretty big risk with an experimental new parsing strategy, and it has paid off. We’re pretty excited here at Shape about the new possibilities this has opened for us, and will be working with the maintainers of other popular parsers to popularise this approach.

Browser Compatibility Testing with BrowserStack and JSBin

There are some great services out there that go a long way toward outlining browser compatibility of JavaScript, HTML, and CSS features. Can I Use? is one of the most thorough, and kangax’s ES5 and ES6 compatibility tables are extremely valuable and regularly updated. The Mozilla Developer Network also has a browser support matrix for almost every feature it documents, but it is a public wiki and much of the information comes from the previously mentioned sources. Outside of those resources, reliability starts dropping quickly.

What if the feature you’re looking for isn’t popular enough to be well documented? Maybe you’re looking for confirmation of some combination of esoteric behaviors that isn’t documented anywhere?

BrowserStack & JSBin

BrowserStack has a service that allows you to take screenshots across a massive number of browsers. Services like these are often used to quickly verify that sites look the same or similar across a wide array of browsers and devices.

While that may not sound wildly exciting for our use case, we can easily leverage a service like JSBin to whip up a test case that exposes a pass or failure in a visually verifiable way.

Using a background color on the body (red for unsupported, green for supported), we can default the HTML body to the ‘unsupported’ class and, upon a successful feature test, simply change the body’s class to ‘supported’. For example, we recently wanted to test browser support for document.currentScript, a feature that MDN says is not well supported but that, according to actual, real-world Chrome and IE/Spartan developers, is widely supported and has been for years!

Well, we had to know, so we put together a JSBin to quickly test this feature.


<html>
<head>
  <meta charset="utf-8">
  <title>JS Bin</title>
  <style>
    .unsupported {
      background-color: red;
    }
    .supported {
      background-color: green;
    }
  </style>
</head>
<body class=unsupported>
  <script id=foo></script>
  <script id=target>
    if (document.currentScript && document.currentScript.id === 'target')
      document.body.className = 'supported';
  </script>
  <script id=bar></script>
</body>
</html>

After the bin was saved and made public, we selected a wide range of browsers to test in BrowserStack and passed it the output view of the JSBin snippet. In this case we wanted to test all the way back to IE6, a range of iOS and Android browsers, Opera, Safari, and a random sampling of Chrome and Firefox versions.

(Screenshot: selecting browsers for the screenshot run in BrowserStack.)

After a couple of minutes, we got our results back from BrowserStack and could immediately see what the support matrix looks like.

(Screenshot: BrowserStack screenshot results.)

This binary pass/fail approach may not scale too well, but you could get creative and produce large text, a number, or even an image that is easily parsable at varying resolutions, like a QR code, to convey test results. The technique is certainly valuable for anyone who doesn’t have an in-house cluster of devices, computers, and browsers at their disposal simply for browser testing. Credit goes to Michael Ficarra (@jspedant) for the original idea.

Oh, and by the way, document.currentScript is definitely not widely supported.

Reducing with the Shift Reducer

What is a Reducer?

A reducer is an actor that takes something large and turns it into something smaller. In programming, it is a construct that recursively applies a function over a data structure in order to produce a single value.

In JavaScript, you could reduce an array of integers to a single sum with the following code.

var integers = [1, 2, 3, 4, 5, 6];
var sum = integers.reduce(function(memo, next){ return memo + next; }, 0);
// sum === 21

Shift Reducer

Shape Security has provided a reducer to use in building tooling for the Shift format AST. The reducer folds a Shift format AST into a summary value, much like Array.prototype.reduce folds an array. Of course, reducing an array is much less complex than reducing an AST. Only one function is required to reduce an array, while reducing an AST requires one function for each different type of node.

Shape’s reducer library exposes a single function that runs the reduction, and two base structures that are meant to be extended with your own reducing behaviors: Reducer and MonoidalReducer.

Reducer

Use Reducer when the reduction requires unique behavior for each type of node; it is a clean slate. Extending Reducer requires overriding every single reduction method (reduceXXX). Code generation and AST serialisation are examples of when it is appropriate to base your reducer on Reducer.

MonoidalReducer

The majority of tools built on Shift will benefit from basing their reducer on MonoidalReducer. Extending MonoidalReducer requires that the summary value returned by each reduction method is a Monoid. Its default implementations of the reduction methods take advantage of the monoidal structure of your summary value, so you only need to override the reduction methods for the pertinent nodes; for all others, the Monoid’s identity is used.

That may have been a lot to take in. Don’t worry if you’re not familiar with the terminology! As a programmer, you likely run into Monoids every single day, but the term can cause confusion. Let’s see if we can clear up the term a little bit.

Monoids

A monoid is a structure that relates the elements of a set with a closed, associative, binary operation, which we will call append, coupled with one special element of that set, which we will call the identity element.

Let’s break monoids down a little further.

What is a binary operation?

A binary operation is an operation that operates on two inputs.

0 + 1; // + is a binary operator
function append(a, b){} // append is a binary function

What is a closed operation?

An operation is closed if performing the operation on members of a set always produces a member of the same set. For example, adding two integers always produces another integer, so addition is closed over the integers.

What is associativity?

We learned the concept of associativity back in elementary school, and it is core to our understanding of algebraic operations. Associativity, as the name implies, means that the grouping of operations does not affect the result.

(a + b) + c === a + (b + c)

Remember, associativity is not commutativity. That would mean that the order of the values given to the operation does not affect the result.

a + b + c === c + b + a

What is an identity?

An identity is a value for a specific operation that, when passed to the operation along with any other value, returns that other value. You may remember this via the additive identity, 0, or the multiplicative identity, 1.

x + 0 === x;
0 + x === x;

x * 1 === x;
1 * x === x;

append(x, identity) === x;
append(identity, x) === x;

Putting it all together

Using the above examples we could write out the Sum Monoid in arithmetic expressions.

sumIdentity = 0
sumAppend(x, y) = x + y

Or we could write the Sum Monoid JavaScript implementation. For this, we will use the conventional Fantasy Land names, empty and concat.

//es6
class Sum {
  constructor(number) {
    this.value = number;
  }
  // always return the identity element
  static empty() {
    return new Sum(0);
  }
  // the binary operation acts on its `this` value and its parameter
  concat(other) {
    return new Sum(this.value + other.value);
  }
}

new Sum(5).concat(new Sum(2)).value; // 7

Walkthrough: Making something with the MonoidalReducer

Now that we understand monoids, let’s walk through making a small program with the Shift MonoidalReducer that counts how many identifiers are in a program.

Setup

Install dependencies.

$ npm install --save shift-reducer shift-parser 6to5

Making an Identifier counter

First we need to flesh out our basic program.

//es6
import parse from "shift-parser";
import reduce, {MonoidalReducer} from "shift-reducer";

// a monoid over integers and addition
class Sum {
  constructor(number) {
    this.value = number;
  }
  // by default reduce any node to the identity, zero
  static empty() {
    return new Sum(0);
  }
  // combine Sum instances by summing their values
  concat(other) {
    return new Sum(this.value + other.value);
  }
}

class IdentifierCounter extends MonoidalReducer {
  constructor() {
    // let MonoidalReducer know that we're going to use Sum as our monoid
    super(Sum);
  }

  // a convenience function for performing the reduction and extracting a result
  static count(program) {
    return reduce(new this, program).value;
  }

  // add 1 to the count for each IdentifierExpression node
  reduceIdentifierExpression(node) {
    return new Sum(1);
  }

  /*
    In this case, the only node we care about overriding is the
    IdentifierExpression node; the rest can be reduced using the default
    methods from MonoidalReducer.
  */
}

// test program code
var program = "function f() { hello(world); }";
console.dir(IdentifierCounter.count(parse(program)));

Run it!

$ node_modules/.bin/6to5-node count-identifiers.js

Wrapping Up

Let’s walk through what’s been done. We’ve created a new Reducer by extending the MonoidalReducer, overridden the necessary reduction methods (in this case only reduceIdentifierExpression), and parsed and run our new reducer over a program.

We wrote this example in ES6 because we believe it’s clearer. An ES5 version of the identifier counter is available in this gist.

Taking it Further

At this point, we’ve used a fairly trivial example in order to expose the fundamentals of using the MonoidalReducer. Next, we will look at the design of a more significant project that makes use of the MonoidalReducer: the Shift Validator.

Shift Validator

The Shift Validator validates a Shift format AST from the bottom up. But how does it do this when many of the restrictions it enforces are context-sensitive? The ValidationContext object that the Validator uses allows possible errors to be registered on it; if we later determine, with new information about a possible error’s context, that the error does not actually apply, the object can clear it as well. Only when we are certain an error will not be cleared do we move it from its temporary error list to the official error list in the ValidationContext object. Let’s look at a concrete example:

When the Validator reduces a ReturnStatement, we call the addFreeReturnStatement helper method of our ValidationContext state object, giving it an error saying that this ReturnStatement must be contained within a function (top-level return is illegal in JavaScript). We don’t yet know whether the ReturnStatement is actually in an illegal position, so we assume it is until we better understand its context. In the reduction methods for FunctionDeclaration, FunctionExpression, Getter, and Setter nodes, we then call the clearFreeReturnStatements helper method of our ValidationContext state object, clearing out all of the ReturnStatement errors collected while reducing ReturnStatement nodes below us in the AST. Finally, when we reduce a Script node (the head of the AST), we move any remaining ReturnStatement errors from their temporary holding list to the confirmed error list using the enforceFreeReturnStatementErrors helper method. We do this at that point because we know we won’t be reducing any more functions that could cancel out a ReturnStatement error.
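
To make the flow concrete, here is a usage sketch. We assume the validator’s default export is a boolean isValid check over a Shift AST, and the node shapes follow the Shift AST spec of the time; consult the shift-validator project for its exact API:

//es6
import parse from "shift-parser";
import isValid from "shift-validator"; // assumed default export

// A hand-built (or badly transformed) AST can break context-sensitive rules
// even though every node is locally well-formed:
const ast = {
  type: "Script",
  directives: [],
  statements: [
    // A top-level return is registered as a possible error during reduction
    // and is never cleared by an enclosing function, so validation fails.
    { type: "ReturnStatement", expression: null }
  ]
};

console.log(isValid(ast));          // false: free return statement
console.log(isValid(parse("f()"))); // true: parser output is always valid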

Final Round Up

To pull it all together, we’ve gone over the Shift Reducer and MonoidalReducer, both of which can be used to build tooling based on the Shift AST. We’ve covered the fundamentals behind the MonoidalReducer and explored a simple example as well as a more complex one, the Shift Validator. Hopefully you now feel comfortable building your own tools based on Shift’s AST.