Announcing SuperPack

Shape Security is proud to announce the release of SuperPack, a language-agnostic schemaless binary data serialisation format.

First of all, what does it mean to be schemaless?

Data serialisation formats like JSON or MessagePack encode values in such a way that the structure of those values (the schema) can be determined by simply observing the encoded value. These formats, like SuperPack, are said to be “schemaless”.

In contrast, a schema-driven serialisation format such as Protocol Buffers makes use of ahead-of-time knowledge of the schema to pack the encoded values into one extremely efficient byte sequence free of any schema markers. Schema-driven encodings have some obvious downsides. The schema must remain fixed (ignoring versioning), and if the encoding party is not also the decoding party, the schema must be shared between them and kept in sync.

Choose the right tool for the job. Usually, it is better to choose a schema-driven format if it is both possible and convenient. For other occasions, we have a variety of schemaless encodings.

What separates it from the others?

In short, SuperPack payloads are very compact without losing the ability to represent any type of data you desire.


The major differentiator between SuperPack and JSON or bencode is that it is extensible. Almost everyone has had to deal with JSON and its very limited set of data types. When you try to serialise a JS undefined value, a regular expression, a date, a typed array, or countless other more exotic data types through JSON, your JSON encoder will either give you an error or give you an encoding that will not decode back to the input value. You will never have that problem with SuperPack.
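
The JSON behaviour described above is easy to reproduce. The short sketch below uses only the standard `JSON` object; the helper name `roundTrip` is ours, introduced purely for illustration.

```javascript
// Round-trip a value through JSON and return what comes back.
const roundTrip = (value) => JSON.parse(JSON.stringify(value));

// undefined-valued properties are silently dropped.
console.log(roundTrip({ a: undefined })); // {}

// A regular expression becomes an empty object.
console.log(roundTrip(/ab+c/gi)); // {}

// A Date comes back as an ISO 8601 string, not as a Date.
console.log(typeof roundTrip(new Date(0))); // string

// A typed array comes back as a plain object keyed by index.
console.log(roundTrip(new Uint8Array([1, 2]))); // { '0': 1, '1': 2 }

// A top-level undefined has no JSON encoding at all.
console.log(JSON.stringify(undefined)); // undefined
```

In none of these cases does the decoded value equal the input, which is the gap an extensible format is meant to close.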

SuperPack doesn’t have a very rich set of built-in data types. Instead, it is extensible. Say we wanted to encode/decode (aka transcode) regular expressions, a data type that is not natively supported by SuperPack. This is all you have to do:

  // extension point: 0 through 127
  // detect values which require this custom serialisation
  x => x instanceof RegExp,
  // serialiser: return an intermediate value which will be encoded instead
  r => [r.source, r.flags],
  // deserialiser: from the intermediate value, reconstruct the original value
  ([pattern, flags]) => RegExp(pattern, flags),

And if we want to transcode TypedArrays:

  a => [a[Symbol.toStringTag], a.buffer],
  ([ctor, buffer]) => new self[ctor](buffer),


The philosophy behind SuperPack is that, even if you cannot predict your data’s schema in advance, the data likely has structures or values that are repeated many times in a single payload. Also, some values are just very common and should have efficient representations.

Numbers between -15 and 63 (inclusive) are a single byte; so are booleans, null, undefined, empty arrays, empty maps, and empty strings. Strings which don’t contain a null (\0) character can avoid storing their length by using a C-style null terminator. Boolean-valued arrays and maps use a single bit per value.
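
To make the "single bit per value" idea concrete, here is a minimal sketch of packing a boolean array into bytes. The bit order (most significant bit first) and the function names are assumptions made for illustration; the specification defines the actual layout SuperPack uses.

```javascript
// Pack an array of booleans into bytes, eight values per byte,
// most significant bit first (an assumed bit order).
function packBooleans(values) {
  const bytes = new Uint8Array(Math.ceil(values.length / 8));
  values.forEach((v, i) => {
    if (v) bytes[i >> 3] |= 0x80 >> (i & 7);
  });
  return bytes;
}

// Unpack `count` booleans from the packed bytes.
function unpackBooleans(bytes, count) {
  const values = [];
  for (let i = 0; i < count; i++) {
    values.push((bytes[i >> 3] & (0x80 >> (i & 7))) !== 0);
  }
  return values;
}

const flags = [true, false, true, true, false, false, false, false, true];
const packed = packBooleans(flags);
console.log(packed.length); // 2 (bytes for 9 values)
console.log(unpackBooleans(packed, flags.length));
```

Note that the decoder needs the element count separately, since the final byte is padded with unused bits.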

When an encoder sees multiple strings with the same value, it will store them in a lookup table, and each reference will only be an additional two bytes. Note that this string deduplication optimisation could have been taken further to allow deduplication of arbitrary structures, but that would allow encoders to create circular references, which is something we’d like to avoid.

When an encoder sees multiple maps with the same set of keys, it can make an optional optimisation that is reminiscent of the schema-directed encoding approach but with the schema included in the payload. Instead of storing the key names once for each map, it can use what we call a “repeated keyset optimisation” to refer back to the object shape and encode its values as a super-efficient contiguous byte sequence.
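
The idea behind the repeated keyset optimisation can be sketched in a few lines. This is a hypothetical illustration in plain JavaScript objects, not the SuperPack wire format: the names `encodeMaps` and `decodeMaps` and the `{ keysets, rows }` shape are assumptions made for the example.

```javascript
// Illustration only: store each distinct keyset once, then represent every
// map as [keysetIndex, ...values] instead of repeating its key names.
function encodeMaps(maps) {
  const keysets = [];                 // distinct keysets, in order of first appearance
  const indexBySignature = new Map(); // JSON of the key list -> index into keysets
  const rows = maps.map((m) => {
    const keys = Object.keys(m);
    const signature = JSON.stringify(keys);
    if (!indexBySignature.has(signature)) {
      indexBySignature.set(signature, keysets.length);
      keysets.push(keys);
    }
    return [indexBySignature.get(signature), ...keys.map((k) => m[k])];
  });
  return { keysets, rows };
}

// Rebuild the original maps from the shared keysets and the value rows.
function decodeMaps({ keysets, rows }) {
  return rows.map(([index, ...values]) => {
    const out = {};
    keysets[index].forEach((k, i) => { out[k] = values[i]; });
    return out;
  });
}

const points = [{ x: 1, y: 2 }, { x: 3, y: 4 }, { x: 5, y: 6 }];
const encoded = encodeMaps(points);
console.log(encoded.keysets); // [ [ 'x', 'y' ] ]
console.log(encoded.rows);    // [ [ 0, 1, 2 ], [ 0, 3, 4 ], [ 0, 5, 6 ] ]
```

In this sketch two maps share a keyset only when their keys appear in the same order; a real encoder could choose a different notion of "same set of keys".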

The downside of this compactness is that, unlike JSON, YAML, or edn, SuperPack payloads are not human-readable.


After surveying existing data serialisation formats, we knew we could design one that would be better suited to our particular use case. And our use case is not so rare as to make SuperPack only useful to us; it is very much a general purpose serialisation format. If you want to create very small payloads for arbitrary data of an unknown schema in an environment without access to a lossless data compression algorithm, SuperPack is for you. If you want to see a more direct comparison to similar formats, see the comparison table in the specification.

I’m sold. How do I use it?

As of now, we have an open-source JavaScript implementation of SuperPack.

$ npm install --save superpack

Pokémon Go API – A Closer Look at Automated Attacks

Tens of millions of people are out exploring the new world of Pokémon Go. It turns out that many of those users are not people at all, but automated agents, or bots. Game-playing bots are not a new phenomenon, but Pokémon Go offers some new use cases for bots. These bots have started interfering with everyone’s fun by overwhelming Pokémon Go servers with automated traffic. Pokémon Go is a perfect case study in how automated attacks and defenses work on mobile APIs. At Shape we deal with these types of attacks every day, so we thought we would take a closer look at what happened with the Pokémon Go API attacks.

Pokémon Go API Attack

Niantic recently published a blog post detailing the problems bots were creating through the generation of automated traffic, which actually hindered their Latin America launch. The chart included in the post depicts a significant spatial query traffic drop since Niantic rolled out countermeasures for the automation at 1pm PT 08/03. The automated traffic appears to have been about twice that of the traffic from real human players. No wonder Pokémon Go servers were heavily overloaded in recent weeks.

Figure 1. Spatial query traffic dropped more than 50% since Niantic started to block scrapers. Source: Niantic blog post

Getting to Know The Pokémon Bots

There are two types of Pokémon bots. The first type of bot automates regular gameplay and is a common offender on other gaming apps, automating activities such as walking around and catching Pokémon. Examples of such bots include MyGoBot and PokemonGo-Bot. But Pokémon Go has inspired the development of a new type of bot, called a Tracker or Mapper, which provides the location of Pokémon. These bots power Pokémon Go mapping services such as Pokevision and Go Radar.

How a Pokémon Go Bot Works

A mobile API bot is a program that mimics communication between a mobile app and its backend servers—in this case servers from Niantic. The bot simply tells the servers what actions are taken and consumes the server’s response.

Figure 2 shows a screenshot of a Pokémon Go map which marks nearby Pokémon within a 3-footstep range of a given location. To achieve this, the bot makers usually follow these steps:

  1. Reverse-engineer the communication protocol between the mobile app and the backend server. The bot maker plays the game, captures the communications between the app and its server, and deciphers the protocol format.
  2. Write a program to make a series of “legitimate” requests to backend servers to take actions. In this case, getting the locations of nearby Pokémon is a single request with a targeted GPS coordinate, without actually walking to the physical location. The challenge for the bot is to bypass the server’s detection and look like a real human.
  3. Provide related features such as integration with Google Maps, or include the bot’s own mapping functionality for the user.

Figure 2. Screenshot of a Pokémon Go map

Mobile App Cracks and Defenses

Using the Pokémon Go app as an example, let’s examine how a mobile app is cracked by reverse engineering to reveal its secrets. Since attackers mainly exploited Pokémon Go’s Android app, let’s focus on Android app cracks and defenses.

Reverse-Engineering the Protocol

The Pokémon Go app and the backend servers communicate using ProtoBuf over SSL. ProtoBuf defines the data format transferred on the network. For example, here is an excerpt of the ProtoBuf definition for player stats:

message PlayerStats {
  int32 level = 1;
  int64 experience = 2;
  int64 prev_level_xp = 3;
  int64 next_level_xp = 4;
  float km_walked = 5;
  int32 pokemons_encountered = 6;
  int32 unique_pokedex_entries = 7;
  // ... remaining fields omitted from this excerpt
}

Pokémon Go was reverse-engineered and published online by POGOProtos within only two weeks. How did this happen so quickly? Initially, Niantic didn’t use certificate pinning.

Certificate pinning is a common approach used against Man-in-the-Middle attacks. In short, a mobile app only trusts server certificates which are embedded in the app itself. Without certificate pinning protection, an attacker can easily set up a proxy such as Mitmproxy or Fiddler, and install the certificate crafted by the attacker to her phone. Next she can configure her phone to route traffic through the proxy and sniff the traffic between the Pokémon Go app and the Niantic servers. There is actually a Pokémon Go-specific proxy tool that facilitates this called pokemon-go-mitm.
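
As a rough illustration of the pinning check itself, the sketch below compares the SHA-256 fingerprint of a presented certificate against a set of fingerprints shipped with the app. It uses Node's built-in crypto module; the function names and the placeholder byte strings are assumptions for the example, and it is not how the Pokémon Go app implements pinning. Many real implementations pin the public key rather than the whole certificate.

```javascript
const crypto = require('crypto');

// SHA-256 fingerprint of a certificate's raw (DER) bytes, hex-encoded.
function fingerprint(derBytes) {
  return crypto.createHash('sha256').update(derBytes).digest('hex');
}

// Accept a server certificate only if its fingerprint is in the pinned set.
function isPinned(derBytes, pinnedFingerprints) {
  return pinnedFingerprints.has(fingerprint(derBytes));
}

// Stand-in byte strings; a real app would embed fingerprints of real certificates.
const genuineCert = Buffer.from('placeholder bytes for the genuine certificate');
const proxyCert = Buffer.from('placeholder bytes for a proxy-issued certificate');
const pins = new Set([fingerprint(genuineCert)]);

console.log(isPinned(genuineCert, pins)); // true
console.log(isPinned(proxyCert, pins));   // false
```

A certificate installed on the phone by the attacker is trusted by the operating system, but it fails this check because its fingerprint is not in the embedded set.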

On July 31, Niantic made a big change on both its server and its Pokémon Go app. Pokémon Go 0.31.0 was released with certificate pinning protection. Unfortunately, the cat was out of the bag and the communication protocol was already publicly available on GitHub. In addition, implementing certificate pinning correctly is not always easy. In the later sections, we will cover some techniques commonly used by attackers to bypass certificate pinning.

APK Static Analysis

The Android application package (APK) is the package file format used by Android to install mobile apps. Android apps are primarily written in Java, and the Java code is compiled into dex format and built into an apk file. In addition, Android apps may also call shared libraries which are written in native code (Android NDK).

Dex files are known to be easily disassembled into smali code, using tools such as Baksmali. Then tools such as dex2jar and jd-gui further decompile the dex file into Java code, which is easy to read. Using these techniques, attackers decompiled the Pokémon Go Android app (versions 0.29.0 and 0.31.0) into Java code. The example code shown below implements certificate pinning from the class.

public void checkServerTrusted(X509Certificate[] chain, String authType) throws CertificateException {
  synchronized (this.callbackLock) {
    nativeCheckServerTrusted(chain, authType);
  }
}

When application source code is exposed, reverse engineering becomes a no-brainer. Pokemon Go Xposed used fewer than 100 lines of Java code to fool the Pokémon Go app into believing the certificate from the MITM proxy was the authentic certificate from Niantic.

How did Pokemon Go Xposed achieve this? Quite easily. The tool simply hooks to the call of the function checkServerTrusted mentioned in the above code snippet. The hook changes the first parameter of the function, chain, to the value of Niantic’s certificate. This means that no matter what unauthorized certificate the proxy uses, the Pokémon Go app is tricked into trusting the certificate.

There are many tools that can help make disassembly and static analysis by attackers more difficult. ProGuard and DexGuard are tools that apply obfuscation to Java code and dex files. Obfuscation makes the code difficult to read, even in decompiled form. Another approach is to use Android packers to encrypt the original classes.dex file of Android apps. The encrypted dex file is decrypted in memory at runtime, making static analysis extremely hard, if not impossible, for most attackers. Using a native library is another way to significantly increase the difficulty of reverse-engineering the app.

Reverse-Engineering the Native Library

The most interesting cat-and-mouse game between the pokemongodev hackers and Niantic was around the field named “Unknown6”, which was contained in the signature sent in the map request to get nearby Pokémon at a location. “Unknown6” is one of the unidentified fields in the reverse-engineered protobuf. Initially, it wouldn’t matter what value Unknown6 was given; Niantic servers just accepted it. Starting at 1pm PT on 08/03, all Pokémon Go bots suddenly could not find any Pokémon, which eventually resulted in the significant query drop in Figure 1.

The hackers then noticed the importance of the “Unknown6” field in the protocol, and initially suspected Unknown6 to be some kind of digest or HMAC to validate the integrity of the request. This triggered tremendous interest from the pokemongodev community and an “Unknown6” team was quickly formed to attempt to crack the mysterious field. The Discord channel went private due to the wide interest from coders and non-programmers, but a live update channel kept everybody updated on the progress of the cracking effort. After 3 days and 5 hours, in the afternoon of 08/06, the Unknown6 team claimed victory, releasing an updated Pokémon Go API that was once again able to retrieve nearby Pokémon.

While the technical writeup of the hack details has yet to be released, many relevant tools and technologies were mentioned on the forums and the live update. IDA Pro from Hex-Rays is a professional tool that is able to disassemble the ARM code of a native library, and the new Hex-Rays decompiler can decompile a binary code file into a C-style format. These tools allow attackers to perform dynamic analysis, debugging the mobile app and its libraries at run time. Of course, even with such powerful tools, reverse-engineering a binary program is still extremely challenging. Without any intentional obfuscation, the disassembled or decompiled code is already hard to understand, and the code size is often huge. As an illustration of the complex and unpredictable work required, the live update channel and a subsequent interview described how the encryption function of “Unknown6” was identified within hours but the team spent an extensive amount of additional time analyzing another field named “Unknown22”, which turned out to be unrelated to Unknown6.

As a result, obfuscation still has many practical benefits for protecting native libraries. A high level of obfuscation in a binary may increase the difficulty of reverse-engineering by orders of magnitude. However, as illustrated by the many successful cracks of serial codes for Windows and Windows applications, motivated mobile crackers are often successful.

Server Side Protection

Server-side defenses work in a completely different way than client-side defenses. Here are some of the techniques used in the context of protecting Pokémon Go’s mobile API.

Rate limiting

Rate limiting is a common approach to try to stop, or at least slow down, automated traffic. In the early days, Pokémon scanners were able to send tens of requests per second, scan tens of cells, and find every Pokémon.

On 07/31, Niantic added rate limiting protections. If one account sent multiple map requests within ~5 seconds, Niantic’s servers would only accept the first request and drop the rest. Attackers reacted to these rate limits by:

  a) Adding a delay (5 seconds) between map requests from their scanning programs
  b) Using multiple accounts and multiple threads to bypass the rate limit
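
A per-account limiter of the kind described here can be sketched as follows. The class name, the in-memory map, and the exact 5000 ms window are assumptions for illustration; Niantic's actual implementation is not public.

```javascript
// Hypothetical per-account limiter: accept at most one map request per window.
class MapRequestLimiter {
  constructor(windowMs = 5000) {
    this.windowMs = windowMs;
    this.lastAccepted = new Map(); // accountId -> timestamp of last accepted request
  }

  // Returns true if the request is accepted, false if it should be dropped.
  allow(accountId, nowMs = Date.now()) {
    const last = this.lastAccepted.get(accountId);
    if (last !== undefined && nowMs - last < this.windowMs) return false;
    this.lastAccepted.set(accountId, nowMs);
    return true;
  }
}

const limiter = new MapRequestLimiter();
console.log(limiter.allow('account-a', 0));    // true
console.log(limiter.allow('account-a', 1000)); // false, inside the window
console.log(limiter.allow('account-b', 1000)); // true, different account
console.log(limiter.allow('account-a', 5000)); // true, window elapsed
```

The third call shows why the limit is easy to sidestep: the state is keyed by account, so spreading requests over many accounts multiplies the allowed rate.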

In the case of Pokémon Go, the use of rate limiting just opened another battleground for automated attacks: automated account creation. It quickly became clear that while rate limiting is a fine basic technique to control automation from overly aggressive scrapers or novice attackers, it does not prevent advanced adversaries from getting automated requests through.

IP Blocking

Blocking IPs is a traditional technique used by standard network firewalls or Web Application Firewalls (WAFs) to drop requests from suspicious IPs. There are many databases that track IP reputation, and firewalls and WAFs can retrieve such intelligence periodically.

In general, IP-based blocking is risky and ineffective. Blindly blocking an IP with a large volume of traffic may end up blocking the NAT of a university or corporation. Meanwhile, many Pokémon bots or scanners may use residential dynamic IP addresses. These IPs are shared by the customers of the ISPs, so banning an IP for a long time may block legitimate players.

Hosting services such as Amazon Web Services (AWS) and Digital Ocean are also sources for attackers to get virtual machines as well as fresh IPs. When attackers use stolen credit cards, they can even obtain these resources for free. However, legitimate users rarely use hosting services to browse the web or play games, so blocking IPs from hosting services is a relatively safe defense and is commonly used on the server side. According to this forum post, Niantic may decide to ban IPs from AWS.

Behavior Analysis

Behavior analysis is usually the last line of defense against advanced attackers that are able to bypass other defenses. Bots behave very differently from humans. For example, a real person cannot play the game 24×7, or catch 5 Pokémon in one second. While behavioral analysis sounds like a promising approach, building an accurate detection system that can handle a data volume as large as Pokémon Go’s isn’t an easy task.

Niantic just implemented a soft ban on cheaters who use GPS spoofing to “teleport” (i.e., suddenly moving at an impossibly fast speed). It was probably a “soft ban” because of false positives; families share accounts and GPS readings can be inaccurate, making some legitimate use cases seem like bots.
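
One simple behavioral signal of this kind is the speed implied by two consecutive location readings. The sketch below uses the standard haversine formula for great-circle distance; the function names, the reading shape `{ lat, lon, timeMs }`, and the 300 km/h threshold are assumptions chosen for illustration and say nothing about the rules Niantic actually applies.

```javascript
const EARTH_RADIUS_KM = 6371;
const toRadians = (degrees) => (degrees * Math.PI) / 180;

// Great-circle distance between two { lat, lon } points in kilometres (haversine formula).
function distanceKm(a, b) {
  const dLat = toRadians(b.lat - a.lat);
  const dLon = toRadians(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(a.lat)) * Math.cos(toRadians(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Flag consecutive readings whose implied speed exceeds the threshold.
// The 300 km/h default is an arbitrary illustrative value.
function looksLikeTeleport(prev, next, maxSpeedKmh = 300) {
  const hours = (next.timeMs - prev.timeMs) / 3600000;
  if (hours <= 0) return distanceKm(prev, next) > 0; // moved without time passing
  return distanceKm(prev, next) / hours > maxSpeedKmh;
}

const sanFrancisco = { lat: 37.7749, lon: -122.4194, timeMs: 0 };
const newYork = { lat: 40.7128, lon: -74.006, timeMs: 60000 };
const nearby = { lat: 37.7759, lon: -122.4194, timeMs: 60000 };

console.log(looksLikeTeleport(sanFrancisco, newYork)); // true
console.log(looksLikeTeleport(sanFrancisco, nearby));  // false
```

A fixed threshold like this also illustrates the false-positive problem: a player on a plane or a noisy GPS fix can exceed it, which is one reason to respond with a temporary restriction rather than a permanent ban.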

Around August 12, 2016, Niantic posted a note on its website outlining that violation of its terms of service may result in a permanent ban on a Pokémon Go account. Multiple ban rules targeting bots were also disclosed unofficially. For example, the Pokémon over-catch rule bans accounts when they catch over a thousand Pokémon in a single day. In addition, Niantic encourages legitimate players to report cheaters or inappropriate players.

In our experience, behavioral modeling-based detection can be extremely effective but is often technically or economically infeasible to build in-house. As Niantic commented in their blog post, “dealing with this issue also has opportunity cost. Developers have to spend time controlling this problem vs. building new features.” The bigger issue is that building technology to defend against dedicated, technically savvy adversaries armed with botnets and other tools designed to bypass regular defenses requires many highly specialized skillsets and a tremendous development effort.

The Game Isn’t Over

As Pokémon Go continues to be loved by players, the game between bot makers and Niantic will also continue. Defending against automated traffic represents a challenge not only for gaming but for all industries. Similar attack and defense activities are taking place across banking, airline, and retailer apps where the stakes are orders of magnitude higher than losing a few Snorlaxes. Bots and related attack tools aren’t fun and games for companies when their customers and users cannot access services because of unwanted automated traffic.

The Half-Day Attack: From Compromise to Cash with Sentry MBA


Sentry MBA, an automated attack tool used to take over accounts on major websites, makes cybercrime accessible to legions of attackers across the globe. Sentry MBA illustrates the pivotal role automation plays in online attacks and shows how cybercrime is increasingly compartmentalized and commoditized.

Allow me to illustrate with a short story.

Let’s say you’re a would-be cybercriminal looking to make some quick cash. There are many ways to make money on the Internet – especially if you think shoplifting’s a harmless recreational activity – so you hatch a plan to break into your favorite online electronics retailer’s website, order a few televisions, and have them shipped somewhere you can grab them.

But you have a problem: finding website vulnerabilities requires technical skills you just don’t possess. And even if you were a sophisticated cybercriminal, who really wants to spend their valuable time crafting SQL injection or cross-site scripting attacks? It’s far easier to just hijack a few user accounts. The authors of Verizon’s data breach report said as much: “With so many credential lists available for sale or already in the wild, why should a criminal actually earn his/her keep through SQL injection when a simple login will suffice?”

After doing some research, you may stumble across a tool like Sentry MBA. You might not have the technical expertise to research and hand-craft a targeted online exploit, but with Sentry MBA you can launch sophisticated and damaging attacks that are capable of penetrating the defenses employed by major corporations.

It’s a numbers game that works because so many people use the same passwords for multiple online accounts. Any list of stolen credentials will almost certainly include some that allow you to access accounts on the site you’ve targeted. Once you’re in, the retailer is your oyster. You can order any fancy gadget you please with the victim’s stored credit card number, change the ship-to address for your delivery convenience, and resell the goods for cash. Once you’ve maxed out one credit card, just rinse and repeat for all the accounts you cracked.

Sentry MBA automates the process of testing millions, or tens of millions, of username/password combinations to see which ones work. Without automation that task is impossibly time-consuming.

Shape Security protects websites and mobile applications by detecting and preventing automated attacks, including credential stuffing attempts. Shape analyzed a sample of our customer data consisting of six billion login and search page submissions from December of 2015 through January of 2016 and found that Sentry MBA attacks were commonplace. Here are some anonymized examples of the attacks we found:

  • Over one week in December, cybercriminals made over 5 million login attempts at a Fortune 100 B2C website using multiple attack groups and hundreds of thousands of proxies located throughout the world
  • Over two days in January, a large retailer saw two major Sentry MBA attacks with over 20,000 total login attempts
  • During one day in January, a large retailer witnessed over 10,000 login attempts using Sentry MBA and over 1,000 proxies
  • Two attacks in December highlight how cybercriminals are turning their attention to mobile APIs. The first attack, focused on the target’s traditional website application, made over 30,000 login attempts using proxies located in eastern Europe. The second attack, focused on the target’s mobile API, made over 10,000 login attempts on a daily basis. Both attacks shared hundreds of IP addresses and other characteristics, indicating the same actors may have been responsible.

By reducing the level of technical skill needed to mount a sophisticated cyberattack, Sentry MBA brings damaging attacks within reach of more and more cybercriminals. The open web and darknet are filled with forums offering working Sentry MBA configuration files for specific sites and credential lists to try. These underground markets, combined with automated tools like Sentry MBA, create a new cybersecurity reality where devastating online attacks can be launched by any individual with minimal resources.

The best way to stop Sentry MBA attacks is to detect and deflect them before they take over accounts through your website or mobile application API. Shape Security protects you and your customers from online fraud committed by cybercriminals using automated attack frameworks, whether they are Sentry MBA or other toolkits.

For an in-depth exploration of Sentry MBA, please see our post from our research team: A look at Sentry MBA.

Announcing Bandolier

Today Shape Security is releasing Bandolier, a Java library that bundles JavaScript written with ES2015 module syntax.

Bandolier takes JavaScript code like this:

import { b } from './foo.js'
console.log(42 + b);

where the foo module is defined as:

// foo.js
export var b = 100;

and produces a single script without ES2015 module syntax that can run in a JavaScript environment that does not yet support import/export:

(function(global) {
  "use strict";

  function require(file, parentModule) {
    // eliding the definition of require
    // ...
  }

  require.define("1", function(module, exports, __dirname, __filename) {
    var __resolver = require("2", module);
    var b = __resolver["b"];
    console.log(42 + b);
  });

  require.define("2", function(module, exports, __dirname, __filename) {
    var b = 100;
    exports["b"] = b;
  });

  return require("1");
}.call(this, this));

Bandolier is a good example of a non-trivial project built using the Shift AST; Bandolier essentially takes a bunch of Module ASTs that contain import and export declarations and appropriately merges them into a single Script AST.

Bandolier works by first parsing the given JavaScript file into a Module AST using the Shift Java parser. It then transforms the AST by resolving each import declaration’s module specifier (e.g. converting import foo from "some/module" to import foo from "/full/path/to/some/module"). Once all the imports are resolved, each imported module is recursively loaded and stored in memory.

Finally, the bundled script is created by generating the module loading boilerplate (the function wrapper and the require function) and then each loaded module is transformed by changing import declarations to require calls and export declarations to updates to the exports object.

One particularly useful feature of Bandolier is that both the resolving and loading phases are pluggable. Bandolier comes with a few choices built-in including:

  • a FileSystemResolver that just normalizes relative paths
  • a NodeResolver that follows the node require.resolve algorithm
  • a FileLoader for loading resources from the file system
  • a ClassResourceLoader for loading resources inside a JAR.

Writing your own custom loader or resolver is as simple as implementing the IResolver and IResourceLoader interfaces.

Note that Bandolier is not a full transpiler like babel; it only transforms import and export statements. That said, the Shift parser fully supports ES2015 so you can, for example, use ES2015 classes and the bundled output will work in any JavaScript environment that supports classes (e.g. recent versions of node).

Also note that Bandolier only bundles ES2015 modules, so if you need to do something more complex, like bundling CommonJS modules, you will probably be happier with something like browserify, CommonJS Everywhere, or webpack.

What sets Bandolier apart from similar projects, and why we built it at Shape, is that it allows you to easily integrate JavaScript bundling into a Java application. We use it to dynamically generate and bundle our JavaScript resources on the fly inside a Java server. So, if you have similar needs (or are just interested in how to use the Shift AST), check out the project on GitHub.

We’re back at Fluent this year!

A load of Shapers will be at O’Reilly’s Fluent Conference again this year and we’ll have a lot more in store than we did last year. Ariya and I (Jarrod) will be speaking, we’re sponsoring an event, and we’ll have a booth with (awesome!) prizes, JavaScript trivia, and demos of some really crazy technology. If you’re heading out to SF next week, make sure to look out for all of us and say hi.

At our booth we’ll be giving away a bunch of prizes that will allow you to make or control your own (benevolent) bots: Lego Mindstorms sets, remote control BB-8s, and Arduino starter kits.

If you’re at all interested in working with JavaScript in ways you’ve never thought of before, we’re hiring a lot of new positions doing really fun stuff. If you’re curious, please reach out in advance so we can make sure to reserve some time. We really like talking about these things so just give us an excuse and we’ll make time for it.

Definitely make sure to check Ariya’s and my talks at the conference. I’m really excited to give the security talk and go over some of the insanity we get to work with at Shape.

From zero to hero: Toward frontend craftsmanship by Ariya Hidayat

After you have built a nice JavaScript application using Backbone, AngularJS, or React, written some unit tests, integrated a linter, and hooked up a continuous-build system, what should you do next? To reach the next level and create the highest-quality applications, you must first master a few more skills. Ariya Hidayat gives a step-by-step overview of adding code-coverage tracking and a dashboard, utilizing Git hooks to prevent regressions, leveraging Docker for a consistent development platform, and implementing cross-browser testing (including with evergreen browsers).

The Dark Side of Security by Jarrod Overson

Ashley Madison data stolen… breached, passwords need to be reset… 10 million passwords leaked! 13 million! 80 million!

What does this mean to you and your websites? You use secure passwords, your sites haven’t been compromised, and you have safeguards in place to protect your customers, so you don’t need to worry, right?


Jarrod Overson reveals the world where these passwords are traded, sold, verified, and used to exploit your sites. Even if you are diligent, doing everything you can to protect yourself and your users, you can’t protect against legitimate logins. So what can you do? Jarrod explains how you can start exploring how vulnerable you really are, how you might start recognizing malicious traffic, and what you can do to start taking a stand against your attackers.

See you at Fluent!

Avivah Litan at Gartner: Impact of Automated Attacks on B2C Websites


Avivah Litan, Gartner VP and distinguished analyst, is well known for covering big data analytics for cybersecurity and fraud as well as fraud detection and prevention solutions. In this educational webcast, she discusses automated website attacks and their impact on global business-to-consumer (B2C) brands.

Refer to this link to view the webcast.
Key highlights include:

  • How Gartner defines automated attacks on websites
  • How existing controls, such as device analytics, velocity checks, geolocation, and IP address whitelisting, are defeated by attackers
  • How cybercriminals monetize their automated website attacks
  • And, most importantly, how to stop automated attacks