Key Findings from the 2018 Credential Spill Report

In 2016 we saw the world come to grips with the fact that data breaches are almost a matter of when, not if, as some of the world’s largest companies announced spills of incredible magnitude. In 2017 and 2018, we started to see regulatory agencies make it clear that companies need to proactively protect users from attacks fueled by these breaches as they show little sign of slowing.

In the time between Shape’s inaugural 2017 Credential Spill Report and now, we’ve seen a vast number of new industries roll up under the Shape umbrella and, with that, troves of new data on how different verticals are exploited by attackers, from Retail and Airlines to Consumer Banking and Hotels. Shape’s 2018 Credential Spill Report is nearly 50% larger and includes deep dives on how these spills are used by criminals and how their attacks play out. We hope that the report helps companies and individuals understand the downstream impact these breaches have. Credential stuffing is the vehicle that enables endless iterations of fraud, and it is critical to have eyes on the problem as soon as possible. The problem is only getting worse, and attackers are advancing at a rate that is rapidly devaluing modern mitigation techniques.

Last year, over 2.3 billion credentials from 51 different organizations were reported compromised. We saw roughly the same number of spills reported in each of the past two years, though the average spill size decreased slightly despite a new record-breaking announcement from Yahoo. Even after excluding Yahoo’s update from the 2017 measurements, we saw an average of 1 million credentials spilled every single day.

These credential spills will affect us for years and, with an average time of 15 months between a breach and the report, attackers are already well ahead of the game before companies can even react to being compromised. This window of opportunity creates strong motives for criminals, as evidenced by the e-commerce sector where 90% of login traffic comes from credential stuffing attacks. The result is that attacks are successful as often as 3% of the time and the costs can quickly add up for businesses. Online retail loses about $6 billion per year while the consumer banking industry faces over $50 million per day in potential losses from attacks.

2017 also gave us many credential spills from smaller communities – 25% of the spills recorded were from online web forums. These spills did not contribute the largest number of credentials but their presence underlines a significant and important role in how data breaches occur in the first place. Web forums frequently run on similar software stacks and often do not have IT teams dedicated to keeping that software up-to-date as a top priority. This makes it possible for one vulnerability to affect many different properties with minimal to no retooling effort. Simply keeping your software up to date is the easiest way to protect your company and services from being exploited.

As a consumer, the advice is always the same: never reuse your passwords. This may seem like an oversimplification, but it is a foolproof way to ensure that a credential spill doesn’t leave you open to a future credential stuffing attack. Data breaches can still affect you in different ways depending on the details of the data that was exfiltrated, but credential stuffing is the trillion-dollar threat and you can sidestep it completely by ensuring every password is unique.

As a company, protecting your users against the repercussions of these breaches is becoming a greater priority. You can get a pretty good idea of whether you already have a problem by monitoring your login success rate against your daily traffic patterns. Most companies and websites have a fairly constant ratio of login successes to failures; if you see deviations that coincide with unusual traffic spikes, you are likely already under attack. Of course, Shape can help you identify this traffic in greater detail, but it’s important to get a handle on this problem regardless of the vendor – we all win if we disrupt criminal behavior that puts us all at risk. As part of our commitment to do this ourselves, Shape also released its first version of Blackfish, a collective defense system aimed at sharing alerts of credential stuffing attacks within Shape’s defense network for its customers. This enables companies to preemptively devalue a credential spill well before it has even been reported.
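
The login success rate check described above can be approximated with very little code. The following is a minimal sketch, not Shape’s detection logic: it assumes you can export hourly (attempts, successes) counts from your own logs, and the function name, the 24-hour baseline, and both thresholds are illustrative assumptions that you would need to tune for your own traffic.

```python
from statistics import median


def flag_suspicious_hours(samples, baseline_hours=24,
                          volume_factor=2.0, rate_drop=0.15):
    """Return indices of hours that look like credential stuffing.

    samples: list of (attempts, successes) tuples, one per hour, oldest first.
    An hour is flagged when login volume is well above the recent median
    AND the success rate is well below the recent median success rate.
    All thresholds are assumptions for illustration only.
    """
    flagged = []
    for i in range(baseline_hours, len(samples)):
        # Baseline: the preceding hours that actually had login traffic.
        window = [(a, s) for a, s in samples[i - baseline_hours:i] if a > 0]
        attempts, successes = samples[i]
        if not window or attempts == 0:
            continue

        typical_volume = median(a for a, _ in window)
        typical_rate = median(s / a for a, s in window)
        rate = successes / attempts

        traffic_spike = attempts > volume_factor * typical_volume
        success_collapse = rate < typical_rate - rate_drop
        if traffic_spike and success_collapse:
            flagged.append(i)
    return flagged


# Hypothetical data: a steady day, then one hour with five times the
# traffic and a far lower success rate, then a normal hour again.
hourly = [(1000, 750)] * 24 + [(5000, 800), (1000, 760)]
suspicious = flag_suspicious_hours(hourly)
```

Using the median rather than the mean keeps a single attack hour from dragging the baseline along with it. A check like this only tells you that something is wrong; it does not identify which requests are malicious.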

You can download Shape’s 2018 Credential Spill report here.

Please feel free to reach out to us on Twitter at @shapesecurity if you have any feedback or questions about the report.

Introducing Blackfish, a system to help eliminate the use of stolen passwords

Today we’re releasing Blackfish, a system that proactively protects companies from credential stuffing before an attack takes place. Normally, credential stuffing starts with a data breach at one major company (“Initial Victim”) and continues when a criminal uses the stolen data (usernames and passwords) against dozens or even hundreds of different companies (“Downstream Victims”). Usually, many months or years pass before the Initial Victim discovers and discloses the initial data breach, and in that time, criminals are able to successfully attack huge numbers of Downstream Victims. Later, once the Initial Victim does disclose the breach, the Downstream Victims start matching the username/password pairs from the Initial Victim against their own user databases and resetting any passwords that match. The whole process can take years and results in hundreds of millions of dollars’ worth of fraud and brand damage.

Blackfish changes all that. From the very first moment a criminal attempts to use stolen usernames and passwords, Blackfish begins monitoring and protecting matching accounts at other companies. So, while under normal circumstances a criminal can get hundreds of chances to monetize the stolen usernames and passwords, with Blackfish in place, criminals get far fewer chances.

You may be wondering how Blackfish can accomplish all this. Explaining that requires a little background on Shape Security.

We founded Shape six years ago to answer a simple question: is a visitor to a web or mobile app an actual human being? This simple question proved to be an important one. As we perfected our ability to answer it, we started eliminating enormous amounts of fraudulent traffic from the largest web and mobile apps in the world — often 90% or more of the login traffic from a Fortune 100 web application.

Today, we are the primary line of defense for many of the largest organizations around the world. Our customers include: three of the top four banks, three of the top five airlines, two of the top three hotel chains, and numerous other leading companies and government agencies.

We secure all of those large organizations in a centralized way, directly delivering the security outcome of eliminating fraudulent traffic. That centralized security capability is also the heart of Blackfish, and allows Blackfish to see stolen usernames and passwords in use long before anyone else knows about them (including the Initial Victim).

Think about it: if you were a criminal and managed to steal all the usernames and passwords from a major corporation, where would you try them out? If you’re like most criminals, the answer is that you’d try them on the largest banks, airlines, hotels, and retail sites in the world. That’s what happens in practice, and when it does, that’s also when Blackfish sees the very first such attack, and sets about protecting all username/password pairs that happen to match on other large websites.

Blackfish does all this before the original data breach is reported or even detected by the Initial Victim company.

The problem with looking for credentials on the dark web

You can scour the dark web to find user credentials, but one of the greatest dangers companies face today is the long window of time between when breaches occur on third-party websites like Yahoo and when those breaches are discovered and announced. Instead of hoping that stolen passwords will appear on the dark web in time to be useful, Blackfish autonomously detects credential stuffing attacks on the largest, most targeted websites in the world, identifies newly stolen credentials, and nullifies them globally. That stolen data becomes useless to cybercriminals.

How does it work?

Shape has grown into one of the largest processors of login traffic on the entire web. We have built machine learning and deep learning systems to autonomously identify credential stuffing attacks in real-time. These systems now generate an important byproduct: direct knowledge of stolen usernames and passwords when criminals are first starting to exploit them against major web and mobile apps. What this means is that we see the stolen assets months or years before they appear on the dark web.

Blackfish’s knowledge base of compromised credentials is built with maximum security in mind. Blackfish does not store any credential information; instead, it uses Bloom filters, a probabilistic data structure that can answer whether an item has probably been seen before without retaining the item itself. As a result, the compromised credentials themselves are not stored anywhere, and Blackfish can use the information about compromises to improve security while maintaining full data privacy.
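
To make the idea concrete, here is a minimal sketch of a Bloom filter used as a membership test for credential pairs. It illustrates the general technique under stated assumptions and is not Blackfish’s implementation: the class, the credential_key helper, the username normalization, the filter size, the number of hash positions, and the use of SHA-256 are all hypothetical choices made for this example.

```python
import hashlib


class BloomFilter:
    """A fixed-size bit array that records set membership, not the items."""

    def __init__(self, num_bits=1 << 20, num_hashes=7):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Double hashing: derive all bit positions from two 64-bit
        # halves of a single SHA-256 digest.
        digest = hashlib.sha256(item).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the step odd
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        # True means "probably seen"; False means "definitely not seen".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def credential_key(username: str, password: str) -> bytes:
    """Combine a credential pair into one byte string (hypothetical helper).

    The username is length-prefixed so that ("ab", "c") and ("a", "bc")
    never produce the same key. Lowercasing the username is an assumption.
    """
    user = username.strip().lower().encode("utf-8")
    return len(user).to_bytes(4, "big") + user + password.encode("utf-8")


# Record a pair observed in an attack, then test login attempts against it.
seen_in_attacks = BloomFilter()
seen_in_attacks.add(credential_key("alice@example.com", "hunter2"))

is_compromised = credential_key("alice@example.com", "hunter2") in seen_in_attacks
```

A lookup can return a false positive but never a false negative, and the bit array cannot be read back as a list of credentials. In a setting like this one, a false positive would presumably mean an unnecessary password reset, which is usually an acceptable way to fail.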

What good is a stolen password if you can never use it?

For better or for worse, memorized secrets (a.k.a. “passwords”) are the most widely used authentication mechanism online. As such, having access to millions of stolen passwords (over 3.3 billion were reported stolen in 2016 alone) allows cybercriminals to easily take over users’ accounts on any major website. They do this with credential stuffing attacks, which take stolen passwords from website A and try them on website B to see which accounts the same email addresses and passwords will unlock. Cybercriminals can do this reliably with a typical 1-2% success rate, allowing them to seize the value in bank accounts, gift card accounts, airline loyalty programs, and other accounts, which they can then monetize for a predictable ROI.

Since credential stuffing attacks are responsible for more than 99.9% of account takeover attempts, if we identify the stolen credentials that are used in these attacks, and invalidate them across other websites, we change the economics for cybercriminals significantly. If their 1-2% success rate now drops by two orders of magnitude or more, their “business” no longer functions. At that point, the cybercriminal has no choice but to try to obtain new stolen passwords. If those new passwords are similarly detected and invalidated, it will become clear to the criminals that the economics of their scheme have been broken. We think that over time, Blackfish will end credential stuffing for everyone.
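
The economic argument above can be worked through with a few lines of arithmetic. This is a sketch with made-up inputs: the list size, the value per hijacked account, and the attacker’s cost are hypothetical numbers chosen only to show the effect of a hundredfold drop in success rate, and the function name is ours.

```python
def attack_economics(credentials_tried, success_rate,
                     value_per_account, attack_cost):
    """Estimate an attacker's return for one credential stuffing run.

    All monetary inputs are hypothetical and in the same currency unit.
    """
    takeovers = round(credentials_tried * success_rate)
    revenue = takeovers * value_per_account
    return {
        "takeovers": takeovers,
        "revenue": revenue,
        "profit": revenue - attack_cost,
    }


# Hypothetical run: 1,000,000 stolen credentials, 5 units of value per
# hijacked account, 2,000 units spent on tooling and proxies.
before = attack_economics(1_000_000, 0.01, 5, 2_000)
# Same run after the success rate falls by two orders of magnitude.
after = attack_economics(1_000_000, 0.01 / 100, 5, 2_000)

# before -> 10,000 takeovers, revenue 50,000, profit 48,000
# after  ->    100 takeovers, revenue    500, profit -1,500
```

With these assumed inputs the same attack goes from comfortably profitable to a loss, which is the sense in which the “business” no longer functions.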

We are all very excited at Shape to announce this system and our vision to make credential stuffing attacks a thing of the past. You can learn more on our website and contact us when your company is ready to try Blackfish.

Announcing the Shift JavaScript AST Specification

In time for the holidays, we are happy to release Shape Security’s first open source contributions: a new JavaScript AST specification named Shift, and a suite of tools to help you get started working with it.

What is an AST?

An Abstract Syntax Tree is simply a tree representation of a program’s source code. The nodes in an AST represent individual aspects of the language such as identifiers, statements, and literals. This structure is commonly the result of a successful parse of source code.

What can I do with it?

Having an easy to use data structure that represents a program’s source code allows you to write programs that treat code as they would any other piece of data. You can reliably generate new source, transform between languages, replace subtrees, analyze, lint, and auto-format code. ASTs are used by anything that needs to operate on code: IDEs, parsers, linters, analyzers, optimizers, compilers, and more. AST formats that are publicly standardized enable developers to centralize their efforts over a common structure, reducing duplicate work and allowing tools to be composed together.
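
As a small illustration of treating code as data, the sketch below parses a one-line program, analyzes it, replaces a subtree, and generates new source. It uses Python’s built-in ast module purely as a stand-in: the node types shown are Python’s, not those of Shift or SpiderMonkey, and ast.unparse requires Python 3.9 or later.

```python
import ast

source = "seconds = hours * (60 * 60)"
tree = ast.parse(source)

# Analyze: collect every identifier that appears in the program.
identifiers = sorted({node.id for node in ast.walk(tree)
                      if isinstance(node, ast.Name)})


class FoldConstantProducts(ast.NodeTransformer):
    """Replace `<int> * <int>` subtrees with a single constant node."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold nested expressions first
        if (isinstance(node.op, ast.Mult)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, int)
                and isinstance(node.right.value, int)):
            folded = ast.Constant(value=node.left.value * node.right.value)
            return ast.copy_location(folded, node)
        return node


# Transform: swap the (60 * 60) subtree for the constant 3600.
new_tree = ast.fix_missing_locations(FoldConstantProducts().visit(tree))

# Generate: turn the modified tree back into source code.
new_source = ast.unparse(new_tree)

# identifiers -> ['hours', 'seconds']
# new_source  -> 'seconds = hours * 3600'
```

The same parse, analyze, transform, and generate steps apply to any language with a usable AST, which is why a shared, well-specified format matters for tooling.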

This doesn’t exist already?

Mozilla exposed the SpiderMonkey Reflect.parse API in 2010 to encourage better tooling for JavaScript. This proved to be incredibly useful to the JavaScript community, enabling the creation of parsers like Esprima and Acorn and catalyzing a vast ecosystem of tools. Hundreds of projects rely upon these tools, including eslint, plato, istanbul, jscs, browserify, and many more.

However, the SpiderMonkey AST format was not specifically created for these tools. The SpiderMonkey AST originated as the internal representation of a JavaScript program in the SpiderMonkey engine, which was intended to be used only for interpretation. As tools were created and more use cases for a standard AST were recognized, many difficulties in dealing with SpiderMonkey format ASTs surfaced.

The SpiderMonkey AST and its ecosystem of tools and parsers is formidable and we don’t take deviation lightly. Our work at Shape Security has presented us with many problems that involve deep analysis and transformation of JavaScript. We have been forced to rethink what it means to represent and transform a JavaScript program, and in doing so developed this alternative AST format. The main advantages of using the Shift AST format are that it makes it much more difficult to accidentally perform a transformation that creates an invalid AST, and the nodes align more closely to the syntactic features they represent.

More than just the AST

An AST specification doesn’t have much value without a surrounding ecosystem. We’ve open-sourced JavaScript and Java implementations of the foundational tooling necessary to foster development of a supporting ecosystem around the Shift AST format. The following tools have been made available for both environments.

  • AST Node Constructors
  • Parser
  • Code Generator
  • Reducer
  • Validator
  • Scope Analyzer

In addition, we’ve released a tool for converting back and forth between the Shift and SpiderMonkey AST formats. All of these are available on the Shape Security GitHub account.

The road forward

We will continue to develop tooling based on the Shift AST format and will iterate on the existing libraries, optimize for performance, and add ECMAScript 6 support.

The Shift AST format was developed with ECMAScript 6 in mind. The es6 branches of both the specification and the JavaScript AST constructors already include full support for ECMAScript 6, and we plan to add support to all of the tooling we have released so far.

Contributors

Some of the developers behind the Shift AST format and associated tools are active contributors and maintainers of JavaScript language tools that are popular in the JavaScript community. Work on those tools is not ending, nor does the work here immediately affect any future plans for those tools.