It just became harder to distinguish bot behavior from human behavior.
Inbar Raz, Principal Researcher, PerimeterX, also contributed to this commentary.
Ask just about anyone the question “What distinguishes an automated (bot) session from a human-driven session?” and you'll almost always get the same first answer: “Speed.” And no wonder - it's our first intuition. Computers are just faster.
If you focus the question on credential brute-forcing, then it's even more intuitive. After all, the whole purpose of a brute-force attack is to cover as many options as possible, in the shortest possible time. Working quickly is just elementary, right?
Well, it turns out that this is not always the case. Most defenders, if not all, are already looking at speed and have created volumetric detections that are nothing more than time-based signatures. And that works, most of the time. But the attackers are getting smarter every day, and changing their attack methods. Suddenly, checking speed is no longer enough.
On the first week of October, we detected a credentials brute-force attack on one of our customers that commenced around 03:30am UTC. The attack, which lasted a few minutes shy of 34 hours, spanned a whopping 366,000 login attempts. Sounds like an easy case - 366K over 34 hours is over 10,000 attempts per hour.
But an easy catch? Not by existing volumetric detections, because the attack did not originate from one single IP address. In fact, we discovered that well over 1,000 different IP addresses participated in this attack. Let's look at the distribution of attempts:
Of all the participating IP addresses, the vast majority (over 77%) of them appeared up to 10 times only, during the entire attack. While the minority may trigger a volumetric detection, 77% percent of the attacking IP addresses would go unnoticed.
One can argue that counting failed login attempts would come in handy here. And it indeed could, except that many of the brute-force attacks don't actually enumerate on passwords tirelessly. Instead, they try username/password pairs that were likely obtained from leaked account databases, gathered from other vulnerable and hacked sites. Since many people use the same password in more than one place, there is a good chance that some, if not many, of the login attempts will actually be successful.
How Motivated Attackers Adapt
On a different attack we observed, nearly 230,000 attempts at logging in over 20 minutes were performed from over 40,000 participating IP addresses. The vast majority of IP addresses were the origin point of 10 or fewer attempts. A volumetric detection would simply miss this attack.
In comparison, a common volumetric detector is usually set to between 5 and 30 as a minimum, depending on the site’s specific behavior. Our data suggest that motivated attackers will adapt and adjust their numbers to your threshold, no matter how low it is. We also observed that the attack was incredibly concentrated within a very short detection window of only about 20 or 25 seconds.
Fake User Creation Attack
Let's look at one last distributed attack, on yet another client. This time, the attack is not about credentials brute-forcing but rather fake user creation. In this example, the largest groups of IP addresses used per attempt count were those that committed only 1 or 2 attempts:
The entire attack was conducted in less than six hours.
How do the attackers get so many IP addresses to attack from? The answer lies in analyzing the IP addresses themselves. Our research shows that 1% were proxies, anonymizers or cloud vendors, and the other 99% were private IP addresses of home networks, likely indicating that the attacks were performed by some botnet (or botnets) of hacked computers and connected devices. Furthermore, the residential IP addresses constantly change (as in any home) rendering IP blacklisting irrelevant, and even harmful for the real users' experience.
We included in this post just a few representative examples (out of many more we detected) of large-scale attacks originating from thousands of IP addresses over a short time span. In the majority of these cases, detection was achieved by examining how users interacted with the website. The suspicious indicators included users accessing only the login page, filling in the username and password too fast or not using the mouse.
The implication of these attacks vary. They include theft of user credentials as well as fake user account creation, which in turn leads to user fraud, spam, malware distribution and even layer-7 DDoS on the underlying web application.
In conclusion, volumetric detections are simple and useful, but they are not sufficient. The attackers continue to improve their techniques, bypassing old-fashioned defenses. The new frontier in defense is in distinguishing bot behavior from human behavior – and blocking the bots.
Inbar Raz has been teaching and lecturing about Internet security and reverse engineering for nearly as long as he’s been doing that himself: He started programming at the age of 9 and reverse engineering at the age of 14. Inbar specializes in outside-the-box approaches to analyzing security and finding vulnerabilities; the only reason he's not in jail right now is because he chose the right side of the law at an earlier age. These days, Inbar is the principal researcher at PerimeterX, researching and educating the public on automated attacks on websites.