Hackers love automated SQL Injection and Remote File Inclusion attack tools. Using software such as sqlmap, Havij, or NetSparker, finding and exploiting website vulnerabilities is fast and easy even for unskilled attackers.
Hackers favor automated tools for three key reasons. First and foremost, these tools require very little skill to use and are often freely available, either from hacker forums or from the sites of developers who create them as legitimate penetration testing tools. Second, they enable a hacker to attack vast numbers of sites very quickly and with very little effort. Finally, they make efficient use of the compromised or rented servers from which attacks are launched, which may only be available for a limited period of time.
But here’s the good news: If you can find a way to spot and block automated attacks, you can cut out a large proportion of hacker activity on your site. In this article, we will explore how to identify the malicious traffic reaching your web applications that has been generated by these automated tools.
Sign # 1: High Incoming Request Rate
One of the key identifiers of an automated attack is the rate at which incoming requests arrive, according to Rob Rachwald, director of security strategy at data security company Imperva. A human visitor to your website is unlikely to generate more than one HTTP request every 5 seconds, he says. By contrast, automated tools will often issue 70 requests per minute, or more than one per second. “Humans simply can’t work that fast,” Rachwald points out.
At first glance then, it would seem trivial to spot an automated hacker attack: any traffic that arrives at a rate of more than one request every five seconds must be from one of these tools. Unfortunately, things aren’t quite so simple.
First, not all automated traffic is malicious: A significant amount of automated traffic is generated by the likes of Google doing nothing more sinister than indexing your site so that potential customers can find you. And not all traffic that comes in at a high rate is necessarily automated: Services such as content delivery networks and proxies may appear to be the source of high volumes of traffic, but may simply be relaying the requests of many different users.
But perhaps more importantly, many hackers are sophisticated enough to know that generating requests at a high rate is easy to spot, and have a number of tactics to avoid detection. “Hackers can be innovative to avoid detection,” Rachwald warns. Their tactics may include:
- Deliberately slowing the tool down, or pausing it, so that the traffic patterns it generates look more like those of a human.
- Attacking other sites in parallel. This involves using the automated attack tool to send traffic to a number of different target sites in rotation, so that although the tool generates requests at a very high rate, individual sites only receive traffic at a “human” rate.
- Using multiple hosts from which to launch attacks. This more sophisticated approach enables hackers to attack a site without all the traffic coming from a single and easily identifiable source IP address.
As a result, a high incoming request rate is one clue — but not a positive identifier — of an automated attack. We also have to look for additional clues.
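The rate check described above can be sketched as a per-source sliding window. This is a minimal illustration, not a production detector: the `RateTracker` class and both constants are hypothetical names, and the limit of 12 requests per minute is simply the "one request every 5 seconds" figure quoted above expressed as a per-minute ceiling. As the preceding paragraphs note, a flag from this check is only a clue, since crawlers, proxies, and CDNs can also exceed it.

```python
from collections import defaultdict, deque

# Assumption: one request every 5 seconds (12 per minute) is treated as the
# "human" ceiling; both values would need tuning for a real site.
WINDOW_SECONDS = 60.0
MAX_REQUESTS_PER_WINDOW = 12


class RateTracker:
    """Keeps recent request timestamps per source IP in a sliding window."""

    def __init__(self, window=WINDOW_SECONDS, limit=MAX_REQUESTS_PER_WINDOW):
        self.window = window
        self.limit = limit
        self._hits = defaultdict(deque)

    def record(self, ip, timestamp):
        """Record one request; return True if this source exceeds the limit."""
        hits = self._hits[ip]
        hits.append(timestamp)
        # Discard timestamps that have fallen out of the window.
        while hits and hits[0] <= timestamp - self.window:
            hits.popleft()
        return len(hits) > self.limit
```

In use, `record()` would be called once per incoming request with the source address and arrival time (in seconds), and a `True` result would be passed on as one signal among several rather than acted on alone.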
Sign # 2: HTTP Headers
HTTP headers can provide another valuable clue as to the nature of incoming traffic. For example, the automated SQL Injection tools sqlmap, Havij, and NetSparker all correctly identify themselves with descriptive User Agent strings in the HTTP request headers. That’s because these tools are intended to be used for legitimate penetration testing (although malicious hackers use them too). Likewise, attacks initiated from Perl scripts (Perl is a favorite hacker programming language, according to Imperva) may be identified with a “libwww-perl” user agent.
Clearly, any traffic with the names of these tools in the User Agent string should be blocked. These strings can of course be changed, but unskilled “newbie” hackers are often unaware of that method of subterfuge – or even that these strings exist in the first place.
Even if tools don’t contain strings that instantly identify them, Imperva research has found that many of them don’t send the sorts of header information that most conventional browsers would be expected to include in their web requests. Examples include the Accept-Language and Accept-Charset headers.
Again, a savvy hacker could configure his system to add these headers — but many don’t. The absence of these Accept headers should certainly raise a warning flag, and in combination with a high request rate this provides a very strong indication that the traffic is malicious.
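A header check along these lines might look like the sketch below. The tool names and the two Accept headers come from the text above; the function name `inspect_headers` and the shape of its return value are assumptions made for illustration. The list of expected headers in particular should be checked against what current browsers actually send to your site before it is relied on.

```python
# Tool names mentioned in the article; matched as case-insensitive substrings.
KNOWN_TOOL_MARKERS = ("sqlmap", "havij", "netsparker", "libwww-perl")

# Headers the article says conventional browsers are expected to send.
# Assumption: verify this list against real browser traffic before use.
EXPECTED_BROWSER_HEADERS = ("accept-language", "accept-charset")


def inspect_headers(headers):
    """Return the matched tool marker (or None) and any missing headers.

    `headers` is a plain dict mapping header names to values.
    """
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    user_agent = normalized.get("user-agent", "").lower()
    tool = next((m for m in KNOWN_TOOL_MARKERS if m in user_agent), None)
    missing = [h for h in EXPECTED_BROWSER_HEADERS if h not in normalized]
    return {"tool": tool, "missing_headers": missing}
```

A non-`None` `tool` corresponds to the "should be blocked" case, while a non-empty `missing_headers` list is the weaker warning flag that becomes meaningful in combination with a high request rate.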
Sign # 3: Attack Tool Fingerprints
Attack tools carry out various actions according to the way that they have been coded, and ultimately there is a finite range of things that they can do. Imperva discovered that by analyzing traffic records from what subsequently turns out to be an automated attack, it is sometimes possible to find patterns (such as specific strings in the generated SQL fragments used in SQL injection) that uniquely identify a particular tool at work. Sometimes these strings can also be discovered by inspecting the source code of a tool, if it is available.
These fingerprints can form the basis of firewall blocking rules, but it is important to note that they may change in subsequent versions of the tool.
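As a rough illustration of fingerprint-based rules, the sketch below matches decoded query strings against a table of regular expressions. Both patterns and both tool labels are invented placeholders, not real signatures of any tool; actual fingerprints would have to come from your own traffic analysis or from a tool's source code, and would need updating as tools change.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical fingerprints for illustration only. Neither pattern is taken
# from a real tool; replace them with signatures observed in real attacks.
FINGERPRINTS = {
    "example-tool-a": re.compile(r"0x7e7e[0-9a-f]{4}7e7e", re.IGNORECASE),
    "example-tool-b": re.compile(
        r"union\s+all\s+select\s+null,\s*'zz_marker'", re.IGNORECASE
    ),
}


def match_fingerprints(raw_query):
    """Return the labels of all fingerprints found in a raw query string."""
    # Decode percent-escapes and '+' so patterns see the SQL as sent.
    decoded = unquote_plus(raw_query)
    return [name for name, pattern in FINGERPRINTS.items()
            if pattern.search(decoded)]
```

Keeping the patterns in a single table, separate from the matching logic, makes it easier to revise them when a new version of a tool changes its output.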
Sign # 4: Unusual Geographies
Imperva found that some 30 percent of high-rate automated SQL Injection attacks originate from China, and that other attacks originate from “unusual” countries such as Indonesia and Egypt. Rachwald suggests being suspicious of traffic from any country that you don’t expect to get visitors from. “If you are a small retail outlet in London, why are you getting visitors from China?” he says.
A spike in traffic from far-flung geographies is, in and of itself, not proof of anything — but in combination with other signs such as missing Accept headers or a high incoming request rate, this is enough to warrant closer inspection or even blocking the traffic entirely.
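One way to express "not proof alone, but significant in combination" is a simple signal count. The sketch below is an assumption-laden example: the `Signals` fields, the `EXPECTED_COUNTRIES` set, and the thresholds are all hypothetical, and the country code is assumed to have been resolved from the source IP by a separate geolocation lookup that is not shown.

```python
from dataclasses import dataclass

# Hypothetical: the countries this particular site expects visitors from.
EXPECTED_COUNTRIES = frozenset({"GB", "IE"})


@dataclass
class Signals:
    high_rate: bool
    missing_accept_headers: bool
    known_tool_user_agent: bool
    country_code: str  # two-letter code, assumed resolved upstream


def assess(signals, expected_countries=EXPECTED_COUNTRIES):
    """Combine the individual signs into 'allow', 'inspect', or 'block'."""
    # A User-Agent naming a known attack tool is treated as decisive.
    if signals.known_tool_user_agent:
        return "block"
    unusual_geo = signals.country_code.upper() not in expected_countries
    # Assumed weighting: each remaining sign counts equally.
    score = sum([signals.high_rate, signals.missing_accept_headers, unusual_geo])
    if score >= 3:
        return "block"
    if score == 2:
        return "inspect"
    return "allow"
```

With these thresholds, traffic from an unexpected country is allowed when it shows no other sign, inspected when it is paired with one other sign, and blocked when all three signs are present.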
Sign # 5: IP Blacklists
Whenever an attack is detected, by any method, the source IP address can be recorded. Imperva’s research team has found that an IP address used for automated attacks tends to remain a source of them for an average of three to five days. However, some IP addresses continue to be the source of malicious automated traffic for weeks or even months. That means blacklisting addresses can be very helpful in preventing further automated attacks from that source. Cloud security providers can extend this benefit by blacklisting any source of automated attacks on one of their clients in order to protect all of their other clients.
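Because most addresses stop being a source of attacks after a few days, a blacklist with expiring entries is one reasonable design. The sketch below assumes a five-day lifetime, taken from the upper end of the average quoted above; the `Blacklist` class and its method names are illustrative, and persistent offenders would need a longer lifetime or repeated re-listing.

```python
import time

# Assumption: entries expire after five days, the upper end of the
# three-to-five-day average cited in the article.
DEFAULT_TTL_SECONDS = 5 * 24 * 60 * 60


class Blacklist:
    """In-memory IP blacklist whose entries expire after a fixed lifetime."""

    def __init__(self, ttl=DEFAULT_TTL_SECONDS):
        self.ttl = ttl
        self._expires = {}

    def add(self, ip, now=None):
        """List an address, or refresh its expiry if already listed."""
        now = time.time() if now is None else now
        self._expires[ip] = now + self.ttl

    def is_blocked(self, ip, now=None):
        """Return True while the address has an unexpired entry."""
        now = time.time() if now is None else now
        expiry = self._expires.get(ip)
        if expiry is None:
            return False
        if now >= expiry:
            # The entry has aged out, so remove it.
            del self._expires[ip]
            return False
        return True
```

Calling `add()` again each time an attack is detected from a listed address refreshes its expiry, so long-lived attack sources stay blocked for as long as they remain active.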
Paul Rubens is an award-winning technology journalist who has been covering IT security for over 20 years. He has written for leading international publications including The Economist, The Times, The Financial Times, The Guardian, the BBC, and Computing.