When you have a website online, there are unseen presences lurking nearby and crawling all over it. We’re talking about the Googlebot, which at times can be your best friend and at times can be a complete nightmare.
As a website owner, you need the Googlebot in its pure form. It exists to help you list your website in what is inarguably the world’s biggest search engine. And when 70 percent of the world’s Internet users are going to visit your site through Google, there’s no locking Googlebot out.
You’re probably wondering why you’d consider locking Googlebot out in the first place. Well, the problem isn’t Googlebot in and of itself. The problem is all of Googlebot’s tag-alongs who look just like Googlebot and do a lot of what Googlebot does, only with a hearty side of destruction.
It’s like Invasion of the Body Snatchers, only instead of aliens replacing humans with identical but unfeeling doubles that intend to take over the planet, the Internet has been invaded with Googlebot impostors that look just like Googlebot and seem to act just like Googlebot, but have no interest in helping you with your website listing, search engine ranking or web presence.
Not all of these bot-y snatchers are out to harm your website, but many of them are. Googlebot impostors can leave you open to a Distributed Denial of Service (DDoS) attack, and that can cost you the web presence you’ve worked so hard for.
The Real McCoy
As we’ve said, the real Googlebot visiting your website is a good thing. Research, not to mention common sense, shows that it has the broadest reach of any search engine bot, accounting for nearly 60 percent of all visits by bots to any given website. That’s more than twice the activity of MSN/Bing’s bot and over 10 times the activity of the Chinese search giant Baidu.
The more content you have, and the more frequently you update it, the more Googlebot comes to visit. Large webstores, news websites, and active forums are the Googlebot’s favorite hangouts. That’s an easy tip for anyone working on their site’s ranking: the more content you have, and the more you update it, the better you will rank in Google.
The Googlebot Clones
Look, other than the people creating them, nobody likes clones. They’re creepy, and they’re the subject of crappy Star Wars movies. We’d all be better off without them and so would your website. Research shows that 1 in 25 bots that arrive at a website claiming to be from Google is fake. Of these impostors, 35 percent have clear and direct malicious intent, and 25 percent of those with malicious intent will be used in Layer 7 DDoS attacks.
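To put those shares in perspective, here is a quick back-of-the-envelope sketch. The 10,000-visit figure is a made-up round number chosen for illustration, not something from the research, and the function name is ours. The three ratios are the ones quoted above.

```python
from fractions import Fraction

# Shares quoted in the article.
FAKE_SHARE = Fraction(1, 25)         # 1 in 25 self-declared Googlebots is fake
MALICIOUS_SHARE = Fraction(35, 100)  # 35 percent of the fakes are clearly malicious
DDOS_SHARE = Fraction(25, 100)       # 25 percent of the malicious ones are used for Layer 7 DDoS


def impostor_breakdown(claimed_googlebot_visits):
    """Return (fake, malicious, ddos) counts for a number of self-declared Googlebot visits."""
    fake = claimed_googlebot_visits * FAKE_SHARE
    malicious = fake * MALICIOUS_SHARE
    ddos = malicious * DDOS_SHARE
    return int(fake), int(malicious), int(ddos)


# Hypothetical sample: 10,000 visits that claim to be Googlebot.
fake, malicious, ddos = impostor_breakdown(10_000)
print(f"fake: {fake}, malicious: {malicious}, used for DDoS: {ddos}")
```

Small percentages, in other words, but on a busy site they add up to a steady stream of unwelcome guests.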
It makes sense that these Googlebot impostors would be involved in Layer 7 DDoS attacks since these attacks simulate the behavior of a real user. That makes it much harder for even the cleverest systems administrator to detect them early and defend against them, which in turn means it’s easy for this kind of DDoS attack to bring down the webserver and put pressure on the unsuspecting owner.
When the worst of these Googlebot wannabes aren’t knocking websites offline, they’re scraping content, spamming users, and hacking your databases and code. The more benign Googlebot impostors are on your site to gather information. Whether that information is going to be used for market research, to design more advanced bots, or to figure out how to leave the most convincing comments about prescription pills from Canada is anyone’s guess. We do know the results of this information mining are probably going to be annoying.
Botnets – The Googlebot Impostors’ Home Base
Incapsula’s recent research into the two faces of Googlebot showed that fake Googlebots were the third most commonly used form of bot for carrying out DDoS attacks. For the study, Incapsula examined over 400 million search engine bot visits to 10,000 sites. It’s worth noting that this sample represents over 2 billion page crawls in a month.
The main sources of these attacks are “botnets,” collections of compromised devices that are all connected via the Internet. The most common sources of these botnets were the USA, China, Turkey, Brazil, India, and Thailand, with a few coming from other countries.
Brazil may seem like an odd entrant on the list but the theory is that during the World Cup so many visitors took infected devices with them that they artificially boosted Brazil’s placing in the global league tables. An embarrassing World Cup loss to Germany, and a new reputation for botnets. Weep for Brazil.
A DDoS attack using fake Googlebots can last for days, weeks or in extreme cases for months. These attacks are damaging to a website and in the worst cases, they can take the site offline to its users – often costing businesses a fortune.
What Can Be Done About Fake Googlebots?
The good news is that fake Googlebots can be identified as they start to crawl your website. The bad news is that most website owners don’t have the processing power or software to implement the solutions needed. Large corporations, however, can and should take action.
The bots can be found using a range of security measures that include ASN and IP identification. These techniques allow systems administrators to determine where a bot is from and thus compare it to the sources of legitimate bots. If a bot originates in Turkey or Thailand, for example, it may warrant further security measures to prevent a DDoS attack.
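For a feel of what IP identification can look like in practice, here is a minimal Python sketch of one common check. It is not an ASN lookup and not necessarily what Incapsula does. It is the reverse-then-forward DNS test that Google itself describes for verifying its crawler: look up the hostname for the visiting IP, confirm it ends in googlebot.com or google.com, then resolve that hostname and make sure it points back to the same IP. The function names are ours, and the injectable `reverse` and `forward` parameters are an assumption added so the logic can be tried without network access.

```python
import socket

# Hostname suffixes Google documents for its crawler (leading dot avoids
# matching look-alikes such as "evilgooglebot.com").
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse_lookup(ip):
    """Return the PTR hostname for an IP address (needs network access)."""
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(hostname):
    """Return the IPv4 addresses a hostname resolves to (needs network access)."""
    return socket.gethostbyname_ex(hostname)[2]


def is_real_googlebot(ip, reverse=_reverse_lookup, forward=_forward_lookup):
    """Return True only if the IP passes the reverse and forward DNS checks."""
    try:
        hostname = reverse(ip)
    except OSError:
        # No PTR record or lookup failure: cannot be confirmed as Googlebot.
        return False

    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False

    try:
        # A spoofed PTR record will not resolve back to the visiting IP.
        return ip in forward(hostname)
    except OSError:
        return False
```

In a real deployment you would call `is_real_googlebot(visitor_ip)` only for requests whose User-Agent claims to be Googlebot, and cache the answer, since DNS lookups on every request are slow. Note that this sketch only handles IPv4 in the forward lookup.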