Everyone likes a good price comparison tool. When combined with Amazon or eBay, these tools allow you to get the best possible deals on that nice smart watch you’ve been wanting for a while. But do you know how these tools work? Or that the underlying technology can be turned against your website and used to attack you?
Most price comparison tools rely on web scraping, a technique for automatically collecting information from websites. Google and other search engine providers commonly use it to crawl websites and compile information.
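To make the idea concrete, here is a minimal sketch of what a price scraper does once it has a page's HTML. It uses only Python's standard library. The sample markup, the "price" class name, and the names PriceParser and extract_prices are assumptions made up for illustration; a real tool would download the page over HTTP first and would need to match each site's actual markup.

```python
from html.parser import HTMLParser


class PriceParser(HTMLParser):
    """Collects the text of every element whose class list contains 'price'."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # The class attribute may be missing or empty, so fall back to "".
        classes = (dict(attrs).get("class") or "").split()
        self._in_price = "price" in classes

    def handle_endtag(self, tag):
        self._in_price = False

    def handle_data(self, data):
        text = data.strip()
        if self._in_price and text:
            self.prices.append(text)


def extract_prices(html):
    """Return the price strings found in an HTML document."""
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    return parser.prices


# Hypothetical product listing, standing in for a page fetched from a shop.
SAMPLE_PAGE = """
<html><body>
  <div class="product"><span class="name">Smart Watch</span>
    <span class="price">$199.99</span></div>
  <div class="product"><span class="name">Fitness Band</span>
    <span class="price">$49.50</span></div>
</body></html>
"""

print(extract_prices(SAMPLE_PAGE))  # ['$199.99', '$49.50']
```

Run the same extraction across many shops on a schedule and you have the core of a price comparison tool, which is also why the technique is so easy to repurpose.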
While there are benign uses for the technique, there are other uses that make web scraping a major threat to websites today. These uses range from data theft to targeted attacks. Abusive uses of web scraping include harvesting email addresses for spam lists, collecting competitive intelligence, plagiarizing and republishing content, publishing comparative pricing, and auction sniping. Any or all of these actions can have a negative impact on a business. For instance, a competitor may be crawling your website to gather competitive intelligence and use it to their advantage. Or another website may be stealing your copyrighted content and publishing it on their site for profit.
In some cases, web crawlers can be used to mount attacks on a website. By using an army of automated crawlers, a malicious actor can overload a web server and bring a site down. In one case identified in 2013, malicious actors tricked Googlebot into executing SQL injection attacks on a website!
Any public website that wants to rank well in search results needs to allow search engine crawlers to index it. However, the site should also be able to prevent bad actors from exploiting this need. The Barracuda Web Application Firewall (WAF) includes multiple protection mechanisms to tackle increasingly sophisticated web scrapers, making it easy for you to protect your website from web scraping.
Web scraping protection can be configured by navigating to the Websites > Web Scraping page. Here, you can configure policies to block bots and to review or modify allow-listed bots.
Search engine crawlers are allow-listed and validated using reverse DNS lookups on their IP addresses. These are valid bots that need to be allowed to index your site so that it appears in search engine results. The validation also helps identify impostors, such as fake Googlebots.
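The reverse DNS check described above can be sketched in a few lines of Python. This is an illustration of the general technique, not the Barracuda WAF's actual implementation. It also adds a forward lookup to confirm the result, which is a common companion step and an assumption on our part. The function name is_verified_crawler, the trusted suffix list, and the stub resolvers are all hypothetical; the stubs let the example run without network access, and the default resolvers (which handle IPv4 only) would be used in practice.

```python
import socket

# Assumed suffixes for illustration; check each search engine's own documentation.
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_crawler(ip, reverse_lookup=None, forward_lookup=None,
                        trusted_suffixes=TRUSTED_SUFFIXES):
    """Return True if ip reverse-resolves to a trusted host that resolves back to ip."""
    # Default to real DNS lookups (IPv4 only) when no resolver is injected.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        hostname = reverse_lookup(ip).rstrip(".").lower()
    except OSError:  # covers socket.herror and socket.gaierror
        return False
    if not hostname.endswith(trusted_suffixes):
        return False
    try:
        addresses = forward_lookup(hostname)
    except OSError:
        return False
    # A spoofed PTR record fails here: the claimed hostname maps to another IP.
    return ip in addresses


# Hypothetical DNS data used in place of real lookups.
def stub_reverse(ip):
    records = {
        "66.249.66.1": "crawl-66-249-66-1.googlebot.com",
        "198.51.100.9": "crawl-1.googlebot.com",  # spoofed PTR record
        "203.0.113.7": "bot.example.net",
    }
    return records[ip]


def stub_forward(host):
    records = {
        "crawl-66-249-66-1.googlebot.com": ["66.249.66.1"],
        "crawl-1.googlebot.com": ["66.249.66.2"],
    }
    return records.get(host, [])


for address in ("66.249.66.1", "198.51.100.9", "203.0.113.7"):
    print(address, is_verified_crawler(address, stub_reverse, stub_forward))
```

With the stub data, only the first address passes. The second claims a trusted hostname but fails the forward confirmation, and the third does not belong to a trusted domain at all. A User-Agent header alone proves nothing, since any scraper can claim to be Googlebot.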
Web Scraping protection is available from Firmware Release 8.1 onwards. To learn more about new releases and best practices for upgrading your Barracuda Web Application Firewall, please visit Barracuda Campus here.
The Barracuda Web Application Firewall provides complete protection against all web attacks and enhances the performance of your website or service. For more information on the Barracuda Web Application Firewall, visit the product page here. To get a risk-free 30-day trial of a physical appliance or virtual edition of the Barracuda WAF, visit this page.
Tushar Richabadas is a Product Manager for the Barracuda Web Application Firewall and Barracuda Load Balancer ADC. His current areas of focus are Cloud and automation. His prior roles ranged from leading networking product testing teams to technical marketing for HCL-Cisco. Tushar closely tracks the rapidly increasing impact of digital security and is passionate about simplifying digital security for everyone.