Last Updated: June 16, 2016
· aafomina

A new web scraping tool for interactive sites

Hey guys. We built ParseHub so you can get data from super complicated and interactive websites. As I like to say: websites partying like it's 1999. You can give it a try here.

Everyone here can probably build a web scraping script for a simple static website or page. That's exactly how we at ParseHub got started writing our algorithms too. Then we discovered all the complexity that goes into pulling data from pages where content loads with AJAX and JavaScript. We made it our mission to be the best web scraper on the market (technology-wise), but I will let you be the judge of that. :)
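For context, here is roughly what the "simple static page" case looks like. This is a minimal sketch using only Python's standard library, not ParseHub code; the `STATIC_HTML` sample, the `LinkCollector` class, and `extract_links` are made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical static page; a real script would fetch this over HTTP.
STATIC_HTML = """
<html><body>
  <h1>Products</h1>
  <a href="/item/1">Widget</a>
  <a href="/item/2">Gadget</a>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a> tag in the markup."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the <a> tag we are currently inside

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        # Only keep text that appears inside an <a> tag with an href.
        if self._href is not None and data.strip():
            self.links.append((self._href, data.strip()))

    def handle_endtag(self, tag):
        if tag == "a":
            self._href = None


def extract_links(html):
    """Return all (href, text) pairs found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


links = extract_links(STATIC_HTML)
```

This only works because every link is already present in the HTML the server returns. When the content is injected later by AJAX or JavaScript, the same parser sees none of it, which is where the complexity starts.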

Here are a few things ParseHub can do:
- Get data from infinitely scrolling pages
- Get data from behind a log-in. All you have to do is input your email and password
- Automatically fill out forms and submit them
- Download images and files to Dropbox
- Handle pagination, even AJAX pagination, and crawl through millions of pages
- Get HTML and attributes, and clean up data with RegEx
- Structure the data the way you want, with for loops and conditionals to filter out results and text
- Click through a bunch of nested drop-downs and get data that loads dynamically
- Enter thousands of search queries into a search box to get results
- Jump from one website to another so you can do things like scraping wages and then converting the currency on another website
- Open tabs, pop-ups and hidden elements on hover
- Get data from maps
- Input thousands of links for ParseHub to crawl through
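
To make the RegEx clean-up and the loop/conditional filtering from the list above a bit more concrete, here is a rough sketch in plain Python. It is not how ParseHub does it internally; the `raw_rows` data, `PRICE_RE`, `clean_price`, and `clean_rows` are all assumptions for illustration.

```python
import re

# Hypothetical raw rows, as they might come out of a scraper.
raw_rows = [
    {"name": "  Widget ", "price": "Price: $1,299.50 USD"},
    {"name": "Gadget", "price": "Call for pricing"},
    {"name": "Gizmo\n", "price": "$45"},
]

# A dollar sign followed by digits, optional thousands commas and decimals.
PRICE_RE = re.compile(r"\$\s*([\d,]+(?:\.\d+)?)")


def clean_price(text):
    """Return the price as a float, or None if the text has no dollar amount."""
    match = PRICE_RE.search(text)
    if match is None:
        return None
    return float(match.group(1).replace(",", ""))


def clean_rows(rows):
    """Loop over rows, drop those without a usable price, tidy up the rest."""
    cleaned = []
    for row in rows:
        price = clean_price(row["price"])
        if price is None:  # conditional filter: skip rows with no price
            continue
        cleaned.append({"name": row["name"].strip(), "price": price})
    return cleaned


result = clean_rows(raw_rows)
```

The regular expression pulls the number out of the surrounding text, and the loop plus conditional drops rows that are not worth keeping.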

If you have a super shitty site (a government site, for example), send it my way and I will make sure we can get data from it.

You can give it a try here.

Thanks :D