Spider: Content extraction through clustering
This project aims to learn site templates by clustering example pages, then to generate CSS selectors that efficiently pinpoint the main content of each page.
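To make the idea concrete, here is a minimal sketch in Python using only the standard library. It is not the project's actual implementation: the names `PathCollector`, `extract`, `cluster`, and `content_selector` are hypothetical, and the greedy Jaccard-threshold clustering stands in for whatever clustering method the project really uses. The assumption is that pages built from the same template share the same element paths, and that the main content sits in the shared path whose text changes from page to page.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class PathCollector(HTMLParser):
    """Maps each CSS-like element path to the text found directly under it."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.text = defaultdict(str)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        step = tag
        if attrs.get("id"):
            step += "#" + attrs["id"]
        elif attrs.get("class"):
            step += "." + ".".join(attrs["class"].split())
        self.stack.append(step)

    def handle_endtag(self, tag):
        # Assumes well-formed markup; real pages need a more forgiving parser.
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and data.strip():
            self.text[" > ".join(self.stack)] += data.strip()


def extract(html):
    """Return {selector path: text} for one page."""
    parser = PathCollector()
    parser.feed(html)
    parser.close()
    return dict(parser.text)


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster(pages, threshold=0.5):
    """Greedy clustering: a page joins the first cluster whose seed page is similar enough."""
    clusters = []
    for page in pages:
        for group in clusters:
            if jaccard(set(page), set(group[0])) >= threshold:
                group.append(page)
                break
        else:
            clusters.append([page])
    return clusters


def content_selector(group):
    """Among paths shared by every page, pick the one with varying text and the most of it."""
    shared = sorted(set.intersection(*(set(page) for page in group)))
    varying = [s for s in shared if len({page[s] for page in group}) > 1]
    if not varying:
        return None  # a single example page cannot reveal what varies
    return max(varying, key=lambda s: sum(len(page[s]) for page in group))


# Two hypothetical templates: an article page and a listing page.
ARTICLE = ('<html><body><div id="nav">Home About</div>'
           '<div class="post"><p>{}</p></div></body></html>')
LISTING = '<html><body><ul class="items"><li>{}</li><li>{}</li></ul></body></html>'

pages = [
    extract(ARTICLE.format("First article body, fairly long text.")),
    extract(ARTICLE.format("Second article body with different words.")),
    extract(LISTING.format("a", "b")),
]
clusters = cluster(pages)
print(content_selector(clusters[0]))  # html > body > div.post > p
```

The navigation text is identical on both article pages, so it is discarded as template boilerplate, while the paragraph under `div.post` varies and is selected. A single-page cluster returns `None`, which reflects why several example pages per template are needed.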
Written by Ziyan Zhou