Scrapy on the Fly

I solemnly swear that I am up to social good.

At the most recent Cleveland PyLadies meetup, I vowed to learn about web scraping for data science. Web scraping can be a controversial subject because of the potential for privacy invasion, intellectual property infringement, or other nefarious activities, but I do feel that there are use cases that harm none and can help many.

The Scrapy documentation has a great tutorial, but I enjoyed hearing Michael Galarnyk talk through an example from the tutorial he wrote. I found his video through his article on Towards Data Science.

All in all, this brief introduction to Scrapy took me about half an hour. If I had realized the barrier to entry was this low, I might have started down this path much earlier! As it is, I am glad to have a new avenue for finding interesting data sets. It's not clean data right out of the gate, but being able to organize so much unstructured data into a CSV is a great start!
