I solemnly swear that I am up to social good.
At the most recent Cleveland PyLadies meetup, I vowed to learn about web scraping for data science. Web scraping can be a controversial subject because of the potential for privacy invasion, intellectual property infringement, and other nefarious activities, but I do feel there are use cases that harm none and can help many.
The Scrapy documentation has a great tutorial, but I enjoyed hearing Michael Galarnyk talk through an example from the tutorial he wrote. I found his video through his article on Towards Data Science.
All in all, this brief introduction to Scrapy took me about half an hour. If I had realized the barrier to entry was this low, I might have started down this path much earlier! As it is, I am glad to have a new avenue for finding interesting data sets. It's not clean data right out of the gate, but being able to organize so much unstructured data into a CSV is a great start!
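To give a sense of what turning unstructured markup into a CSV looks like, here is a minimal sketch. It is not the Scrapy code from the tutorial: it uses only the Python standard library, and the sample HTML, the class names `item`, `name`, and `price`, and the helper `html_to_csv` are all made up for illustration. In a real Scrapy project, a spider and Scrapy's feed exports would handle this step.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical markup standing in for a scraped page (not from the tutorial).
SAMPLE_HTML = """
<ul>
  <li class="item"><span class="name">Widget</span><span class="price">4.50</span></li>
  <li class="item"><span class="name">Gadget</span><span class="price">12.00</span></li>
</ul>
"""


class ItemParser(HTMLParser):
    """Collects one dict per <li class="item"> with its name and price."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "li" and cls == "item":
            self.rows.append({})  # start a new record
        elif tag == "span" and cls in ("name", "price") and self.rows:
            self._field = cls

    def handle_data(self, data):
        if self._field is not None:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def html_to_csv(html):
    """Parse the assumed item markup and return CSV text with a header row."""
    parser = ItemParser()
    parser.feed(html)
    parser.close()
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "price"])
    writer.writeheader()
    writer.writerows(parser.rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(html_to_csv(SAMPLE_HTML))
```

The result still needs cleaning (the prices are strings, for example), which matches my experience: the CSV is a starting point, not a finished data set.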