Over the past few months I have been spending time learning and experimenting with machine learning (ML) and Big Data. Machine learning seems to require a lot of properly cleaned samples; it is one more case where garbage in means garbage out. That said, the first step is to collect data. Data can come from different sources, e.g., databases, files, public repositories, and the Internet. In general, one can collect data from the Internet using two main approaches: web scraping and calling an API. I will cover both of these approaches in the following posts.