Data Acquisition
Public Data
Private Data
Manual Data Acquisition
🤖 Web scraping tools: e.g. requests + BeautifulSoup (used in the steps below), Scrapy, Selenium
Application Programming Interfaces (APIs):
- APIs are built around the HTTP Request/Response Cycle
- Many APIs require users to sign up for an API key that uniquely identifies them and lets the provider keep a record of all their requests
The API process: 💻 Client sends a Request → ⚙ API performs the Search against the 🗄 Server → the Response travels back through the API to the client.
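A minimal sketch of one turn of that request/response cycle, assuming a hypothetical endpoint (api.example.com) and a placeholder API key:

```python
import requests

# Hypothetical endpoint and key -- substitute a real API's values.
URL = "https://api.example.com/v1/search"
API_KEY = "your-api-key-here"  # identifies you; lets the provider log your requests

# Client sends the Request; many APIs take the key as a parameter or header.
response = requests.get(URL, params={"q": "books", "api_key": API_KEY}, timeout=10)

# The Response carries a status code, headers, and a body.
print(response.status_code)                  # e.g. 200 on success
print(response.headers.get("Content-Type"))  # often "application/json"
print(response.json())                       # decoded JSON body as a Python object
```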
Ethics: with all data acquisition methods, there are important ethical considerations to keep in mind:
- Who owns the data uploaded to a website by users?
- When and how should users of services be notified that data about them is being acquired?
- What kinds of data should be restricted from being acquired about users?
- How can users protect their privacy and know when it has been breached?
Data acquisition should be:
- Fair
- Transparent
- Respectful
API steps
1) Import the requests library
2) Use the .get() method to fetch the data from the desired URL (it returns a Response object)
3) Use the .json() method to access the decoded JSON data as a Python object
4) Import the csv library
5) Use a csv writer's .writerows() method to write the JSON records out as a CSV file
6) Import the pandas library and use the .read_csv() function to read the CSV data into a dataframe object (all six steps are combined in the sketch below)
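Put together, a minimal sketch of those six steps, assuming a hypothetical endpoint that returns a JSON list of records with consistent keys:

```python
import csv

import pandas as pd
import requests

# 1-3) Fetch the URL and decode the JSON body (hypothetical endpoint).
response = requests.get("https://api.example.com/v1/books", timeout=10)
records = response.json()  # e.g. [{"title": "...", "author": "..."}, ...]

# 4-5) Write the records to CSV, with a header row taken from the keys.
with open("books.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=records[0].keys())
    writer.writeheader()
    writer.writerows(records)

# 6) Read the CSV back into a pandas DataFrame.
df = pd.read_csv("books.csv")
print(df.head())
```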
Rules of scraping:
- Check the website's T&Cs
- Do NOT spam the website with a flood of requests
- Limit yourself to about one request per webpage per second (see the throttling sketch below)
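One simple way to honor that rate limit, assuming a hypothetical list of pages you are permitted to scrape:

```python
import time

import requests

# Hypothetical URLs -- substitute pages whose T&Cs allow scraping.
urls = [
    "https://example.com/page1",
    "https://example.com/page2",
]

for url in urls:
    response = requests.get(url, timeout=10)
    print(url, response.status_code)
    time.sleep(1)  # throttle: at most one request per second
```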
Web scraping steps
1) Import the requests library
2) Use the .get() method to fetch the webpage
3) Import BeautifulSoup from bs4
4) Pass the response's .content to the BeautifulSoup constructor to turn the page into a BeautifulSoup object
5) To retrieve the relevant info, you can use (combined in the sketch below):
- Tags
- .find_all()
- .select()
- .get_text()
- pandas' DataFrame.from_dict() to load the results into a dataframe
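A minimal sketch tying the steps together, assuming a hypothetical page where each quote sits in a <span class="text"> tag and each author in a <small class="author"> tag:

```python
import pandas as pd
import requests
from bs4 import BeautifulSoup

# Hypothetical target -- check the site's T&Cs before scraping for real.
url = "https://example.com/quotes"

# 1-2) Fetch the webpage.
response = requests.get(url, timeout=10)

# 3-4) Parse the raw HTML into a BeautifulSoup object.
soup = BeautifulSoup(response.content, "html.parser")

# 5) Retrieve the relevant info via tags/selectors (assumed class names).
data = {
    "quote": [s.get_text(strip=True) for s in soup.find_all("span", class_="text")],
    "author": [s.get_text(strip=True) for s in soup.select("small.author")],
}

# Load the scraped columns into a DataFrame.
df = pd.DataFrame.from_dict(data)
print(df.head())
```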