11 Web Scraping
HTML
Resources for Learning HTML
A Quick Refresher
Viewing the Source HTML of a Web Page
Opening Your Browser’s Developer Tools
Using the Developer Tools to Find HTML Elements
Downloading Files from the Web with the requests Module
Downloading a Web Page with the requests.get() Function
Checking for Errors
Saving Downloaded Files to the Hard Drive
Parsing HTML with the BeautifulSoup Module
Creating a BeautifulSoup Object from HTML
Finding an Element with the select() Method
Getting Data from an Element’s Attributes
Project: mapIt.py with the webbrowser Module
Step 1: Figure Out the URL
Step 2: Handle the Command Line Arguments
Step 3: Handle the Clipboard Content and Launch the Browser
Ideas for Similar Programs
Project: Downloading All XKCD Comics
Step 1: Design the Program
Step 2: Download the Web Page
Step 3: Find and Download the Comic Image
Step 4: Save the Image and Find the Previous Comic
Ideas for Similar Programs
Controlling the Browser with the selenium Module
Starting a Selenium-Controlled Browser
Finding Elements on the Page
Clicking the Page
Filling Out and Submitting Forms
Sending Special Keys
Clicking Browser Buttons
More Information on Selenium
Project: “I’m Feeling Lucky” Google Search
Step 1: Get the Command Line Arguments and Request the Search Page
Step 2: Find All the Results
Step 3: Open Web Browsers for Each Result
Ideas for Similar Programs