Data from Websites

Khadija Batool Gardezi
2 min read · Dec 30, 2023

Did you know that web scraping is a great way to gather data from different websites?

In this post, I’ll walk you through how I learned to extract data from a website, offering insights and tips for beginners like myself.

Perhaps you have a question: What exactly is web scraping?

Web scraping is a technique for automatically extracting data from websites. The extracted data can then be used for purposes such as data analysis or machine learning projects. Python libraries like requests and BeautifulSoup make this process a lot easier. So, why not give it a try and explore the possibilities of web scraping?

You can't get the job done without the right tools. Right?

  1. Python: A basic understanding of Python is required.
  2. Libraries: We’ll use requests for handling HTTP requests and BeautifulSoup for parsing HTML content.
  3. Text Editor: Any text editor like VS Code or even a Jupyter Notebook.

Let’s scrape quotes from http://quotes.toscrape.com

First, we need to install the necessary libraries:

pip install requests beautifulsoup4

Then save the following script in a Python file:

import requests
from bs4 import BeautifulSoup

# URL
url = "http://quotes.toscrape.com"

# Sending the request
response = requests.get(url)

# Parsing the HTML
soup = BeautifulSoup(response.text, 'html.parser')

# Extracting the quotes
quotes = soup.find_all('span', class_='text')
for quote in quotes:
    print(quote.text)

Run this script in your Python environment, and you will see a list of quotes printed in your console.
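If the request fails or the site is slow, the script above gives little feedback. Here is a minimal sketch of a slightly more defensive version, assuming you want a timeout and an error on bad HTTP status codes. The helper names `fetch_html` and `extract_quotes` and the `sample_html` string are my own illustrations, not part of the original script.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download a page and return its HTML (hypothetical helper)."""
    response = requests.get(url, timeout=timeout)
    # Raises requests.HTTPError for 4xx/5xx responses
    response.raise_for_status()
    return response.text


def extract_quotes(html):
    """Return the text of every <span class="text"> element."""
    soup = BeautifulSoup(html, "html.parser")
    return [span.get_text() for span in soup.find_all("span", class_="text")]


# Hand-written sample markup so the parsing step can be tried offline
sample_html = """
<div class="quote"><span class="text">First quote.</span></div>
<div class="quote"><span class="text">Second quote.</span></div>
"""

print(extract_quotes(sample_html))
```

To run it against the live site, you could call `extract_quotes(fetch_html("http://quotes.toscrape.com"))`. Keeping the download and the parsing in separate functions makes the parsing easy to test without a network connection.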

Performing web scraping requires targeting specific elements within a webpage's HTML structure. My experience in frontend development has helped me efficiently identify elements by their classes, IDs, and other attributes. This skill has been incredibly useful in locating and extracting the data I need, like quotes from a webpage.
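To make the idea of targeting elements concrete, here is a small sketch showing a few common ways to locate elements with BeautifulSoup: by ID, by class, by CSS selector, and by attribute. The HTML below is a made-up sample, not markup copied from a real site.

```python
from bs4 import BeautifulSoup

# Hand-written sample markup (an assumption for illustration)
html = """
<div id="main">
  <h1 id="page-title">Quotes</h1>
  <span class="text">Sample quote.</span>
  <a class="tag" href="/tag/life/">life</a>
  <a class="tag" href="/tag/books/">books</a>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# By ID: find the single element with id="page-title"
title = soup.find(id="page-title").get_text()

# By class: class_ is used because "class" is a reserved word in Python
quote = soup.find("span", class_="text").get_text()

# By CSS selector: every <a class="tag"> inside <div id="main">
tags = [a.get_text() for a in soup.select("div#main a.tag")]

# By attribute: every <a> that has an href, then read the attribute value
links = [a["href"] for a in soup.find_all("a", href=True)]

print(title)
print(quote)
print(tags)
print(links)
```

On a real page, the browser's developer tools (right-click, then Inspect) are the quickest way to find which classes and IDs to target.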

Web scraping is an important skill to have in today's data-driven world. This project is only the beginning, and there is a lot more to discover and learn. I suggest that you give it a try and modify the script to scrape various types of data.
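As one possible next step, here is a sketch that collects the author and tags along with each quote. It assumes each quote sits in a `div` with class `quote`, with the author in a `small` element with class `author` and the tags in `a` elements with class `tag`. Check the page's actual markup before relying on this; the function name `extract_quote_records` and the sample HTML are hypothetical.

```python
from bs4 import BeautifulSoup


def extract_quote_records(html):
    """Return one dict per quote block with its text, author, and tags."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for block in soup.find_all("div", class_="quote"):
        text = block.find("span", class_="text")
        author = block.find("small", class_="author")
        tags = [a.get_text() for a in block.find_all("a", class_="tag")]
        records.append({
            # Fall back to None if an element is missing from a block
            "text": text.get_text() if text else None,
            "author": author.get_text() if author else None,
            "tags": tags,
        })
    return records


# Hand-written sample markup following the assumed structure
sample_html = """
<div class="quote">
  <span class="text">A sample quote.</span>
  <small class="author">Jane Doe</small>
  <a class="tag" href="#">life</a>
  <a class="tag" href="#">books</a>
</div>
<div class="quote">
  <span class="text">Another one.</span>
  <small class="author">John Roe</small>
</div>
"""

for record in extract_quote_records(sample_html):
    print(record)
```

Returning a list of dictionaries instead of printing directly makes it easier to save the results later, for example to a CSV or JSON file.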

Happy coding!
Khadija Batool Gardezi
