Los Angeles Times News Scraper

This Python project is a web scraping tool that extracts news articles from the Los Angeles Times website based on user input. Users supply a search phrase, a news section, and a number of months; the script restricts the results to the most recent N months and collects key information from each article. Browser automation is handled by Selenium via rpaframework.
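
The month filter can be sketched as a cutoff-date calculation. This is only an illustration, not the code in tasks.py: the function names `cutoff_date` and `is_recent` are hypothetical, and it assumes that N months means the current month plus the N-1 months before it, with 0 treated the same as 1.

```python
from datetime import date


def cutoff_date(months: int, today: date) -> date:
    """Return the first day of the earliest month inside the window."""
    # Assumption: 0 behaves like 1, i.e. the current month only.
    span = max(months, 1)
    # Number the months consecutively so stepping back across January
    # is plain integer arithmetic.
    index = today.year * 12 + (today.month - 1) - (span - 1)
    return date(index // 12, index % 12 + 1, 1)


def is_recent(published: date, months: int, today: date) -> bool:
    """True if an article's publication date falls inside the window."""
    return cutoff_date(months, today) <= published <= today
```

Under these assumptions, `cutoff_date(3, date(2024, 1, 10))` gives 1 November 2023, so articles from November, December, and January are kept.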

Key Features:

User Input-Driven Search: Users can enter a search phrase, a news section, and the number of months to filter.
Web Scraping: Automates interaction with the Los Angeles Times website, collecting news articles and saving relevant data.
Data Extraction: Extracts the following details for each news article:
- Title
- Publication date
- Description
- Associated image filename
Excel File Output: Saves the collected data into an Excel file, with:
- A column for the count of search phrases in the title and description.
- A column indicating whether any mention of money (in various formats) appears in the title or description.
No API or Web Requests: The script operates solely through browser automation without relying on APIs or external web requests.
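
The two computed Excel columns could be produced along the following lines. This is a minimal sketch rather than the project's actual implementation: the helper names `count_phrase` and `contains_money` are hypothetical, and because the README does not list the money formats, the pattern assumes four common ones ($11.1, $111,111.11, 11 dollars, 11 USD).

```python
import re

# Assumed money formats: "$11.1", "$111,111.11", "11 dollars", "11 USD".
MONEY_PATTERN = re.compile(
    r"\$\d[\d,]*(?:\.\d+)?"                           # dollar sign prefix
    r"|\b\d[\d,]*(?:\.\d+)?\s*(?:dollars?|USD)\b",    # word or code suffix
    re.IGNORECASE,
)


def count_phrase(phrase: str, title: str, description: str) -> int:
    """Count case-insensitive occurrences of phrase in title and description."""
    if not phrase:
        return 0  # str.count("") would otherwise return len(text) + 1
    text = f"{title} {description}".lower()
    return text.count(phrase.lower())


def contains_money(title: str, description: str) -> bool:
    """True if the title or description mentions an amount of money."""
    return MONEY_PATTERN.search(f"{title} {description}") is not None
```

Note that `count_phrase` matches substrings, so searching for "rain" also counts "rainfall"; a word-boundary regular expression would be needed if only whole words should count.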

Repository Files:

- .gitignore
- README.md
- conda.yaml
- robot.yaml
- tasks.py
- topics_dict.py
