A powerful, modern Python application for downloading, processing, and analyzing SEC EDGAR filings with professional-grade workflow automation.
Quick Start • Documentation • Examples • Installation • Workflows
py-sec-edgar transforms complex SEC filing data into accessible, structured information with enterprise-grade reliability and ease of use:
- Professional Workflow System: Four specialized workflows for different data collection needs
- High-Performance Processing: Efficient bulk download and processing of SEC archives
- Advanced Filtering: Filter by ticker symbols, form types, date ranges, and more
- Real-Time Monitoring: RSS feed integration for live filing notifications
- Structured Data Extraction: Extract and parse filing contents automatically
- Enterprise-Ready: Robust error handling, logging, and configuration management
- Modern Python: Built with Python 3.10+, type hints, and modern best practices
- Investment Research: Download 10-K/10-Q filings for fundamental analysis
- Compliance Monitoring: Track insider trading (Form 4) and ownership changes
- News & Events: Monitor 8-K filings for material corporate events
- Academic Research: Bulk download historical filing data for studies
- Machine Learning: Create datasets for NLP and financial prediction models
- Portfolio Management: Automated due diligence for investment portfolios
Get up and running with py-sec-edgar in under 2 minutes:
# Install uv if you haven't already
pip install uv
# Clone and setup the project
git clone https://github.com/ryansmccoy/py-sec-edgar.git
cd py-sec-edgar
# Install dependencies
uv sync
# Verify installation
uv run python -m py_sec_edgar --help
# First, explore what's available without downloading (safe exploration)
uv run python -m py_sec_edgar workflows rss --show-entries --count 10 --list-only
# See what Apple filings are available without downloading
uv run python -m py_sec_edgar workflows daily --tickers AAPL --days-back 7 --forms "8-K" --no-download
# When ready, download Apple's latest 10-K annual report (includes 2025Q3 data)
uv run python -m py_sec_edgar workflows full-index --tickers AAPL --forms "10-K" --download --extract
# Process the latest quarterly data (2025Q3)
uv run python -m py_sec_edgar workflows full-index --quarter 2025Q3 --download --extract
# Monitor recent filings for your portfolio (explore first, then download)
uv run python -m py_sec_edgar workflows daily --tickers AAPL --tickers MSFT --tickers GOOGL --days-back 7 --forms "8-K" --no-download
# When satisfied, add --download flag to actually download files
# Monitor Apple's earnings announcement from August 1, 2024
uv run python -m py_sec_edgar workflows daily --tickers AAPL --start-date 2024-08-01 --end-date 2024-08-01 --forms "8-K" --download --extract
# Real-time RSS monitoring (list mode for exploration)
uv run python -m py_sec_edgar workflows rss --show-entries --count 10 --list-only
# Your downloaded filings are organized like this:
sec_data/
└── Archives/edgar/data/
    └── 320193/                        # Apple's CIK
        └── 000032019324000123/        # Specific filing
            ├── aapl-20240930.htm      # Main 10-K document
            ├── exhibits/              # All exhibits
            └── Financial_Report.xlsx  # Structured financial data
- Python 3.10+ (Required)
- uv package manager (Recommended) or pip
- 5GB+ disk space for substantial data collection
# Clone the repository
git clone https://github.com/ryansmccoy/py-sec-edgar.git
cd py-sec-edgar
# Install with uv (handles everything automatically)
uv sync
# Install with pip (alternative)
pip install -e .
# Install from PyPI
pip install py-sec-edgar
# Or with uv
uv pip install py-sec-edgar
# For production environments (quotes keep shells such as zsh from expanding the brackets)
uv pip install "py-sec-edgar[prod]"
# For development with all tools
uv sync --extra dev
py-sec-edgar provides four specialized workflows, each optimized for different use cases. Each workflow has comprehensive documentation with dozens of real-world examples:
| Workflow | Best For | Data Source | Time Range | Full Documentation |
|---|---|---|---|---|
| Full Index | Historical research, bulk analysis | Quarterly archives | All historical data | Complete Guide |
| Daily | Recent monitoring, current events | Daily index feeds | Last 1-90 days | Complete Guide |
| Monthly | XBRL structured data | Monthly XBRL archives | Monthly intervals | Complete Guide |
| RSS | Real-time monitoring | Live RSS feeds | Real-time updates | Complete Guide |
Understanding when SEC data is available helps you choose the right workflow for your needs:
| Data Type | Update Frequency | Availability | Best Workflow | Notes |
|---|---|---|---|---|
| Live Filings | Real-time | As filed | RSS | Immediate access to new filings |
| Daily Index | Nightly at 10 PM ET | Previous business day | Daily | Complete daily filing lists |
| Full Index | Updated throughout quarter | Current quarter + historical | Full Index | Comprehensive quarterly data |
| Quarterly Index | End of quarter | Complete quarter (static) | Full Index | Final quarterly archives |
| Weekly Rebuild | Saturday mornings | All corrected data | All workflows | Post-acceptance corrections included |
Key Update Schedule Details:
- Daily Index Files: Updated nightly starting around 10:00 PM ET with the previous business day's filings
- Full Index Files: Updated continuously throughout the current quarter, including all filings from quarter start through the previous business day
- Quarterly Index Files: Static archives created at quarter-end containing the complete, final quarterly data
- Weekly Rebuilds: Every Saturday morning, all full and quarterly index files are rebuilt to incorporate post-acceptance corrections and amendments
- Real-time RSS: Live feed updated immediately as filings are accepted by the SEC
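The quarterly layout behind the Full Index files can be made concrete with a little date arithmetic. The sketch below is not part of py-sec-edgar; the helper names are hypothetical. It assumes EDGAR's calendar-quarter directory scheme (`full-index/YYYY/QTRn/`) and the `YYYYQn` label used by the `--quarter` examples in this README.

```python
from datetime import date

# Base URL of EDGAR's quarterly index directories.
FULL_INDEX_BASE = "https://www.sec.gov/Archives/edgar/full-index"


def quarter_of(day: date) -> int:
    """Return the calendar quarter (1-4) that contains `day`."""
    return (day.month - 1) // 3 + 1


def quarter_label(day: date) -> str:
    """Format a date as a YYYYQn label, e.g. 2025Q3."""
    return f"{day.year}Q{quarter_of(day)}"


def full_index_url(day: date) -> str:
    """Build the full-index directory URL for the quarter containing `day`."""
    return f"{FULL_INDEX_BASE}/{day.year}/QTR{quarter_of(day)}/"


if __name__ == "__main__":
    sample = date(2025, 8, 15)
    print(quarter_label(sample))
    print(full_index_url(sample))
```

A label produced this way can be passed to `--quarter`, and the URL shows which remote directory that quarter corresponds to.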
Data Currency Best Practices:
- For current events: Use RSS workflow for immediate access to breaking filings
- For recent activity: Use Daily workflow for systematic monitoring of the last 1-90 days
- For historical research: Use Full Index workflow for comprehensive quarterly archives
- For completeness: Wait until the Saturday morning rebuild for the most accurate quarterly data
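The guidance above reduces to a simple rule keyed on how old the data you need is. Here is a minimal sketch of that rule. The function name and the exact cut-offs are assumptions drawn from the table (RSS for same-day filings, Daily for the last 1-90 days, Full Index beyond that), not logic taken from py-sec-edgar.

```python
def choose_workflow(age_days: int) -> str:
    """Suggest a workflow name for data that is `age_days` old.

    Returns one of the CLI workflow names used in this README:
    "rss", "daily", or "full-index".
    """
    if age_days < 0:
        raise ValueError("age_days must be zero or positive")
    if age_days == 0:
        # Same-day filings are only visible in the live feed.
        return "rss"
    if age_days <= 90:
        # The daily index covers roughly the last 1-90 days.
        return "daily"
    # Anything older is served from the quarterly archives.
    return "full-index"


if __name__ == "__main__":
    for age in (0, 7, 365):
        print(age, "->", choose_workflow(age))
```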
Pro Tip: Each workflow documentation contains 20+ practical examples, from basic usage to advanced enterprise patterns. Start with the Workflow Documentation Hub for complete coverage!
Perfect for comprehensive historical analysis and bulk data collection
# First, explore what's available for Apple without downloading
uv run python -m py_sec_edgar workflows full-index --tickers AAPL --no-download
# When ready, download all Apple filings from quarterly archives
uv run python -m py_sec_edgar workflows full-index --tickers AAPL --download
# Process the latest quarterly data (2025Q3) with extraction
uv run python -m py_sec_edgar workflows full-index --quarter 2025Q3 --download --extract
# Investment research: Explore tech giants first, then download
uv run python -m py_sec_edgar workflows full-index \
--tickers AAPL --tickers MSFT --tickers GOOGL --tickers AMZN --tickers META \
--forms "10-K" \
--no-download # Remove this flag when ready to download
# Academic research: Fortune 500 analysis with latest data
uv run python -m py_sec_edgar workflows full-index \
--ticker-file examples/fortune500.csv \
--forms "10-K" "10-Q" \
--quarter 2025Q3 \
--download --extract
Ideal for monitoring recent activity and staying current
# Explore yesterday's filings without downloading first
uv run python -m py_sec_edgar workflows daily --days-back 1 --no-download
# When ready, download yesterday's filings
uv run python -m py_sec_edgar workflows daily --days-back 1 --download
# Weekly portfolio monitoring (explore first)
uv run python -m py_sec_edgar workflows daily \
--ticker-file examples/portfolio.csv \
--days-back 7 \
--forms "8-K" "4" \
--no-download # Remove this flag when ready to download
# Monitor Apple's specific earnings announcement (August 1, 2024)
uv run python -m py_sec_edgar workflows daily \
--tickers AAPL \
--start-date 2024-08-01 \
--end-date 2024-08-01 \
--forms "8-K" \
--download --extract # Direct download since we know what we want
Specialized for XBRL structured financial data
# Explore what structured financial data is available (6 months)
uv run python -m py_sec_edgar workflows monthly --months-back 6 --no-download
# Download structured financial data when ready
uv run python -m py_sec_edgar workflows monthly --months-back 6 --download
# Focus on specific companies with extraction
uv run python -m py_sec_edgar workflows monthly \
--tickers AAPL --tickers MSFT \
--months-back 12 \
--download --extract
Real-time monitoring and live feed processing
# Explore latest filings in real-time (safe exploration)
uv run python -m py_sec_edgar workflows rss --show-entries --count 20 --list-only
# Monitor specific companies (list mode first)
uv run python -m py_sec_edgar workflows rss \
--query-ticker AAPL \
--count 10 \
--show-entries --list-only
# When ready to process/download, remove --list-only flag
uv run python -m py_sec_edgar workflows rss \
--query-ticker AAPL \
--count 10 \
--download
# Save RSS data for analysis (no download, just save feed data)
uv run python -m py_sec_edgar workflows rss \
--save-file rss_filings.json \
--count 100 \
--list-only
Scenario: You're analyzing potential investments in the renewable energy sector.
# Step 1: Use the provided renewable energy ticker list
# File: examples/renewable_energy.csv (already created)
# Step 2: Explore historical annual reports first (no download)
uv run python -m py_sec_edgar workflows full-index \
--ticker-file examples/renewable_energy.csv \
--forms "10-K" \
--no-download
# Step 3: When ready, get historical annual reports with extraction
uv run python -m py_sec_edgar workflows full-index \
--ticker-file examples/renewable_energy.csv \
--forms "10-K" \
--download --extract
# Step 4: Process specific quarterly filings (2025Q3)
uv run python -m py_sec_edgar workflows full-index \
--ticker-file examples/renewable_energy.csv \
--quarter 2025Q3 \
--forms "10-Q" \
--download --extract
# Step 5: Monitor recent Tesla activity (last 30 days for better data coverage)
uv run python -m py_sec_edgar workflows daily \
--tickers TSLA \
--days-back 30 \
--forms "8-K" \
--no-download # Explore first, then add --download when ready
# Step 6: Set up real-time monitoring (exploration mode)
uv run python -m py_sec_edgar workflows rss \
--query-ticker TSLA \
--count 10 \
--show-entries --list-only
Result: Complete dataset with historical context, recent activity, and real-time monitoring setup.
Scenario: Studying CEO compensation trends across S&P 500 companies.
# Step 1: Explore proxy statements availability (no download)
uv run python -m py_sec_edgar workflows full-index \
--ticker-file examples/sp500_tickers.csv \
--forms "DEF 14A" \
--no-download
# Step 2: Download proxy statements when ready
uv run python -m py_sec_edgar workflows full-index \
--ticker-file examples/sp500_tickers.csv \
--forms "DEF 14A" \
--download --extract
# Step 3: Process latest quarterly data (2025Q3) for comprehensive analysis
uv run python -m py_sec_edgar workflows full-index \
--ticker-file examples/sp500_tickers.csv \
--quarter 2025Q3 \
--forms "10-Q" "DEF 14A" \
--download --extract
# Step 4: Get recent quarterly filings (last 60 days for good data coverage)
uv run python -m py_sec_edgar workflows daily \
--ticker-file examples/sp500_tickers.csv \
--days-back 60 \
--forms "10-Q" \
--no-download # Explore first
# Step 5: Extract structured financial data for analysis
uv run python -m py_sec_edgar workflows monthly \
--ticker-file examples/sp500_tickers.csv \
--months-back 12 \
--download --extract
Scenario: Monitor insider trading and ownership changes for your portfolio.
# Step 1: Explore recent insider trading (Form 4) - last 7 days
uv run python -m py_sec_edgar workflows daily \
--ticker-file examples/portfolio.csv \
--days-back 7 \
--forms "4" \
--no-download # Explore first
# Step 2: When ready, download recent insider trading data
uv run python -m py_sec_edgar workflows daily \
--ticker-file examples/portfolio.csv \
--days-back 14 \
--forms "4" \
--download --extract
# Step 3: Track large ownership changes (last 30 days)
uv run python -m py_sec_edgar workflows daily \
--ticker-file examples/portfolio.csv \
--days-back 30 \
--forms "SC 13G" "SC 13D" \
--download --extract
# Step 4: Set up real-time insider trading alerts (exploration mode)
uv run python -m py_sec_edgar workflows rss \
--query-form "4" \
--count 25 \
--show-entries --list-only
Scenario: Stay ahead of market-moving news with automated 8-K monitoring.
# Monitor Apple's recent activity (last 30 days for good coverage)
uv run python -m py_sec_edgar workflows daily \
--tickers AAPL \
--days-back 30 \
--forms "8-K" \
--no-download # Explore first, then add --download
# Monitor Tesla's recent activity (last 30 days)
uv run python -m py_sec_edgar workflows daily \
--tickers TSLA \
--days-back 30 \
--forms "8-K" \
--no-download # Explore first
# When ready to download Apple's recent annual reports
uv run python -m py_sec_edgar workflows daily \
--tickers AAPL \
--days-back 90 \
--forms "10-K" \
--download --extract
# Set up comprehensive current events monitoring (exploration mode)
uv run python -m py_sec_edgar workflows rss \
--query-form "8-K" \
--show-entries \
--count 25 \
--list-only
# Advanced: Monitor multiple companies for 8-K filings
uv run python -m py_sec_edgar workflows daily \
--ticker-file examples/portfolio.csv \
--days-back 14 \
--forms "8-K" \
--no-download
py-sec-edgar makes it easy to work with SEC filings, but understanding what each form contains helps you choose the right data:
| Form | Description | Frequency | Key Content |
|---|---|---|---|
| 10-K | Annual Report | Yearly | Complete business overview, audited financials, risk factors |
| 10-Q | Quarterly Report | Quarterly | Unaudited quarterly financials, updates since last 10-K |
| 8-K | Current Events | As needed | Material corporate events, breaking news |
| DEF 14A | Proxy Statement | Annually | Executive compensation, board elections, shareholder proposals |
| 4 | Insider Trading | Within 2 business days of the transaction | Executive stock transactions |
| SC 13G/D | Beneficial Ownership | When threshold crossed | Large shareholder positions (>5%) |
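Form types like the ones in this table appear as a column in EDGAR's quarterly index files, so filtering by form comes down to matching that column. The sketch below assumes the pipe-delimited `master.idx` layout (`CIK|Company Name|Form Type|Date Filed|Filename`). The sample rows are made up for illustration, and the helper names are hypothetical rather than py-sec-edgar's own filter.

```python
from typing import Iterable, Iterator, List, NamedTuple


class IndexRow(NamedTuple):
    cik: int
    company: str
    form_type: str
    date_filed: str
    filename: str


def parse_index_rows(lines: Iterable[str]) -> Iterator[IndexRow]:
    """Yield one IndexRow per data line, skipping headers and separators."""
    for line in lines:
        parts = line.rstrip("\n").split("|")
        if len(parts) != 5 or not parts[0].isdigit():
            continue  # not a data row
        cik, company, form_type, date_filed, filename = parts
        yield IndexRow(int(cik), company, form_type, date_filed, filename)


def filter_by_forms(rows: Iterable[IndexRow], forms: Iterable[str]) -> List[IndexRow]:
    """Keep only rows whose form type is in `forms` (case-insensitive)."""
    wanted = {form.upper() for form in forms}
    return [row for row in rows if row.form_type.upper() in wanted]


# Illustrative rows only; these are not real filings.
SAMPLE_LINES = [
    "CIK|Company Name|Form Type|Date Filed|Filename",
    "--------------------------------------------------",
    "1000001|Example Corp A|10-K|2024-11-01|edgar/data/1000001/0001000001-24-000001.txt",
    "1000001|Example Corp A|8-K|2024-08-01|edgar/data/1000001/0001000001-24-000002.txt",
    "1000002|Example Corp B|10-Q|2024-10-30|edgar/data/1000002/0001000002-24-000003.txt",
]

if __name__ == "__main__":
    rows = list(parse_index_rows(SAMPLE_LINES))
    for row in filter_by_forms(rows, ["10-K", "10-Q"]):
        print(row.company, row.form_type, row.date_filed)
```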
SEC Website Structure:
https://www.sec.gov/Archives/edgar/data/[CIK]/[AccessionNumber]/[Filename]
py-sec-edgar Local Structure:
sec_data/
└── Archives/edgar/
    ├── full-index/                # Downloaded quarterly archives
    │   ├── 2024/QTR1/
    │   ├── 2024/QTR2/
    │   └── 2025/QTR3/             # Latest quarterly data
    └── data/                      # Extracted filing contents
        └── [CIK]/                 # Company folders (e.g., 320193 for Apple)
            └── [Filing]/          # Individual filing folders
                ├── main_document.htm
                ├── exhibits/
                └── Financial_Report.xlsx
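The remote URL pattern and the local layout mirror each other, which a short sketch can show. The helper names here are hypothetical and not part of py-sec-edgar's API. The code assumes the filing folder is the accession number with its dashes removed, as in the Apple example earlier in this README.

```python
from pathlib import Path

EDGAR_ARCHIVE_BASE = "https://www.sec.gov/Archives/edgar/data"


def accession_folder(accession_number: str) -> str:
    """Turn 0000320193-24-000123 into the folder name 000032019324000123."""
    return accession_number.replace("-", "")


def filing_url(cik, accession_number: str, filename: str) -> str:
    """Build the sec.gov URL of one document inside a filing."""
    # int() drops any zero padding, since archive URLs use the unpadded CIK.
    folder = accession_folder(accession_number)
    return f"{EDGAR_ARCHIVE_BASE}/{int(cik)}/{folder}/{filename}"


def local_filing_dir(data_dir, cik, accession_number: str) -> Path:
    """Build the matching local folder under the data directory."""
    folder = accession_folder(accession_number)
    return Path(data_dir) / "Archives" / "edgar" / "data" / str(int(cik)) / folder


if __name__ == "__main__":
    print(filing_url(320193, "0000320193-24-000123", "aapl-20240930.htm"))
    print(local_filing_dir("sec_data", 320193, "0000320193-24-000123"))
```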
Central Index Key (CIK): Unique numerical identifier assigned by SEC
- Example: Apple Inc. = 320193
- Permanent, never recycled
- Used in all SEC filings and URLs
Ticker Symbol: Stock exchange trading symbol
- Example: AAPL for Apple Inc.
- Can change due to rebranding, mergers
- py-sec-edgar handles ticker-to-CIK mapping automatically
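To illustrate what a ticker-to-CIK lookup involves, here is a minimal sketch. It does not show py-sec-edgar's internal mapping. The record shape (`cik_str`, `ticker`, `title`) is an assumption modeled on the SEC's published company ticker list, and the function names are hypothetical.

```python
from typing import Dict, Iterable, Mapping


def build_ticker_map(records: Iterable[Mapping]) -> Dict[str, int]:
    """Index records by upper-cased ticker symbol."""
    return {str(rec["ticker"]).upper(): int(rec["cik_str"]) for rec in records}


def lookup_cik(ticker: str, ticker_map: Mapping[str, int]) -> int:
    """Resolve a ticker to its CIK, ignoring case and surrounding spaces."""
    key = ticker.strip().upper()
    if key not in ticker_map:
        raise KeyError(f"Unknown ticker: {ticker!r}")
    return ticker_map[key]


def pad_cik(cik: int) -> str:
    """Zero-pad a CIK to the 10-digit form used in some SEC identifiers."""
    return f"{cik:010d}"


# A one-record sample using the Apple example from this README.
SAMPLE_RECORDS = [{"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."}]

if __name__ == "__main__":
    mapping = build_ticker_map(SAMPLE_RECORDS)
    cik = lookup_cik("aapl", mapping)
    print(cik, pad_cik(cik))
```

Keying on the CIK rather than the ticker keeps downloaded data stable when a company changes its symbol.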
| Form Type | Total Filings | Average per Year | Primary Use Case |
|---|---|---|---|
| Form 4 | 6,420,154 | ~800,000 | Insider trading monitoring |
| 8-K | 1,473,193 | ~180,000 | Breaking news and events |
| 10-Q | 552,059 | ~70,000 | Quarterly earnings analysis |
| 10-K | 180,787 | ~22,000 | Annual comprehensive analysis |
| 13F-HR | 224,996 | ~28,000 | Institutional holdings tracking |
py-sec-edgar works out of the box with sensible defaults from .env.example. For custom configuration, create a .env file:
# Copy the example file and customize
cp .env.example .env
Key environment variables:
# SEC Data Directory (cross-platform)
SEC_DATA_DIR=./sec_data
# User Agent (Required by SEC)
USER_AGENT="YourCompany [email protected]"
# Request Settings (Conservative defaults)
REQUEST_DELAY=5.5
MAX_RETRIES=3
# Logging Configuration
LOG_LEVEL=WARNING
DEBUG=false
Important: You must update USER_AGENT with your contact information for production use, as required by SEC guidelines.
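If you want to read the same variables in your own scripts, a small loader is enough. This is a sketch under stated assumptions, not py-sec-edgar's configuration code. The `Settings` class and `load_settings` function are hypothetical, the defaults are copied from the example values above, and rejecting a missing `USER_AGENT` is a choice made here.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    sec_data_dir: str
    user_agent: str
    request_delay: float
    max_retries: int
    log_level: str
    debug: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from an environment-like mapping, applying defaults."""
    user_agent = env.get("USER_AGENT", "").strip()
    if not user_agent:
        # Fail early rather than send anonymous requests.
        raise ValueError("USER_AGENT must be set to your contact information")
    return Settings(
        sec_data_dir=env.get("SEC_DATA_DIR", "./sec_data"),
        user_agent=user_agent,
        request_delay=float(env.get("REQUEST_DELAY", "5.5")),
        max_retries=int(env.get("MAX_RETRIES", "3")),
        log_level=env.get("LOG_LEVEL", "WARNING").upper(),
        debug=env.get("DEBUG", "false").strip().lower() in {"1", "true", "yes"},
    )


if __name__ == "__main__":
    # Placeholder contact details; replace with your own.
    print(load_settings({"USER_AGENT": "ExampleCompany admin@example.com"}))
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test.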
Create CSV files with ticker symbols (or use the provided examples):
# examples/portfolio.csv
TICKER
AAPL
MSFT
GOOGL
AMZN
TSLA
# Or simple format
AAPL
MSFT
GOOGL
py-sec-edgar provides two Python APIs for programmatic usage:
import asyncio

from py_sec_edgar import SEC, Forms


async def main():
    async with SEC(data_dir="./sec_data") as sec:
        # Download filings for specific companies
        result = await sec.download(
            tickers=["AAPL", "MSFT"],
            forms=[Forms.FORM_10K],
            days=365
        )
        print(f"Downloaded {result.file_count} files")
        # List downloaded filings
        filings = await sec.list_filings(ticker="AAPL")


# "async with" and "await" are only valid inside a coroutine, so run one
asyncio.run(main())
from py_sec_edgar import SECFeed, SECFeedConfig
from py_sec_edgar.reporters import RichProgressReporter


async def run_feed(filing_url):
    # filing_url: URL of the filing document you want to download
    # SECFeed provides: DuckDB storage, blob storage, search, caching
    async with SECFeed(
        tickers=["AAPL", "MSFT"],
        forms=["10-K", "10-Q"],
        days=365,
        enable_search=True,
        enable_cache=True,
    ) as feed:
        # Collect with progress reporting
        await feed.collect(progress=RichProgressReporter())
        # Typed access to filings
        async for filing in feed.filings(form_type="10-K"):
            print(f"{filing.content.company_name}: {filing.content.accession_number}")
        # Full-text search
        results = await feed.search("revenue growth", limit=10)
        # Download documents (cached in blob storage)
        doc = await feed.download_document(filing_url)


# Run with asyncio.run(run_feed(filing_url)) once you have a document URL
from py_sec_edgar.workflows import (
    run_full_index_workflow,
    run_daily_workflow,
    run_monthly_workflow,
    run_rss_workflow
)
# Run full index workflow
run_full_index_workflow(
    tickers=["AAPL", "MSFT"],
    forms=["10-K", "10-Q"],
    extract=True
)
# Monitor recent filings
run_daily_workflow(
    tickers=["AAPL", "MSFT"],
    days_back=7,
    forms=["8-K"],
    extract=True
)
# Clone repository
git clone https://github.com/ryansmccoy/py-sec-edgar.git
cd py-sec-edgar
# Setup development environment
uv sync --extra dev
# Install pre-commit hooks
uv run pre-commit install
# Run tests
uv run pytest
# Run linting
uv run ruff check
uv run ruff format
# Type checking
uv run mypy src/
# Run all tests
uv run pytest
# Run with coverage
uv run pytest --cov=py_sec_edgar --cov-report=html
# Run specific test categories
uv run pytest -m "not slow" # Skip slow tests
uv run pytest -m integration # Integration tests only
uv run pytest tests/test_filing.py # Specific test file
# Test with small dataset
uv run python -m py_sec_edgar workflows full-index \
--tickers AAPL \
--forms "10-K" \
--no-extract
# Benchmark larger operations
time uv run python -m py_sec_edgar workflows daily \
--tickers AAPL --tickers MSFT --tickers GOOGL \
--days-back 30 \
--extract
- Full Documentation: Complete API reference and guides
- Workflow Documentation Hub: Detailed workflow guides with comprehensive examples
  - Full Index Workflow: Complete quarterly archive processing for historical research
  - Daily Workflow: Recent filings monitoring and systematic updates
  - Monthly Workflow: XBRL structured data processing for quantitative analysis
  - RSS Workflow: Real-time RSS feed processing with advanced querying
- CLI Reference: Complete command-line interface documentation
- Configuration Guide: Environment and settings configuration
- API Reference: Programmatic usage documentation
- Troubleshooting: Common issues and solutions
- User Agent Required: The SEC requires a proper User-Agent header with your contact information
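- Rate Limiting: py-sec-edgar rate-limits its requests; the delay is configurable through REQUEST_DELAY (5.5 seconds in the example configuration above), and the SEC asks clients to stay at or below 10 requests per second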
- Fair Use: Please be respectful of SEC resources and don't overwhelm their servers
- Full Index Processing: Can generate several GB of data per quarter
- Extracted Content: Individual filings can be 10-100MB when extracted
- Recommendation: Start with specific tickers/forms, then scale up
- Public Data Only: All data accessed is publicly available SEC filings
- No Personal Info: py-sec-edgar only accesses corporate disclosure documents
- Compliance Ready: Suitable for professional and academic use
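The User-Agent and rate-limiting notes above can be sketched with the standard library alone. This is an illustration rather than py-sec-edgar's own HTTP client. The `RateLimiter` class and the `build_request` and `fetch` helpers are hypothetical names, and the contact string in the usage line is a placeholder.

```python
import time
import urllib.request


class RateLimiter:
    """Enforce a minimum interval, in seconds, between consecutive calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def build_request(url, user_agent):
    """Attach the contact User-Agent header to a request."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent, limiter, timeout=30.0):
    """Fetch a URL after waiting for the limiter; returns the response body."""
    limiter.wait()
    with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
        return resp.read()


# Usage (not executed here):
#   limiter = RateLimiter(5.5)
#   body = fetch(some_url, "ExampleCompany admin@example.com", limiter)
```

The clock and sleep functions are injectable so the pacing logic can be tested without real delays or network access.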
We welcome contributions! Here's how to get started:
- Check existing issues
- Create detailed bug reports with examples
- Include system information and error logs
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes with tests
- Run the test suite: uv run pytest
- Submit a pull request
- Improve existing documentation
- Add new examples and use cases
- Create tutorials for specific workflows
py-sec-edgar is dual-licensed:
- Personal Use: MIT License (free for personal, educational, and research use)
- Commercial Use: GNU AGPLv3 License (free with copyleft requirements)
- Business Licensing: Contact [email protected] for commercial licensing options
See LICENSE for full details.
- GitHub Issues: Bug reports and feature requests
- Discussions: Questions and community support
- Documentation: Comprehensive guides and API reference
- Business Inquiries: [email protected]
- Commercial Licensing: Available for enterprise use
- Custom Development: Professional services available
- SEC EDGAR System: For providing free access to corporate filing data
- Python Community: For the excellent libraries that make this project possible
- Contributors: Everyone who has contributed code, documentation, and feedback
- Users: The community that drives continuous improvement
⭐ Star this repository if py-sec-edgar helps your financial analysis! ⭐
Built with ❤️ for the financial analysis and research community
Homepage • Docs • Issues • Discussions