Contributing to LiteParse

Thank you for your interest in contributing to LiteParse! This document provides guidelines and information for contributors.

Getting Started

Fork the repository

Clone your fork:

git clone https://github.com/YOUR_USERNAME/liteparse.git
cd liteparse

Install dependencies:
```
npm install
```
Build the project:
```
npm run build
```

What to Contribute?

We welcome a wide range of contributions, but we want to preserve the spirit of the project. We are primarily focused on:

Core algorithms for PDF parsing and text extraction
OCR integrations and improvements
New output formats, or modifications to existing ones

We are less interested in:

Markdown output
Any LLM integration or agent code
Anything that doesn't directly relate to improving the core parsing and extraction capabilities

While the project is in TypeScript today, I'm pretty open to porting it to Rust if someone wants to take that on as a contribution. The core algorithms and logic would stay the same, just implemented in Rust instead of TypeScript.

Development Workflow

Building

npm run build      # Build TypeScript
npm run dev        # Watch mode for development

Testing

npm test           # Run tests
npm run test:watch # Run tests in watch mode

Linting & Formatting

npm run lint       # Check for linting issues
npm run lint:fix   # Fix linting issues
npm run format     # Format code with Prettier

Testing Local Changes

You can test your changes locally:

# Parse a document
./dist/src/index.js parse document.pdf

# Generate screenshots
./dist/src/index.js screenshot document.pdf -o ./screenshots

Making Changes

Versioning & Changelogs

We use Changesets to manage versioning and changelogs. When you make a change to source code that should be released:

Run npm run changeset
Select the type of change (patch, minor, major)
Write a description of your changes
Commit the generated changeset file with your PR
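
For orientation, the file that Changesets generates is a small Markdown file placed under `.changeset/` with a randomly generated name. A sketch of what one might look like, assuming the package is published as `liteparse` (the package name and the description text here are placeholders, not taken from this repository):

```md
---
"liteparse": patch
---

Describe your change here in a sentence or two.
```

The front matter maps the package name to the bump type you selected, and the body becomes the changelog entry.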

Pull Requests

Fork and create a feature branch from main
Make your changes
Add a changeset if needed (npm run changeset)
Ensure all tests pass (npm test)
Ensure linting and formatting pass (npm run lint:fix and npm run format)
Submit a pull request
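
The test and lint steps above can be bundled into a small shell helper that you run before pushing. This is only a sketch: the function names `run` and `pre_pr_checks` and the `DRY_RUN` variable are hypothetical and not part of the repository. The npm scripts are the ones listed in this guide.

```shell
# Hypothetical helper: print each command, then run it unless DRY_RUN=1.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

# Run the pre-PR checks in order, stopping at the first failure.
pre_pr_checks() {
  run npm test &&
  run npm run lint:fix &&
  run npm run format
}
```

After defining the functions in your shell, call `pre_pr_checks` from the repository root. Setting `DRY_RUN=1` only prints the commands without running them.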

When you submit a PR, a number of CI/CD checks will run. Among these, your code will be tested against a regression suite of documents to ensure that your changes don't break existing parsing capabilities. It is at the maintainers' discretion to determine whether any changes to the regression set are expected/positive or unexpected/negative.

PR Guidelines

Keep PRs focused on a single change
Update documentation if needed
Add tests for new functionality
For parsing issues, include a test document if possible

Reporting Issues

Parsing Issues

If you're reporting a problem with document parsing:

You must attach the document or provide a way to reproduce the issue
Include the command you ran
Show the expected vs actual output
Include your LiteParse version (lit --version)

Issues without reproducible examples will be closed.

Bug Reports

For other bugs:

Describe what you expected vs what happened
Include steps to reproduce
Include error messages/stack traces
Include version information

Project Structure

See AGENTS.md for detailed documentation about the codebase structure and architecture.

Key directories:

src/core/ - Main orchestrator and configuration
src/engines/ - PDF and OCR engine implementations
src/processing/ - Text extraction and spatial analysis
src/output/ - Output formatters
cli/ - CLI implementation

Questions?

Open a Discussion for questions
Check existing issues before opening new ones
Read the README for usage documentation

License

By contributing, you agree that your contributions will be licensed under the Apache 2.0 License.
