A Rails application for navigating PDF accessibility audits. We use traditional NLP and LLM processes to prioritize and stratify documents, guiding stakeholders through corrective action decision-making. For additional documentation, see the docs folder.
Features:
- Scrape websites for PDF documents and harvest metadata
- Use NLP to classify documents by category
- Dashboards and audit workflow to help navigate decision-making
- LLM-powered tools to summarize and perform policy analysis on documents
The feature set is an approach to organize your audit for ADA compliance - the features do not remediate documents or identify which parts of documents are inaccessible.
Fill out this Google Form to connect with Code for America and learn more.
ASAP has two major architectural realms, a Ruby on Rails powered user interface and Python components, which provide many of the document processing and AI features. The Rails-based UI is not required. The Python components may each be used as standalone services.
- Frontend: Hotwired (Turbo + Stimulus) with Tailwind CSS
- Backend: Ruby on Rails 8.0
- Database: PostgreSQL
- Testing: RSpec and Capybara
The application includes several Python components for PDF processing:
- Site Crawler: Downloads PDF files and metadata from government websites
- Document Classifier: Determines document types using machine learning
- Document Inference: Generates LLM summaries and performs exception checks
- Evaluation: Automated LLM evaluation suite
For detailed information about the Python components, see the Python Components documentation. Architectural diagrams are available in the Architecture documentation.
The following instructions are intended for local development on macOS or Linux environments. To run the app locally on Windows follow the Windows setup guide instead. The staging and production documentation details running the app on Amazon Web Services (AWS), however any cloud hosting platform should be possible.
Before you begin, ensure you have the following installed:
- Ruby 3.3.4 (we recommend using a version manager like
rbenv
orrvm
) - Node.js 24.4.1 (we recommend using
nvm
for version management) - Yarn (latest version)
- PostgreSQL locally or in a container
- Docker and Docker Compose (for LocalStack AWS services in development)
- Optional, to use the LLM features, API credentials for Google, Anthropic or OpenAI
To run all the App's features, you will need to run the Rails application and Python components in separate terminal processes. For simplicity’s sake, we run the Rails application locally (on the host machine), while the Python components run in a Docker Compose (see docker-compose.yml). For the LLM features to function correctly, credentials must be added to the Rail UI while the Python components are running. This simulates the production environment most directly.
-
Clone the repository:
git clone https://github.com/codeforamerica/asap_pdf.git cd asap_pdf
-
Install Ruby dependencies:
bundle install
-
Install JavaScript dependencies:
yarn install
-
Set up the database:
rails db:migrate ; rails db:setup
-
Test running the app:
bin/dev
-
A default admin user was created during database setup. See seeds.rb for details. To add your own admin user, run the following rake task:
rake users:create_admin"[<your email>,<your password>]"
-
Add admin credentials for API usage by Python components. For local development, add admin credentials to the development configuration as
api_user
andapi_password
.`EDITOR="Your editor" rails credentials:edit --environment development`
The final configuration should look something like:
api_user: <your email> api_password: <your password>
- From the project root, build images
docker compose build
- Start the containers
docker compose up
- In a separate terminal process, run the Rails app.
bin/dev
- Log into the app, visit
/configuration/edit
. Enter just LLM API credentials you'd like to use and save the form. The credentials are ephemerally stored in the LocalStack clone of AWS SecretsManager. If you restart the Docker containers, the API credentials must be reentered.
When the database is set up (bin/rails db:setup
), it is populated with some sample sites, sandbox documents and an admin user.
- Admin users may be created via the admin user rake task:
bin/rake users:create_admin"[<email>, <password>]"
- To create non-admin users, log into the app with an admin user and use the administrative user interface at
/admin/users
.
- After logging in, sites may be added by "Site Admin" users through the UI on the
/sites
page.
- Documents may be imported via the document import rake task:
bin/rake documents:import_documents"[<site id>, <path to csv>, <archive bool*>]"
- Sample csv documents may be found in db/seeds. The format should match the output of the crawl and classification processes. Check out the documentation in the python_components directory for more details.
*Set to true to import a csv inside a zip archive.
Users, Sites and Documents may be added manually via the Rails console as well.
This project's code is licensed under the Apache 2.0 License. All assets, such as images provided by Code for America are licensed under the CC-BY-4.0 license.