API performance issues: slow search and transcriptions_exist timeout #35

@tacman

API Performance Issues

Testing the v2 API with curl shows significant performance problems:

1. Basic Search is Slow

Even simple searches take ~2 seconds:

$ time curl "https://catalog.archives.gov/api/v2/records/search?q=constitution&limit=1" \
  -H "x-api-key: YOUR_KEY" -o /dev/null

200 1.81s  # Takes ~2 seconds for single result

2. transcriptions_exist Parameter Causes Timeout

Adding transcriptions_exist=true to the same query makes the request fail with a 503/504 Gateway Time-out:

$ time curl "https://catalog.archives.gov/api/v2/records/search?q=constitution&limit=1&transcriptions_exist=true" \
  -H "x-api-key: YOUR_KEY" -o /dev/null -w "%{http_code}"

503 Gateway Time-out
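
To make the two measurements above easier to repeat, here is a minimal sketch that reports the status code and total time through curl's own `-w` write-out variables (`%{http_code}` and `%{time_total}`). The function names `build_search_url` and `measure`, the `NARA_API_KEY` environment variable, and the 30-second `--max-time` cap are assumptions for illustration; only the endpoint, the query parameters, and the `x-api-key` header come from the report.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for reproducing the timings in this report.
# Only the endpoint and the x-api-key header are taken from the issue.
BASE_URL="https://catalog.archives.gov/api/v2/records/search"

# Build a search URL from a query term, a limit, and optional extra parameters.
# Note: the query is not URL-encoded, so pass a simple term such as "constitution".
build_search_url() {
  local query="$1" limit="$2" extra="${3:-}"
  local url="${BASE_URL}?q=${query}&limit=${limit}"
  if [ -n "$extra" ]; then
    url="${url}&${extra}"
  fi
  printf '%s\n' "$url"
}

# Print "<http_code> <time_total>s" for one request.
# The response body is discarded; the request is abandoned after 30 seconds (assumed cap).
measure() {
  curl -s -o /dev/null --max-time 30 \
    -H "x-api-key: ${NARA_API_KEY:-YOUR_KEY}" \
    -w '%{http_code} %{time_total}s\n' "$1"
}
```

After sourcing the file, the two cases can be compared with `measure "$(build_search_url constitution 1)"` and `measure "$(build_search_url constitution 1 transcriptions_exist=true)"`. Using `-w` instead of the shell's `time` keeps the status code and the elapsed time on one line, which is easier to paste into a report.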

Questions

  1. Is transcriptions_exist the correct parameter for filtering records that have transcriptions?
  2. Are there alternative parameters for this filtering?
  3. Can search performance be improved?

Workaround

The /transcriptions/search endpoint works correctly and can be used instead to find records with transcriptions.


Tested with API key provided for development.
