Search engine for HIA.
Synopsis: a dockerized Python API to search HIA.
Based on LangChain, powered by language models. Uses Poetry for dependency management and Azure AI Search for indexing and searching.
Largely inspired by this and this project, kudos to the authors.
The /create-vector-store endpoint accepts a googleSheetId parameter, fetches all data from the Q&A sheet, and creates an index in Azure AI Search. If the index already exists, its content will be updated.
If the Q&A sheet is not publicly accessible, you can pass its content to the data parameter. The content must be a valid JSON object structured as this.
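A minimal sketch of how a client might call this endpoint, using only the Python standard library. The base URL (uvicorn's default local port), the POST method, and sending the parameters as a JSON body are all assumptions that this README does not confirm; check the docs for the actual request shape. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: the API is running locally on uvicorn's default port.
BASE_URL = "http://localhost:8000"


def build_create_vector_store_request(google_sheet_id, data=None):
    """Build (but do not send) a request for /create-vector-store.

    Assumptions: the endpoint accepts POST with a JSON body. `data` is
    only included when the sheet content is passed explicitly, e.g.
    because the sheet is not publicly accessible.
    """
    payload = {"googleSheetId": google_sheet_id}
    if data is not None:
        payload["data"] = data
    return urllib.request.Request(
        url=f"{BASE_URL}/create-vector-store",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a prepared request and return (status code, response text)."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")
```

With the API running, calling send(build_create_vector_store_request("your-sheet-id")) would trigger indexing; "your-sheet-id" is a placeholder, not a real sheet.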
The /search endpoint accepts four parameters:
- query: the search query
- googleSheetId: the Google Sheet ID (must already be indexed via /create-vector-store)
- lang: the language of the search query; results will be translated to this language
- k: the number of results to return

It returns a list of relevant questions and answers, in this format:
[
  {
    "categoryID": 8,
    "subcategoryID": 38,
    "slug": "disabilities",
    "question": "Where can I go when I have special care needs?",
    "answer": "To receive specialist care, you often first need a referral from your General Practitioner (GP).",
    "score": 0.3011241257190705,
    "children": [
      {
        "categoryID": 8,
        "subcategoryID": 38,
        "question": "Social Security Act (WMO)",
        "answer": "Support for persons with disabilities is provided through the Social Security Act (WMO).",
        "score": 0
      }
    ]
  }
]
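As a rough illustration of working with /search, the sketch below builds a request URL and walks a response of the shape shown above, including the nested children. Passing the parameters in the query string and the local base URL are assumptions, and both function names are hypothetical; only the field names come from the example response.

```python
import json
from urllib.parse import urlencode

# Assumption: the API is running locally on uvicorn's default port.
BASE_URL = "http://localhost:8000"


def build_search_url(query, google_sheet_id, lang, k):
    """Build a /search URL (query-string parameters are an assumption)."""
    params = {
        "query": query,
        "googleSheetId": google_sheet_id,
        "lang": lang,
        "k": k,
    }
    return f"{BASE_URL}/search?{urlencode(params)}"


def flatten_results(results):
    """Yield (question, answer, score) for each result, then its children."""
    for item in results:
        yield item["question"], item["answer"], item["score"]
        # "children" may be absent on some entries, so default to empty.
        for child in item.get("children", []):
            yield child["question"], child["answer"], child["score"]


# A trimmed copy of the example response above.
sample = json.loads("""
[
  {
    "question": "Where can I go when I have special care needs?",
    "answer": "To receive specialist care, you often first need a referral from your General Practitioner (GP).",
    "score": 0.3011241257190705,
    "children": [
      {
        "question": "Social Security Act (WMO)",
        "answer": "Support for persons with disabilities is provided through the Social Security Act (WMO).",
        "score": 0
      }
    ]
  }
]
""")

for question, answer, score in flatten_results(sample):
    print(f"{score:.3f}  {question}")
```

Flattening keeps each child directly after its parent, which preserves the grouping the API returns while making the list easy to render.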
For the rest, see the docs.
cp example.env .env
Then edit the environment variables in .env as needed.
pip install poetry
poetry install --no-root
uvicorn main:app --reload
docker build -t hia-search .