Architecture Deep Dive

Architecture

Description

The user sends a request to the application hosted on AWS Amplify.
Amplify integrates with the backend API Gateway.
Instructors can upload patient data to the application, which is stored in an S3 bucket using a pre-signed upload URL.
Adding a new file to the S3 bucket triggers the data ingestion workflow. The Lambda function runs a Docker container with Amazon Elastic Container Registry (ECR) and uses LangChain to process documents.
The Lambda function embeds the text from uploaded files into vectors using Amazon Bedrock. This project uses the Amazon Titan Text Embeddings V2 model to generate embeddings.
The Lambda function stores the vectors in the PostgreSQL database.
Users can perform simulation group management/access actions by sending an API request which invokes a Lambda function.
This Lambda function interacts with Amazon RDS.
Users can start chatting with the LLM by sending an API request that invokes the Lambda function to generate a response. The Lambda function runs a Docker container with Amazon ECR.
The Lambda function stores the embedded messages in Amazon DynamoDB
This Lambda function uses RAG architecture and LangChain to retrieve the response from LLMs hosted on Amazon Bedrock augmented with the patient's information stored in the Amazon RDS. This project uses Meta Llama 3 70 B as its selected LLM.
The LLM responses are streamed as chunks to the frontend using AppSync as a websocket implementation, so that the frontend may receive realtime responses as they are generated rather than receiving all of the generated text at once.
An ECS server invokes Amazon Nova Sonic through bedrock and delivers chunks of audio to and from the backend via CloudFront and a websocket connection.

Database Schema

RDS Langchain Tables

`langchain_pg_collection` table

Column Name	Description
`uuid`	The uuid of the collection
`name`	The name of the collection
`cmetadata`	The metadata of the collection

`langchain_pg_embedding` table

Column Name	Description
`id`	The ID of the embeddings
`collection_id`	The uuid of the collection
`embedding`	The vector embeddings of the document
`cmetadata`	The metadata of the collection
`document`	The content of the document

RDS PostgreSQL Tables

`users` table

Column Name	Description
`user_id`	The ID of the user
`user_email`	The email of the user
`username`	The username of the user
`first_name`	The first name of the user
`last_name`	The last name of the user
`preferred_name`	The preferred name of the user
`time_account_created`	The time the account was created in UTC
`roles`	The roles of the user
`last_sign_in`	The time the user last signed in in UTC

`simulation_groups` table

Column Name	Description
`simulation_group_id`	The ID of the simulation group
`group_name`	The name of the simulation group
`group_description`	The description of the simulation group
`group_access_code`	The access code for students to join the group
`group_student_access`	Whether or not students can access the group
`system_prompt`	The system prompt for the group

`enrolments` table

Column Name	Description
`enrolment_id`	The ID of the enrolment
`user_id`	The ID of the enrolled user
`simulation_group_id`	The ID of the associated simulation group
`enrolment_type`	The type of enrolment (e.g., student, instructor, admin)
`group_completion_percentage`	The percentage of the group completed (currently unused)
`time_enroled`	The timestamp when the enrolment occurred

`patients` table

Column Name	Description
`patient_id`	The ID of the patient
`simulation_group_id`	The ID of the associated simulation group
`patient_name`	The name of the patient
`patient_age`	The age of the patient
`patient_gender`	The gender of the patient
`patient_number`	The number of the patient
`patient_prompt`	The prompt that reveals more about patient
`llm_completion`	The name of the patient

`patient_data` table

Column Name	Description
`file_id`	The ID of the file
`patient_id`	The ID of the associated patient
`filetype`	The type of the file (e.g., pdf, docx, etc.)
`s3_bucket_reference`	The reference to the S3 bucket
`filepath`	The path to the file in the S3 bucket
`filename`	The name of the file
`time_uploaded`	The timestamp when the file was uploaded
`metadata`	Additional metadata about the file
`file_number`	Number of the file to keep track of order

`student_interactions` table

Column Name	Description
`student_interaction_id`	The ID of the student interaction
`patient_id`	The ID of the associated patient
`enrolment_id`	The ID of the related enrolment
`patient_score`	Score calculated by the LLM for the student interacting with the patient
`last_accessed`	The timestamp of the last time the patient was accessed
`patient_context_embedding`	A float array representing the patient context embedding
`is_completed`	A Boolean representing if the instructor has marked this student's interaction as complete

`sessions` table

Column Name	Description
`session_id`	The ID of the session
`student_interaction_id`	The ID of the associated student interaction
`session_name`	The name of the session
`session_context_embeddings`	A float array representing the session context embeddings (currently unused)
`last_accessed`	The timestamp of the last time the session was accessed
`notes`	The notes a student can take per session when talking to a patient

`messages` table

Column Name	Description
`message_id`	The ID of the message
`session_id`	The ID of the associated session
`student_sent`	Whether the message was sent by the student (boolean)
`message_content`	The content of the message (currently unused)
`time_sent`	The timestamp when the message was sent

`user_engagement_log` table

Column Name	Description
`log_id`	The ID of the engagement log entry
`user_id`	The ID of the user
`simulation_group_id`	The ID of the associated simulation group
`patient_id`	The ID of the associated patient
`enrolment_id`	The ID of the related enrolment
`timestamp`	The timestamp of the engagement event
`engagement_type`	The type of engagement (e.g., patient access)
`engagement_details`	The text describing the engagement

S3 Bucket Structure

.
├── {simulation_group_id}
    └── {patient_id}
        └── documents
            ├── document_1.pdf
            └── document_2.pdf
        └── info
            ├── info_1.pdf
            └── info_2.pdf
        └── answer_key
            ├── answer_key_1.pdf
        └── profile_pic
            ├── {patient_id}_profile_pic.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Architecture Deep Dive

Architecture

Description

Database Schema

RDS Langchain Tables

`langchain_pg_collection` table

`langchain_pg_embedding` table

RDS PostgreSQL Tables

`users` table

`simulation_groups` table

`enrolments` table

`patients` table

`patient_data` table

`student_interactions` table

`sessions` table

`messages` table

`user_engagement_log` table

S3 Bucket Structure

FilesExpand file tree

architectureDeepDive.md

Latest commit

History

architectureDeepDive.md

File metadata and controls

Architecture Deep Dive

Architecture

Description

Database Schema

RDS Langchain Tables

langchain_pg_collection table

langchain_pg_embedding table

RDS PostgreSQL Tables

users table

simulation_groups table

enrolments table

patients table

patient_data table

student_interactions table

sessions table

messages table

user_engagement_log table

S3 Bucket Structure

`langchain_pg_collection` table

`langchain_pg_embedding` table

`users` table

`simulation_groups` table

`enrolments` table

`patients` table

`patient_data` table

`student_interactions` table

`sessions` table

`messages` table

`user_engagement_log` table