Commit 92d29b4

updated documentations
1 parent 740c854 commit 92d29b4

6 files changed

+511
-323
lines changed

ARCHITECTURE.md

Lines changed: 210 additions & 0 deletions
@@ -0,0 +1,210 @@
1+
# Nova Sonic Architecture
2+
3+
This document describes the architecture of the Nova Sonic speech-to-speech application.
4+
5+
## System Overview
6+
7+
Nova Sonic is a speech-to-speech conversation application that enables real-time voice interactions with an AI assistant powered by Amazon Bedrock. The application uses WebRTC for low-latency audio streaming, offers a restaurant booking capability, and delivers real-time events through the AWS AppSync Events API. As illustrated in Figure 1 below, the system consists of a frontend and a backend that work together with several AWS services.
8+
9+
![High-level Nova Sonic Architecture](generated-diagrams/nova-sonic-architecture-diagram.png)
10+
*Figure 1: High-level architecture of the Nova Sonic application showing the main components and their interactions.*
11+
12+
## Key Components
13+
14+
The following components are illustrated in the high-level architecture diagram (Figure 1) and interact as shown in the data flow diagram (Figure 2):
15+
16+
### Frontend Web Application
17+
18+
- **Technology**: React-based web application
19+
- **Deployment**: ECS Fargate service with Application Load Balancer
20+
- **Features**:
21+
  - WebRTC integration for audio streaming
22+
  - Real-time conversation display
23+
  - Restaurant booking interface
24+
  - AppSync Events API client for real-time updates
25+
26+
### Backend API Service
27+
28+
- **Technology**: Python-based API service
29+
- **Deployment Options**:
30+
  - ECS Fargate service (default)
31+
  - EC2 instances running Docker containers (alternative)
32+
- **Features**:
33+
  - Audio processing and streaming
34+
  - Integration with Amazon Bedrock
35+
  - DynamoDB interaction for data persistence
36+
  - Restaurant booking API integration
37+
  - AppSync Events API integration for real-time events
38+
39+
### AWS Services
40+
41+
#### Amazon Bedrock
42+
43+
- Powers the AI conversation capabilities
44+
- Provides the Amazon Nova Sonic model, used here for speech synthesis
45+
- Enables natural language understanding and generation
46+
47+
#### Amazon DynamoDB
48+
49+
- **Tables**:
50+
  - `NovaSonicConversations`: Stores conversation history
51+
  - `RestaurantBookings`: Stores restaurant booking information
52+
- **Features**:
53+
  - DynamoDB Streams for change data capture
54+
  - On-demand capacity for cost optimization
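
The two features above correspond to parameters of the DynamoDB `CreateTable` call. The following is a minimal sketch of what that request could look like when built for boto3's `create_table`. The partition key name `conversation_id` and the stream view type are assumptions made for illustration; the document does not state the actual key schema or stream settings.

```python
def build_table_definition(table_name: str, partition_key: str) -> dict:
    """Return keyword arguments for boto3's DynamoDB create_table call."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": "S"},
        ],
        "KeySchema": [{"AttributeName": partition_key, "KeyType": "HASH"}],
        # On-demand capacity: no provisioned read/write units to manage.
        "BillingMode": "PAY_PER_REQUEST",
        # Streams for change data capture; the view type is an assumption.
        "StreamSpecification": {
            "StreamEnabled": True,
            "StreamViewType": "NEW_AND_OLD_IMAGES",
        },
    }


# "conversation_id" is a hypothetical key name, not taken from the project.
conversations = build_table_definition("NovaSonicConversations", "conversation_id")

# With boto3 installed and credentials configured, the table could be created with:
#   boto3.client("dynamodb").create_table(**conversations)
```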
55+
56+
#### AWS AppSync Events API
57+
58+
- Provides real-time publish/subscribe functionality
59+
- Enables real-time updates for conversation and booking events
60+
- Integrates with DynamoDB Streams via Lambda function
61+
62+
#### Amazon Elastic Container Service (ECS)
63+
64+
- Manages container deployment and orchestration
65+
- Supports both Fargate and EC2 launch types
66+
- Auto-scaling based on CPU utilization
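
CPU-based auto-scaling for an ECS service is commonly expressed as a target tracking policy in Application Auto Scaling. The sketch below builds the arguments for boto3's `put_scaling_policy`. The cluster name, service name, and 70% target are hypothetical values chosen for illustration, not settings taken from this project.

```python
def build_cpu_scaling_policy(cluster: str, service: str, target_cpu_percent: float) -> dict:
    """Return keyword arguments for application-autoscaling put_scaling_policy."""
    return {
        "PolicyName": f"{service}-cpu-target-tracking",
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Add or remove tasks to keep average CPU near this percentage.
            "TargetValue": target_cpu_percent,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageCPUUtilization",
            },
        },
    }


# Hypothetical names and target value.
policy = build_cpu_scaling_policy("nova-sonic", "api", 70.0)

# The service must first be registered as a scalable target
# (register_scalable_target) before the policy is applied:
#   boto3.client("application-autoscaling").put_scaling_policy(**policy)
```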
67+
68+
#### Elastic Load Balancing
69+
70+
- **Application Load Balancer (ALB)**:
71+
  - Distributes HTTP/HTTPS traffic
72+
  - Supports WebSocket connections
73+
  - Enables HTTPS with AWS Certificate Manager
74+
- **Network Load Balancer (NLB)**:
75+
  - Handles WebRTC UDP traffic
76+
  - Enables low-latency audio streaming
77+
78+
#### Amazon Route 53
79+
80+
- DNS management for custom domains
81+
- Integration with AWS Certificate Manager for HTTPS
82+
83+
## Network Architecture
84+
85+
The network architecture, as depicted in Figure 3, includes the following components:
86+
87+
### VPC Configuration
88+
89+
- VPC with public and private subnets across 2 availability zones
90+
- NAT Gateway for outbound internet access from private subnets
91+
- Security groups for fine-grained access control
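
To make the subnet layout concrete, the sketch below carves one public and one private subnet per availability zone out of a VPC address range using Python's standard `ipaddress` module. The `10.0.0.0/16` range, the `/24` subnet size, and the zone labels are assumptions for illustration; the document does not specify the actual CIDR blocks.

```python
import ipaddress


def plan_subnets(vpc_cidr: str, az_names: list, new_prefix: int = 24) -> list:
    """Assign consecutive subnet blocks to a public and a private tier per AZ."""
    blocks = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    plan = []
    for tier in ("public", "private"):
        for az in az_names:
            plan.append({"tier": tier, "az": az, "cidr": str(next(blocks))})
    return plan


# Hypothetical VPC range and zone labels.
plan = plan_subnets("10.0.0.0/16", ["az-a", "az-b"])
for subnet in plan:
    print(subnet["tier"], subnet["az"], subnet["cidr"])
```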
92+
93+
### Security Groups
94+
95+
- **API Security Group**: Controls access to the API service
96+
- **API Load Balancer Security Group**: Controls access to the API load balancer
97+
- **WebRTC Load Balancer Security Group**: Controls access to the WebRTC load balancer
98+
- **Webapp Security Group**: Controls access to the web application
99+
100+
## Data Flow
101+
102+
The following diagram (Figure 2) illustrates how data flows through the Nova Sonic system during user interactions:
103+
104+
![Nova Sonic Data Flow Diagram](generated-diagrams/nova-sonic-data-flow-diagram.png)
105+
*Figure 2: Data flow diagram illustrating how information moves through the Nova Sonic system during user interactions.*
106+
107+
### Speech-to-Speech Conversation Flow
108+
109+
1. User speaks into the microphone on the web application
110+
2. Audio is streamed via WebRTC to the backend API
111+
3. API sends the audio to Amazon Transcribe for speech-to-text conversion
112+
4. Transcribed text is sent to Amazon Bedrock for processing
113+
5. Bedrock generates a response
114+
6. Response is sent to Amazon Nova Sonic for text-to-speech conversion
115+
7. Synthesized speech is streamed back to the user via WebRTC
116+
8. Conversation history is stored in DynamoDB
117+
9. DynamoDB Streams trigger Lambda function
118+
10. Lambda function publishes events to AppSync Events API
119+
11. Web application receives real-time updates via AppSync subscription
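
The ordering of steps 3 through 8 can be sketched as a single function that takes each stage as a callable. This is only an illustration of the sequence described above, not the project's actual code. All function and parameter names are hypothetical, and the stand-in lambdas replace the real service calls.

```python
from typing import Callable


def run_turn(
    audio_in: bytes,
    speech_to_text: Callable[[bytes], str],
    generate_reply: Callable[[str], str],
    text_to_speech: Callable[[str], bytes],
    save_turn: Callable[[str, str], None],
) -> bytes:
    """Process one conversational turn in the order given by the flow above."""
    user_text = speech_to_text(audio_in)     # step 3: speech-to-text
    reply_text = generate_reply(user_text)   # steps 4-5: generate a response
    audio_out = text_to_speech(reply_text)   # step 6: text-to-speech
    save_turn(user_text, reply_text)         # step 8: persist the history
    return audio_out                         # step 7: caller streams this back


# Stand-in stages so the sketch runs without any AWS access.
history = []
audio = run_turn(
    b"hello",
    speech_to_text=lambda data: data.decode(),
    generate_reply=lambda text: text.upper(),
    text_to_speech=lambda text: text.encode(),
    save_turn=lambda user, reply: history.append((user, reply)),
)
```

Passing the stages in as callables keeps the ordering testable on its own. Note that this sketch saves the turn before returning the audio, whereas the flow above lists storage after streaming; a real service would likely do the two concurrently.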
120+
121+
### Restaurant Booking Flow
122+
123+
1. User requests to book a restaurant through voice conversation
124+
2. Bedrock identifies the booking intent and extracts details
125+
3. API sends booking request to the Restaurant Booking API
126+
4. Booking confirmation is stored in DynamoDB
127+
5. DynamoDB Streams trigger Lambda function
128+
6. Lambda function publishes booking event to AppSync Events API
129+
7. Web application receives real-time booking confirmation via AppSync subscription
130+
131+
## Real-time Event System
132+
133+
The real-time event system uses the AppSync Events API to publish changes captured from the DynamoDB tables to subscribed clients. This enables real-time updates for conversation and booking events, as illustrated in the data flow diagram (Figure 2).
134+
135+
### Components
136+
137+
- **AppSync Events API**: AWS AppSync API configured as an EVENTS API
138+
- **DynamoDB Tables with Streams**: Tables with DynamoDB streams enabled
139+
- **Lambda Function**: Processes DynamoDB stream events and publishes to AppSync
140+
- **Client Integration**: Web clients subscribe to AppSync for real-time updates
141+
142+
### Channels
143+
144+
- **restaurant-booking**: Events related to restaurant bookings
145+
- **conversations**: Events related to conversation transcripts
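
As a rough sketch of the Lambda function's role, the code below turns a DynamoDB Streams event into per-channel lists of JSON strings ready for publishing. The record fields (`eventName`, `eventSourceARN`, `dynamodb.NewImage`) follow the standard DynamoDB Streams event format. The table-to-channel mapping, the payload shape, and the attribute names in the sample are assumptions. The publish call itself is omitted because its endpoint and authentication depend on the deployment.

```python
import json

# Assumed mapping from source table to the channels listed above.
TABLE_TO_CHANNEL = {
    "RestaurantBookings": "restaurant-booking",
    "NovaSonicConversations": "conversations",
}


def table_name_from_arn(stream_arn: str) -> str:
    """Extract <TableName> from arn:aws:dynamodb:...:table/<TableName>/stream/<label>."""
    return stream_arn.split(":table/", 1)[1].split("/", 1)[0]


def plain(value: dict):
    """Convert one DynamoDB-typed attribute value into a plain Python value."""
    (type_tag, raw), = value.items()
    if type_tag == "S" or type_tag == "BOOL":
        return raw
    if type_tag == "N":
        try:
            return int(raw)
        except ValueError:
            return float(raw)
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [plain(item) for item in raw]
    if type_tag == "M":
        return {key: plain(item) for key, item in raw.items()}
    raise ValueError(f"unsupported attribute type: {type_tag}")


def to_channel_events(event: dict) -> dict:
    """Group stream records by channel as lists of JSON-encoded payloads."""
    grouped = {}
    for record in event.get("Records", []):
        channel = TABLE_TO_CHANNEL.get(table_name_from_arn(record["eventSourceARN"]))
        if channel is None:
            continue  # record from a table with no channel
        image = record["dynamodb"].get("NewImage", {})
        payload = {
            "action": record["eventName"],
            "item": {key: plain(val) for key, val in image.items()},
        }
        grouped.setdefault(channel, []).append(json.dumps(payload))
    return grouped


# Sample record with hypothetical attribute names.
sample_event = {"Records": [{
    "eventName": "INSERT",
    "eventSourceARN": "arn:aws:dynamodb:us-east-1:123456789012:table/RestaurantBookings/stream/2024-01-01T00:00:00.000",
    "dynamodb": {"NewImage": {"booking_id": {"S": "b-1"}, "party_size": {"N": "4"}}},
}]}
grouped = to_channel_events(sample_event)
```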
146+
147+
## Deployment Options
148+
149+
Nova Sonic supports two deployment options: an ECS-based deployment (the default) and an EC2-based deployment (the alternative). The detailed architecture diagram below (Figure 3) illustrates the infrastructure components and how they interact in each scenario:
150+
151+
![Nova Sonic Detailed Architecture](generated-diagrams/nova-sonic-detailed-architecture-diagram.png)
152+
*Figure 3: Detailed architecture diagram showing the deployment options and infrastructure components of the Nova Sonic system.*
153+
154+
### ECS-based Deployment (Default)
155+
156+
```
157+
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
158+
│ Application │ │ ECS │ │ EC2 Instance │
159+
│ Load Balancer │────▶│ Service │────▶│ or Fargate │
160+
└───────────────┘ └───────────────┘ └───────────────┘
161+
162+
163+
┌───────────────┐
164+
│ DynamoDB │
165+
└───────────────┘
166+
```
167+
168+
- Managed container orchestration
169+
- Better integration with AWS ecosystem
170+
- More sophisticated deployment options
171+
- Built-in monitoring and logging
172+
173+
### EC2-based Deployment (Alternative)
174+
175+
```
176+
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
177+
│ Application │ │ Auto │ │ EC2 Instance │
178+
│ Load Balancer │────▶│ Scaling Group │────▶│ with Docker │
179+
└───────────────┘ └───────────────┘ └───────────────┘
180+
181+
182+
┌───────────────┐
183+
│ DynamoDB │
184+
└───────────────┘
185+
```
186+
187+
- Simpler architecture with fewer AWS services
188+
- Direct control over the Docker runtime
189+
- Potentially lower costs for certain workloads
190+
- Easier to debug and troubleshoot
191+
192+
## Security Considerations
193+
194+
As shown in the architecture diagrams (Figures 1 and 3), security is implemented at multiple layers of the Nova Sonic system:
195+
196+
- Security groups restrict traffic between components
197+
- IAM roles follow the principle of least privilege
198+
- Containers run in private subnets with outbound internet access through NAT Gateway
199+
- Load balancers are the only components exposed to the internet
200+
- HTTPS is configured for secure communication and to enable WebRTC functionality, since browsers allow microphone access only in secure contexts
201+
- HTTP to HTTPS redirection is implemented for enhanced security
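
On an Application Load Balancer, HTTP to HTTPS redirection is typically an HTTP listener on port 80 whose default action is a redirect. The sketch below builds the arguments for boto3's `elbv2` `create_listener` call. The load balancer ARN is a placeholder, and the permanent (301) status code is an assumption, since the document does not say which redirect type is used.

```python
def build_http_redirect_listener(load_balancer_arn: str) -> dict:
    """Return keyword arguments for elbv2 create_listener that redirect HTTP to HTTPS."""
    return {
        "LoadBalancerArn": load_balancer_arn,
        "Protocol": "HTTP",
        "Port": 80,
        "DefaultActions": [
            {
                "Type": "redirect",
                "RedirectConfig": {
                    "Protocol": "HTTPS",
                    "Port": "443",  # the API expects the port as a string here
                    "StatusCode": "HTTP_301",
                },
            }
        ],
    }


# Placeholder ARN; substitute the ARN of the real load balancer.
listener = build_http_redirect_listener("arn:aws:elasticloadbalancing:placeholder")

# With boto3 installed and credentials configured:
#   boto3.client("elbv2").create_listener(**listener)
```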
202+
203+
## Cost Optimization
204+
205+
As shown in the deployment architecture (Figure 3), several cost optimization strategies are implemented:
206+
207+
- Auto-scaling is configured to scale based on CPU utilization
208+
- NAT Gateway is shared across availability zones to reduce costs
209+
- DynamoDB is configured with on-demand capacity to optimize costs based on usage
210+
- An EC2-based deployment option is available, which may cost less in certain scenarios
