Soundscape AI is an application that generates immersive audio environments from uploaded images. It analyzes each image with Amazon Rekognition and Claude on Amazon Bedrock, then uses the ElevenLabs API to generate a soundscape that matches the scene's mood and content.
- /backend - AWS Lambda-based serverless backend
- /frontend - React-based web application
The backend uses AWS serverless technologies to process images and generate audio:
- API Gateway - Handles HTTP requests
- Lambda Functions - Process images and generate audio
- Step Functions - Orchestrates the processing workflow
- DynamoDB - Stores metadata
- S3 - Stores images and audio files
- Amazon Rekognition - Detects objects in images
- Amazon Bedrock (Claude) - Analyzes images and generates text descriptions
- ElevenLabs API - Generates audio based on prompts
- Serverless Architecture - Scalable, pay-per-use model
- Comprehensive Error Logging - Structured logging with detailed context
- Error Handling & Recovery - Graceful error handling with clear user feedback
- Monitoring Dashboard - CloudWatch dashboard for system health monitoring
- Step Function Workflow - Reliable, resumable processing pipeline
- AWS SAM - Infrastructure as code
- Python 3.9 - Lambda runtime
- Boto3 - AWS SDK
- Pillow - Image processing
- Structured Logging - JSON-formatted logs
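As a rough illustration of how the pieces listed above could fit together, here is a minimal in-process sketch in Python. It is not the project's actual code: in the deployed system Step Functions orchestrates separate Lambda functions, and the names `detect_labels`, `build_audio_prompt`, `run_pipeline`, `describe_image`, and `synthesize_audio` are hypothetical. The only real AWS call shown is the Rekognition `detect_labels` operation, invoked on a boto3 client passed in by the caller; the Bedrock and ElevenLabs steps are represented by injected callables.

```python
from typing import Any, Callable, List


def detect_labels(rekognition: Any, bucket: str, key: str,
                  min_confidence: float = 80.0) -> List[str]:
    """Return label names for an S3 image.

    `rekognition` is assumed to be a boto3 Rekognition client,
    e.g. boto3.client("rekognition").
    """
    response = rekognition.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=10,
        MinConfidence=min_confidence,
    )
    return [label["Name"] for label in response["Labels"]]


def build_audio_prompt(labels: List[str], description: str) -> str:
    """Combine detected labels and a text description into one audio prompt.

    The prompt wording here is an assumption, not the project's template.
    """
    subjects = ", ".join(labels) if labels else "an unspecified scene"
    return (f"Ambient soundscape featuring {subjects}. "
            f"Mood and setting: {description.strip()}")


def run_pipeline(rekognition: Any, bucket: str, key: str,
                 describe_image: Callable[[str, str], str],
                 synthesize_audio: Callable[[str], bytes]) -> bytes:
    """Run the steps in order: labels, description, prompt, audio.

    `describe_image` stands in for the Bedrock (Claude) step and
    `synthesize_audio` for the ElevenLabs step; both are hypothetical hooks.
    """
    labels = detect_labels(rekognition, bucket, key)
    description = describe_image(bucket, key)
    prompt = build_audio_prompt(labels, description)
    return synthesize_audio(prompt)
```

Passing the client and the two callables in as arguments keeps each step easy to test in isolation, which mirrors the way the workflow splits the steps into separate functions.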
The frontend provides an intuitive interface for users to:
- Upload images
- View AI-generated descriptions
- Play generated audio
- Track processing status
We've implemented a comprehensive error logging and monitoring system:
- Structured JSON Logging - Consistent, searchable log format
- Context Tracking - Maintains context across the processing pipeline
- Custom Error Classes - Standardized error categorization
- CloudWatch Integration - Metrics, dashboard, and alerts
- Health Monitoring - Enhanced health check endpoint
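The logging approach described above might look roughly like the following sketch, which uses only the Python standard library. The class and function names (`SoundscapeError`, `ImageProcessingError`, `JsonFormatter`, `get_logger`) and the JSON field names are assumptions made for illustration; the project's actual error classes and log schema may differ.

```python
import json
import logging
from typing import Any, Dict, Optional


class SoundscapeError(Exception):
    """Hypothetical base error carrying a category and extra context."""
    category = "internal"

    def __init__(self, message: str,
                 context: Optional[Dict[str, Any]] = None) -> None:
        super().__init__(message)
        self.context = context or {}


class ImageProcessingError(SoundscapeError):
    """Hypothetical error category for failures while handling an image."""
    category = "image_processing"


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        entry: Dict[str, Any] = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Context attached by the LoggerAdapter below (e.g. a request ID).
        entry.update(getattr(record, "context", {}))
        exc = record.exc_info[1] if record.exc_info else None
        if isinstance(exc, SoundscapeError):
            entry["error_category"] = exc.category
            entry["error_context"] = exc.context
        return json.dumps(entry, default=str)


def get_logger(name: str, **context: Any) -> logging.LoggerAdapter:
    """Return a logger that adds `context` to every record it emits."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logging.LoggerAdapter(logger, {"context": context})
```

A handler could then create `log = get_logger("soundscape", request_id="...")` once per invocation and call `log.exception("processing failed")` inside an `except SoundscapeError` block, so the request context and the error category end up in the same JSON log line.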
- AWS Account
- Node.js 16+ (for the frontend)
- Python 3.9+ (for local backend development)
- AWS SAM CLI
- ElevenLabs API key
cd backend
sam build
sam deploy --guided
cd frontend
npm install
npm start
This project is proprietary. It is not licensed for redistribution or for any use outside this project.
Claude 3.7 Sonnet, Cursor AI, Playboi Carti - OPM BABI
- HooHacks 2025 organizers
- AWS for providing the cloud infrastructure
- ElevenLabs for the audio generation API