samirk19/soundscapeai


Soundscape AI - HooHacks 2025

Samir Khattak & Emmanuel Kodamanchilly

Soundscape AI is an application that generates immersive audio environments based on uploaded images. Using AI, the system analyzes images to create tailored soundscapes that match the scene's mood and content.

Project Structure

  • /backend - AWS Lambda-based serverless backend
  • /frontend - React-based web application

Backend Architecture

The backend uses AWS serverless technologies to process images and generate audio:

  1. API Gateway - Handles HTTP requests
  2. Lambda Functions - Process images and generate audio
  3. Step Functions - Orchestrates the processing workflow
  4. DynamoDB - Stores metadata
  5. S3 - Stores images and audio files
  6. Amazon Rekognition - Detects objects in images
  7. Amazon Bedrock (Claude) - Analyzes images and generates text descriptions
  8. ElevenLabs API - Generates audio based on prompts
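The processing order above can be sketched as a Step Functions definition in Amazon States Language, built here as a plain Python dictionary. This is a minimal illustration, not the project's actual workflow: the state names, the Lambda ARNs, the account ID, and the retry settings are all assumptions.

```python
import json

# Placeholder ARN prefix; the account ID and region are made up for illustration.
ARN_PREFIX = "arn:aws:lambda:us-east-1:123456789012:function:"


def lambda_task(function_name, next_state=None):
    """Build a Task state that invokes one Lambda function.

    Failed tasks are retried with exponential backoff, and any error that
    survives the retries is routed to the hypothetical MarkFailed state.
    """
    state = {
        "Type": "Task",
        "Resource": ARN_PREFIX + function_name,
        "Retry": [
            {
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2.0,
            }
        ],
        "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "MarkFailed"}],
    }
    if next_state is None:
        state["End"] = True
    else:
        state["Next"] = next_state
    return state


def build_state_machine():
    """Chain the stages: Rekognition, Bedrock, ElevenLabs, then metadata."""
    return {
        "Comment": "Hypothetical image-to-soundscape pipeline",
        "StartAt": "DetectLabels",
        "States": {
            "DetectLabels": lambda_task("detect-labels", "DescribeScene"),
            "DescribeScene": lambda_task("describe-scene", "GenerateAudio"),
            "GenerateAudio": lambda_task("generate-audio", "SaveMetadata"),
            "SaveMetadata": lambda_task("save-metadata"),
            "MarkFailed": {
                "Type": "Fail",
                "Error": "ProcessingFailed",
                "Cause": "A pipeline stage failed after retries",
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_state_machine(), indent=2))
```

Keeping each stage in its own state lets Step Functions retry a single failed call, for example audio generation, without repeating the earlier image analysis.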

Backend Features

  • Serverless Architecture - Scalable, pay-per-use model
  • Comprehensive Error Logging - Structured logging with detailed context
  • Error Handling & Recovery - Graceful error handling with clear user feedback
  • Monitoring Dashboard - CloudWatch dashboard for system health monitoring
  • Step Function Workflow - Reliable, resumable processing pipeline

Backend Technologies

  • AWS SAM - Infrastructure as code
  • Python 3.9 - Lambda runtime
  • Boto3 - AWS SDK
  • Pillow - Image processing
  • Structured Logging - JSON-formatted logs
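As an illustration of JSON-formatted logs, the sketch below uses only Python's standard `logging` module. The `JsonFormatter` class, the `get_logger` helper, the logger name, and the `context` field are hypothetical names chosen for this example and may not match the project's own logging code.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Optional structured context passed via logger.info(..., extra={"context": {...}}).
        context = getattr(record, "context", None)
        if context:
            payload["context"] = context
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        # default=str keeps non-serializable values from breaking the log call.
        return json.dumps(payload, default=str)


def get_logger(name="soundscape"):
    """Return a logger that writes JSON lines to stderr (hypothetical helper)."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = get_logger()
    log.info("Image received", extra={"context": {"image_key": "uploads/example.jpg"}})
```

Because every record is one JSON object per line, fields such as `level` or `context.image_key` can be filtered directly in a log query tool instead of being parsed out of free text.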

Frontend

The frontend provides an intuitive interface for users to:

  1. Upload images
  2. View AI-generated descriptions
  3. Play generated audio
  4. Track processing status
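Status tracking usually comes down to polling the backend until the job reaches a terminal state. The sketch below shows that loop in Python, to match the backend examples; the React app would do the equivalent in JavaScript. The status values, the `fetch_status` callable, and the timing defaults are assumptions, since the actual API contract is not described here.

```python
import time

# Hypothetical terminal states; the real backend may use different names.
TERMINAL_STATUSES = {"COMPLETED", "FAILED"}


def poll_status(fetch_status, interval_seconds=2.0, timeout_seconds=120.0,
                sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until the job finishes or the timeout expires.

    fetch_status is any zero-argument callable returning a dict with a
    "status" key, for example a wrapper around an HTTP GET request.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_seconds
    while True:
        result = fetch_status()
        if result.get("status") in TERMINAL_STATUSES:
            return result
        if clock() >= deadline:
            raise TimeoutError("Processing did not finish in time")
        sleep(interval_seconds)
```

The caller supplies `fetch_status`, so the same loop works with any HTTP client and can be exercised with canned responses.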

Recent Improvements

Enhanced Error Logging System

We've implemented a comprehensive error logging and monitoring system:

  • Structured JSON Logging - Consistent, searchable log format
  • Context Tracking - Maintains context across the processing pipeline
  • Custom Error Classes - Standardized error categorization
  • CloudWatch Integration - Metrics, dashboard, and alerts
  • Health Monitoring - Enhanced health check endpoint
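To make the custom error classes and context tracking concrete, here is a minimal sketch of how a Lambda handler behind API Gateway might map categorized errors to HTTP responses. The class names, error codes, status codes, and the `image_key` request field are all hypothetical; only the proxy response shape (`statusCode`, `headers`, `body`) and the `aws_request_id` attribute of the Lambda context follow AWS conventions.

```python
import json


class SoundscapeError(Exception):
    """Base class for categorized errors (hypothetical names throughout)."""

    status_code = 500
    error_code = "INTERNAL_ERROR"

    def __init__(self, message, context=None):
        super().__init__(message)
        self.message = message
        self.context = dict(context or {})  # e.g. request ID, image key


class InvalidImageError(SoundscapeError):
    status_code = 400
    error_code = "INVALID_IMAGE"


class AudioGenerationError(SoundscapeError):
    status_code = 502
    error_code = "AUDIO_GENERATION_FAILED"


def json_response(status, payload):
    """Build an API Gateway Lambda proxy response."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def error_response(exc):
    """Known errors keep their code; anything else becomes a generic 500."""
    if isinstance(exc, SoundscapeError):
        return json_response(exc.status_code,
                             {"error": exc.error_code, "message": exc.message})
    return json_response(500, {"error": "INTERNAL_ERROR",
                               "message": "Unexpected error"})


def handler(event, context):
    """Example entry point: validate the request and report errors cleanly."""
    request_context = {"request_id": getattr(context, "aws_request_id", None)}
    try:
        try:
            body = json.loads(event.get("body") or "{}")
        except json.JSONDecodeError:
            raise InvalidImageError("Request body is not valid JSON", request_context)
        if "image_key" not in body:
            raise InvalidImageError("Missing image_key", request_context)
        return json_response(202, {"image_key": body["image_key"],
                                   "status": "PROCESSING"})
    except Exception as exc:
        return error_response(exc)
```

Carrying a `context` dictionary on the exception means the same request ID can be attached to the log entry and traced across pipeline stages, while the client only sees the error code and message.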

Getting Started

Prerequisites

  • AWS Account
  • Node.js 16+ for frontend
  • Python 3.9+ for local backend development
  • AWS SAM CLI
  • ElevenLabs API key

Backend Deployment

cd backend
sam build
sam deploy --guided

Frontend Development

cd frontend
npm install
npm start

License

This project is proprietary and not licensed for redistribution or use outside of this specific project.

Sources

Claude 3.7 Sonnet, Cursor AI, Playboi Carti - OPM BABI

Acknowledgments

  • HooHacks 2025 organizers
  • AWS for providing the cloud infrastructure
  • ElevenLabs for the audio generation API