
TimeLens

AI-powered cultural heritage companion
Brings museum artifacts to life through real-time conversation, image restoration, and interactive discovery.

English · 한국어

Next.js 15 React 19 TypeScript 5 Tailwind CSS 4 Gemini Live API Google ADK GenAI SDK Firebase Cloud Run

Built for the Gemini Live Agent Challenge.

UX Flow

Onboarding

  • Landing Page: select your language and start
  • Permission Setup: grant camera & microphone access
  • Overview: "Point your camera at an artifact, and a thousand-year story begins."

Session Start

  • Session Init: the Gemini Live session initializes with your museum context
  • AI Curator Greeting: the AI curator greets you with today's live exhibitions via Google Search Grounding

Live Experience

  • Artifact Recognition: recognize_artifact identifies the Winged Victory in real time
  • Live Conversation: voice conversation with historical narration
  • Restoration: AI-generated restoration showing the original Hellenistic appearance

Features

  • Live AI Curator — Real-time voice/video conversation powered by Gemini Live API. Point your camera at an artifact and ask questions naturally.
  • Artifact Recognition — Identifies artifacts through the camera and provides historical context, era, civilization, and fun facts.
  • Image Restoration — Generates historically accurate restorations of damaged artifacts using Gemini Flash.
  • Nearby Discovery — Find museums and cultural heritage sites near your location via Google Places API.
  • Visit Diary — Auto-generates illustrated diary entries summarizing your museum visit.
  • Museum-Aware Onboarding — Select your museum before starting; the AI greets you with context about current exhibitions.

Tech Stack

| Layer | Technology |
| --- | --- |
| Frontend | Next.js 15, React 19, TypeScript 5, Tailwind CSS 4 |
| AI | Gemini Live API, Google ADK, @google/genai |
| Database | Firebase Firestore, Firebase Auth |
| Maps | Google Places API (New), Geolocation API |
| Deploy | Docker, Cloud Run (Seoul), GitHub Actions CI/CD |

Getting Started

Prerequisites

  • Node.js 20+ and npm 10+
  • Google Chrome (recommended) — Microphone & camera permissions work best on Chrome
  • API keys (see Step 2 below)

Step 1: Clone & Install

git clone https://github.com/wigtn/wigtn-timelens.git
cd wigtn-timelens
npm install

Step 2: Prepare API Keys

Copy the template first:

cp .env.example .env.local

2-1. Gemini API Key (Required)

This is the only key you need to use the core features: voice conversation, artifact recognition, image restoration, and diary generation.

  1. Go to Google AI Studio
  2. Click "Create API Key"
  3. Copy the key into .env.local:
GOOGLE_GENAI_API_KEY=your_gemini_api_key_here

With just this key, you can start the app and use Live AI Curator, Artifact Recognition, Image Restoration, and Visit Diary.

2-2. Firebase Project (Optional — for session persistence)

Without Firebase, the app works fully but session history and diary sharing won't persist across page reloads.

  1. Go to the Firebase Console and create a project (or use an existing one)
  2. Enable Authentication → Sign-in method → Anonymous → Enable
  3. Enable Cloud Firestore → Create database → Start in test mode
  4. Go to Project Settings → General → scroll to "Your apps" → click Web (</>) → Register app
  5. Copy the config values into .env.local:
NEXT_PUBLIC_FIREBASE_API_KEY=your_firebase_api_key
NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN=your-project.firebaseapp.com
NEXT_PUBLIC_FIREBASE_PROJECT_ID=your-project-id

2-3. Google Maps & Places API Keys (Optional — for museum search & nearby discovery)

Without these keys, the "What's nearby?" discovery feature won't work, but all other features remain fully functional.

  1. Go to Google Cloud Console → APIs & Services → Credentials
  2. Create an API key (or use existing)
  3. Enable these APIs in APIs & Services → Library:
    • Maps JavaScript API (for museum map display)
    • Places API (New) (for nearby museum/heritage site search)
  4. Copy the keys into .env.local:
NEXT_PUBLIC_GOOGLE_MAPS_API_KEY=your_maps_api_key
GOOGLE_PLACES_API_KEY=your_places_api_key

Tip: You can use the same API key for both if both APIs are enabled on it.

Final .env.local checklist

# Gemini (Required — powers all AI features)
GOOGLE_GENAI_API_KEY=✅

# Firebase (Optional — session persistence & diary sharing)
NEXT_PUBLIC_FIREBASE_API_KEY=
NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN=
NEXT_PUBLIC_FIREBASE_PROJECT_ID=

# Maps & Places (Optional — museum search & nearby discovery)
NEXT_PUBLIC_GOOGLE_MAPS_API_KEY=
GOOGLE_PLACES_API_KEY=

# App URL (keep default for local dev)
NEXT_PUBLIC_APP_URL=http://localhost:3000
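
As a sanity check, the required/optional split above can be encoded in a small startup helper. This is an illustrative sketch, not TimeLens source; only the variable names come from the checklist.

```typescript
// Illustrative startup check mirroring the checklist above.
// Only GOOGLE_GENAI_API_KEY is required; missing optional keys
// simply disable their corresponding features.
const REQUIRED_KEYS = ["GOOGLE_GENAI_API_KEY"];
const OPTIONAL_KEYS = [
  "NEXT_PUBLIC_FIREBASE_API_KEY",
  "NEXT_PUBLIC_FIREBASE_AUTH_DOMAIN",
  "NEXT_PUBLIC_FIREBASE_PROJECT_ID",
  "NEXT_PUBLIC_GOOGLE_MAPS_API_KEY",
  "GOOGLE_PLACES_API_KEY",
];

function checkEnv(env: Record<string, string | undefined>): {
  missing: string[];  // required keys that are absent (fatal)
  disabled: string[]; // optional keys that are absent (feature degraded)
} {
  return {
    missing: REQUIRED_KEYS.filter((k) => !env[k]),
    disabled: OPTIONAL_KEYS.filter((k) => !env[k]),
  };
}
```

Running a check like checkEnv(process.env) at boot surfaces the "API key not configured" failure mode (see Troubleshooting) at startup rather than mid-session.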

Step 3: Run

npm run dev

Open http://localhost:3000 in Chrome.

Step 4: Use TimeLens

  1. Allow permissions — Grant microphone and camera access when prompted
  2. Select a museum — Pick one from the nearby list, search by name, or skip to start directly
  3. Start a session — The AI curator will greet you with context about current exhibitions
  4. Try these voice commands:
    • "이거 뭐야?" / "What is this?" — Point camera at an artifact
    • "원래 어떻게 생겼어?" / "Show me the original" — Restoration
    • "근처에 박물관 있어?" / "What's nearby?" — Discovery
    • "다이어리 만들어줘" / "Create my diary" — Visit diary

Troubleshooting

| Issue | Solution |
| --- | --- |
| Microphone not working | Check Chrome permissions (lock icon in address bar) |
| Camera black screen | Ensure no other app is using the camera |
| "API key not configured" | Verify GOOGLE_GENAI_API_KEY is set in .env.local, then restart npm run dev |
| Museum search returns empty | Places API keys are optional; check that Places API (New) is enabled if you added them |
| Firebase warnings in console | Firebase keys are optional; session data won't persist without them but the app works |

Scripts

npm run dev          # Dev server (Turbopack)
npm run build        # Production build
npm start            # Production server
npm run lint         # ESLint
npm run type-check   # TypeScript validation

Architecture

TimeLens Architecture

TimeLens runs on a dual-pipeline architecture:

  • Pipeline 1 — Live Streaming: A persistent WebSocket session with Gemini Live API (gemini-2.5-flash-native-audio). Microphone audio (PCM16, 16kHz) and camera frames (JPEG, 1fps) stream into the model simultaneously. The model responds with real-time voice output and triggers function calls when needed.
  • Pipeline 2 — REST On-Demand: Server-side API routes handle heavier tasks like image generation and external API calls. These are invoked by function calls from Pipeline 1.
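
For a concrete sense of Pipeline 1's audio format: browsers capture audio as Float32 samples in [-1, 1], while the Live API expects 16-bit PCM. A minimal conversion sketch (not the project's actual capture code) looks like this:

```typescript
// Convert Web Audio Float32 samples to 16-bit PCM.
// Illustrative sketch; the 16 kHz sample rate is assumed to be set
// at capture time, so only amplitude conversion is shown here.
function floatTo16BitPCM(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp to [-1, 1]
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;        // scale to int16 range
  }
  return out;
}
```

The resulting Int16Array buffer is what would be base64-encoded and streamed into the WebSocket session alongside the 1 fps JPEG frames.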

Function Calling Workflow

Function Calling Workflow

The Gemini Live Agent uses 4 function declarations to route user intent — no intent classifier needed. The model decides which tool to call based on conversation context:

| Tool | Trigger | Backend | Pipeline |
| --- | --- | --- | --- |
| recognize_artifact | Camera frame detected | Gemini Live API + Google Search Grounding | In-session (no REST call) |
| generate_restoration | "Show me the original" | POST /api/restore → Gemini 2.5 Flash Image | REST |
| discover_nearby | "What's nearby?" + GPS | GET /api/discover → Google Places API | REST |
| create_diary | "Make my diary" | POST /api/diary/generate → Gemini 3 Pro Image | REST |

recognize_artifact is the only tool that stays entirely within the Live session — camera frames are already streaming, so the model analyzes them directly with Google Search Grounding. The other three tools call REST endpoints.
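
The routing described above can be sketched as a small dispatcher. The tool names and route strings come from the table; the dispatcher shape itself is illustrative, not TimeLens source.

```typescript
// Illustrative router from Live API function calls to backend pipelines.
type ToolCall = { name: string; args: Record<string, unknown> };
type RestRoute = { method: "GET" | "POST"; path: string };

const REST_ROUTES: Record<string, RestRoute> = {
  generate_restoration: { method: "POST", path: "/api/restore" },
  discover_nearby:      { method: "GET",  path: "/api/discover" },
  create_diary:         { method: "POST", path: "/api/diary/generate" },
};

function routeToolCall(call: ToolCall): RestRoute | "in-session" {
  // recognize_artifact never leaves the Live session: frames are
  // already streaming, so the model analyzes them directly.
  if (call.name === "recognize_artifact") return "in-session";
  const route = REST_ROUTES[call.name];
  if (!route) throw new Error(`Unknown tool: ${call.name}`);
  return route;
}
```

Keeping the routing table as data makes adding a fifth tool a one-line change plus its function declaration.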

Google GenAI SDK + ADK

TimeLens is built with both @google/genai and @google/adk:

@google/genai (SDK): the primary path, powering the real-time Live experience. Always active (voice + camera streaming).

  • GoogleGenAI client for Live API sessions
  • Modality for audio/image streaming
  • Type + Schema for function declarations
  • Image generation (Gemini Flash, Gemini 3 Pro)

Spans 10 source files across src/web/, src/back/, src/shared/.

@google/adk (Agent Development Kit): the fallback path, providing text-based agent orchestration. Activates when the WebSocket is unavailable.

  • LlmAgent for 5 agents (orchestrator + 4 specialists)
  • FunctionTool for 3 tool integrations
  • InMemoryRunner for agent execution

Agent hierarchy:

timelens_orchestrator
├── curator_agent
├── restoration_agent  → generate_restoration_image
├── discovery_agent    → search_nearby_places
└── diary_agent        → generate_diary

Both paths share the same backend APIs — whether a user speaks or types, they get the same restoration, discovery, and diary capabilities. Run npx tsx scripts/adk-demo.ts to see the ADK agents in action.
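
The hierarchy above can be expressed as plain data for illustration. The real agents are @google/adk LlmAgent instances; only the agent and tool names are taken from the tree, and the lookup helper is hypothetical.

```typescript
// Illustrative data model of the ADK agent tree (not TimeLens source).
// curator_agent handles conversation directly, so it carries no tool.
const ORCHESTRATOR = {
  name: "timelens_orchestrator",
  subAgents: [
    { name: "curator_agent",     tool: null },
    { name: "restoration_agent", tool: "generate_restoration_image" },
    { name: "discovery_agent",   tool: "search_nearby_places" },
    { name: "diary_agent",       tool: "generate_diary" },
  ] as { name: string; tool: string | null }[],
};

// Look up which specialist owns a given tool.
function agentForTool(tool: string): string | undefined {
  return ORCHESTRATOR.subAgents.find((a) => a.tool === tool)?.name;
}
```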

REST API Routes

| Route | Purpose | Backend |
| --- | --- | --- |
| POST /api/session | Create session + ephemeral token | Gemini API |
| GET /api/museums/nearby | GPS-based museum search | Places API |
| GET /api/museums/search | Text search for museums | Places API |
| POST /api/restore | Generate artifact restoration | Gemini Flash |
| GET /api/discover | Find nearby heritage sites | Places API |
| POST /api/diary/generate | Generate visit diary | Gemini + Firestore |
| GET /api/diary/[id] | Retrieve diary | Firestore |
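
As an example of calling one of these routes from the client, a typed request builder might look like the following. The path comes from the table above; the request field names are assumptions for illustration, not the actual API contract.

```typescript
// Hypothetical typed helper for the restoration route.
// Field names (imageBase64, artifactName) are illustrative assumptions.
interface RestoreRequest {
  imageBase64: string;  // current camera frame, base64-encoded JPEG
  artifactName: string; // e.g. from a recognize_artifact result
}

function buildRestoreCall(req: RestoreRequest, baseUrl = "") {
  return {
    url: `${baseUrl}/api/restore`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(req),
    },
  };
}
```

The returned object is shaped for fetch(call.url, call.init), which keeps the route and payload construction testable without a running server.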

Project Structure

src/
  app/            # Next.js pages & API routes
  shared/         # Shared types, Gemini tools, configs
  web/            # Client components & hooks
  back/           # Server-side logic (agents, geo, Firebase)
    agents/       # ADK agents (orchestrator + 4 specialists)
    agents/tools/ # FunctionTool implementations
mobile/           # React Native + Expo app
scripts/          # ADK demo script
firebase/         # Firestore & Storage security rules
docs/             # PRDs, design docs
assets/           # Logo, architecture diagrams
.github/          # GitHub Actions CI/CD

Deployment

Deployed to Google Cloud Run (asia-northeast3, Seoul) via GitHub Actions.

# Manual build (optional)
docker build -t timelens .
docker run -p 8080:8080 timelens

Google Cloud Services

| Service | Purpose |
| --- | --- |
| Cloud Run | Production deployment (Seoul region) |
| Firebase Auth | Anonymous authentication |
| Cloud Firestore | Session, visit, and diary storage |
| Google Places API | Museum and heritage site search |

License

This project is licensed under the Apache License 2.0.

Built for the Gemini Live Agent Challenge hackathon.
