minhnv0807/Auto-Create-Video

🎬 Auto News Video

Auto-generate Vietnamese 9:16 short news videos (~60s) for TikTok / YouTube Shorts / Instagram Reels from a news URL or .txt file.


🇬🇧 English

Introduction

Auto News Video is an open-source project that transforms any Vietnamese tech news article into a professional motion-graphic short video with a single command in Claude Code.

The pipeline automates the following steps:

  1. Reads the article URL (or .txt file) and analyzes the content
  2. Generates a JSON script, picking from 6 visual template types (hook, comparison, stat-hero, feature-list, callout, outro) according to the article's content
  3. Synthesizes Vietnamese voice via LucyLab or ElevenLabs
  4. Renders MP4 video using HyperFrames (Puppeteer + GSAP + FFmpeg) with studio shell style and modern animation
  5. Exports script.txt and voice.mp3 alongside, ready to import into CapCut Pro for captions / BGM

🎯 Why this project?

  • ✅ HeyGen-quality look: persistent brand shell (icon, channel name, handle), grain texture, navy gradient with cyan + purple accents
  • ✅ 6 scene template types auto-picked by content, so videos don't all look alike
  • ✅ Multi-provider TTS: LucyLab (natural Vietnamese + free SRT) or ElevenLabs (multilingual, large voice library)
  • ✅ Claude Code skill integration: just type /create-news-video <url> and you're done
  • ✅ Extensible: clean schema, modular code, unit test suite

🛠️ Tech Stack & Libraries

| Layer | Technology |
| --- | --- |
| Runtime | Node.js ≥ 22, TypeScript 5+, ESM |
| Render engine | HyperFrames (Puppeteer + GSAP + FFmpeg) |
| TTS providers | LucyLab.io (JSON-RPC, Vietnamese cloning) or ElevenLabs (REST, multilingual) |
| Validation | Zod (discriminated union schema) |
| HTTP | axios + nock (mocking) |
| Testing | Vitest |
| Audio | FFmpeg + ffprobe (mix, concat with silence) |
| AI/Skill | Claude Code skill (/create-news-video) |
| Visual blocks | HyperFrames registry: grain-overlay, shimmer-sweep, tiktok-follow |
| Fonts | Inter + Anton (Google Fonts) |

🔬 Deep-dive on key technologies

🎞️ HyperFrames: heart of the render engine

HyperFrames is an open-source HTML-to-video framework by HeyGen. Unlike traditional editors (After Effects, Premiere), HyperFrames lets you author videos with HTML/CSS/JS and then render them to high-quality MP4 deterministically (same input → identical output, frame by frame).

How it works in this project:

  1. Pipeline generates an index.html containing all scenes + GSAP timeline
  2. HyperFrames spawns headless Chrome (Puppeteer) to load it
  3. Captures each frame at the precise timestamp (30 fps × 60 s = 1800 frames)
  4. Encodes all frames + audio into MP4 via FFmpeg
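
The frame arithmetic in step 3 can be sketched as follows. This is only an illustration of how capture timestamps relate to frame rate and duration; `frameTimestamps` is a hypothetical helper, not part of HyperFrames or this repository.

```typescript
// Hypothetical helper: lists the timestamps (in seconds) at which a
// frame-by-frame renderer would capture the composition.
function frameTimestamps(fps: number, durationSec: number): number[] {
  if (!Number.isInteger(fps) || fps <= 0) {
    throw new RangeError("fps must be a positive integer");
  }
  if (!Number.isFinite(durationSec) || durationSec < 0) {
    throw new RangeError("durationSec must be a non-negative number");
  }
  const frameCount = Math.round(fps * durationSec);
  // Derive each timestamp from the frame index instead of accumulating
  // 1 / fps, so rounding error does not drift over a long video.
  return Array.from({ length: frameCount }, (_, i) => i / fps);
}

const stamps = frameTimestamps(30, 60);
console.log(stamps.length); // 1800
console.log(stamps[30]);    // 1 (the frame captured at exactly one second)
```

Computing every timestamp from its index is one simple way to keep a render repeatable: the same fps and duration always yield the same list.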

Why HyperFrames?

  • ✅ 50+ pre-built blocks in the registry (transitions, social cards, data viz, kinetic typography...), installable via npx hyperframes add <name>
  • ✅ GSAP timeline built in for smooth animations
  • ✅ AI-agent friendly: Claude/GPT can author compositions in HTML
  • ✅ Built-in lint (npx hyperframes lint) catches composition errors before render
  • ✅ Native 9:16: designed for short-form video

Blocks/components used in this project:

  • grain-overlay: film grain texture throughout the video (an analog-warmth feel)
  • shimmer-sweep: light-pass animation for headline text
  • tiktok-follow: outro CTA card (already 1080×1920)

🎤 LucyLab vs ElevenLabs: which to choose?

| Criteria | LucyLab | ElevenLabs |
| --- | --- | --- |
| Vietnamese voice | ⭐⭐⭐⭐⭐ Natural (voice cloning) | ⭐⭐⭐⭐ Good (multilingual) |
| Cost | Cheap (~$1 / 1M chars) | Pricier (~$5 / 30k chars) |
| Voice library | Self-cloned voices | 1000+ ready-made voices |
| API style | JSON-RPC, async (poll) | REST, sync (instant) |
| SRT subtitle | ✅ Free, included in response | ❌ Not provided |
| Concurrency | 1 export per account | Parallel OK |
| Other languages | ❌ Vietnamese only | ✅ 30+ languages |

Recommendation:

  • 🇻🇳 Vietnamese-only videos → use LucyLab (cheap, natural-sounding, SRT included)
  • 🌍 Multilingual, or need a large voice library → use ElevenLabs
  • 🔄 Not sure → start with LucyLab and switch later (just change TTS_PROVIDER in .env.local)

🛡️ Zod: type-safe schema validation

Zod is a TypeScript-first schema library. In this project, Zod validates that script.json (generated by Claude) has the correct structure before the pipeline runs.

// Discriminated union: 6 template types, each with different data shape
const TemplateData = z.discriminatedUnion("template", [
  HookData, ComparisonData, StatHeroData, FeatureListData, CalloutData, OutroData
]);

Benefits:

  • Catches a wrong script from Claude immediately (e.g. template: "stat", which doesn't exist): Step 1 fails with a clear error
  • TypeScript types are inferred from the Zod schema, so the composer doesn't need to redeclare them
  • Schema = single source of truth for both runtime validation and compile-time types
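
To make the first benefit concrete, here is a dependency-free sketch of what validating a discriminated union means. It deliberately does not use Zod, so it runs on its own. It covers only two of the six templates, and the field names (`headline`, `value`, `label`) are assumptions, not the project's real schema.

```typescript
// Two of the six templates, with made-up fields (assumptions for illustration).
type HookScene = { template: "hook"; headline: string };
type StatHeroScene = { template: "stat-hero"; value: string; label: string };
type Scene = HookScene | StatHeroScene;

type ParseResult = { ok: true; scene: Scene } | { ok: false; error: string };

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// The "template" field is the discriminator: it decides which shape the
// remaining fields must have.
function parseScene(input: unknown): ParseResult {
  if (!isRecord(input)) return { ok: false, error: "scene must be an object" };
  const template = input["template"];
  if (template === "hook") {
    const headline = input["headline"];
    if (typeof headline !== "string") {
      return { ok: false, error: "hook: headline must be a string" };
    }
    return { ok: true, scene: { template: "hook", headline } };
  }
  if (template === "stat-hero") {
    const value = input["value"];
    const label = input["label"];
    if (typeof value !== "string" || typeof label !== "string") {
      return { ok: false, error: "stat-hero: value and label must be strings" };
    }
    return { ok: true, scene: { template: "stat-hero", value, label } };
  }
  return { ok: false, error: `unknown template: ${JSON.stringify(template)}` };
}

// A script that uses the non-existent template "stat" is rejected up front.
console.log(parseScene({ template: "stat", value: "1M" }));
// { ok: false, error: 'unknown template: "stat"' }
```

With Zod, `z.discriminatedUnion` generates this kind of dispatch from the schema and infers the TypeScript types, which is why the project does not need hand-written checks like these.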

⚙️ Claude Code Skill: AI interface

The project integrates with Claude Code via a skill markdown at .claude/skills/create-news-video/SKILL.md. This skill instructs Claude to:

  1. WebFetch the article URL
  2. Analyze Vietnamese content
  3. Pick the right template per scene (comparison if "vs", stat-hero if numbers...)
  4. Generate script.json matching schema
  5. Run pipeline via Bash
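
In the project, Claude makes the template choice in step 3 from the skill's instructions, not from code. Purely to illustrate the kind of rule being described, a rough deterministic approximation might look like the sketch below. The function name, the keyword list, and the fallback to callout are all assumptions.

```typescript
type Template =
  | "hook" | "comparison" | "stat-hero" | "feature-list" | "callout" | "outro";

// Hypothetical heuristic that mirrors the hints in the list above.
// It does not detect feature lists and is not the skill's actual logic.
function suggestTemplate(text: string, index: number, total: number): Template {
  if (index === 0) return "hook";          // first scene
  if (index === total - 1) return "outro"; // last scene
  if (/\bvs\.?(\s|$)|compared to|exceeds/i.test(text)) return "comparison";
  if (/\d/.test(text)) return "stat-hero"; // any figure counts as a stat here
  return "callout";
}

console.log(suggestTemplate("GPT 5.4 vs GPT 5.5", 1, 5));                // comparison
console.log(suggestTemplate("Context window reaches 1M tokens", 2, 5)); // stat-hero
```

A language model can weigh context that keyword rules like these miss, which is presumably why the project leaves this step to the skill.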

Benefit: just type /create-news-video <url> and Claude handles all the "creative" work (writing the Vietnamese script, picking templates, crafting catchy hooks). The "deterministic" parts (API calls, rendering) are handled by the Node CLI.

🧪 Vitest: modern testing framework

The suite has 35 unit tests, run with Vitest (an ESM-native Jest replacement):

  • Schema validation tests with fixtures (valid + invalid scripts)
  • TTS client tests with nock HTTP mocking (no real API calls, no quota wasted)
  • Audio tools tests with fixture mp3 files (440Hz/2s, 880Hz/3s sine waves)
  • HTML composer snapshot test: ensures the output HTML doesn't break on refactor

Run npm test to verify everything works before pushing.

🎬 FFmpeg: audio/video backbone

FFmpeg + ffprobe are used to:

  • ffprobe: measure mp3 duration per scene (to compute timing in composition)
  • ffmpeg: concat scene mp3s with a 0.3s silence gap → final voice.mp3
  • ffmpeg (via HyperFrames): encode 1800 PNG frames + audio into MP4
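
The timing step can be sketched as plain arithmetic over the per-scene durations that ffprobe reports. The sketch below assumes the 0.3 s silence is inserted between scenes only (not after the last one) and works in integer milliseconds. All names are hypothetical, and the [48 s, 72 s] window is the one listed in the Troubleshooting table.

```typescript
// Assumption: the 0.3 s silence sits between scenes only, not after the last one.
const GAP_MS = 300;
// Accepted total length, taken from the Troubleshooting table: [48 s, 72 s].
const MIN_TOTAL_MS = 48_000;
const MAX_TOTAL_MS = 72_000;

interface SceneAudio { id: string; durationSec: number }
interface SceneTiming { id: string; startMs: number; endMs: number }

// Places scenes back to back with a fixed gap and returns each scene's
// start/end offset plus the total length of the concatenated voice track.
function layoutScenes(scenes: SceneAudio[]): { timings: SceneTiming[]; totalMs: number } {
  const timings: SceneTiming[] = [];
  let cursorMs = 0;
  scenes.forEach((scene, index) => {
    if (index > 0) cursorMs += GAP_MS;
    const startMs = cursorMs;
    cursorMs += Math.round(scene.durationSec * 1000);
    timings.push({ id: scene.id, startMs, endMs: cursorMs });
  });
  return { timings, totalMs: cursorMs };
}

function isTotalInRange(totalMs: number): boolean {
  return totalMs >= MIN_TOTAL_MS && totalMs <= MAX_TOTAL_MS;
}

const { timings, totalMs } = layoutScenes([
  { id: "scene-hook", durationSec: 4 },
  { id: "scene-body-1", durationSec: 10 },
]);
console.log(timings[1].startMs);               // 4300
console.log(totalMs, isTotalInRange(totalMs)); // 14300 false
```

Working in whole milliseconds avoids accumulating floating-point error when many scene durations are summed.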

Must be in PATH (ffmpeg -version). On Windows: winget install Gyan.FFmpeg.

📋 Prerequisites

| Item | Version | Notes |
| --- | --- | --- |
| Node.js | ≥ 22 | node --version |
| FFmpeg + ffprobe | any modern version | in PATH (ffmpeg -version) |
| Chrome / Chromium | any | required by HyperFrames Puppeteer; auto-downloaded on first render |
| Claude Code CLI | latest | see the Claude Code installation instructions |
| TTS account | one of two | LucyLab.io OR ElevenLabs |

🚀 Setup (one-time)

# 1. Clone the repo
git clone <repo-url> auto_create_video
cd auto_create_video

# 2. Install dependencies
npm install

# 3. Create env file and fill in API keys
cp .env.example .env.local
# → open .env.local, choose TTS provider (lucylab or elevenlabs) and fill in the key

# 4. Verify installation
node --version       # ≥ 22
ffmpeg -version      # any version OK
ffprobe -version
npm test             # all 35 tests should pass

🔑 API Key Configuration

Open .env.local and pick one of two providers:

Option 1: LucyLab.io (recommended for Vietnamese)

  • Sign up at https://lucylab.io
  • Get API key + voice ID (22-char UUID)
  • Set TTS_PROVIDER=lucylab
  • ✅ Pros: natural Vietnamese voice (cloning), free SRT subtitle file included
  • ⚠️ Cons: only 1 concurrent export per account (the pipeline handles this)
TTS_PROVIDER=lucylab
VIETNAMESE_API_KEY=sk_live_xxxxxxxxxxxxxxxxxxxx
VIETNAMESE_VOICEID=22charvoiceiduuidhere

Option 2: ElevenLabs

TTS_PROVIDER=elevenlabs
ELEVENLABS_API_KEY=sk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
ELEVENLABS_VOICE_ID=EXAVITQu4vr4xnSDxMaL
ELEVENLABS_MODEL_ID=eleven_multilingual_v2
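
As an illustration of how these variables could be consumed, the sketch below reads them into a typed config and fails early with messages shaped like those in the Troubleshooting table. The function names and the fallback to `eleven_multilingual_v2` when `ELEVENLABS_MODEL_ID` is unset are assumptions; the project's actual loader may differ.

```typescript
type TtsConfig =
  | { provider: "lucylab"; apiKey: string; voiceId: string }
  | { provider: "elevenlabs"; apiKey: string; voiceId: string; modelId: string };

type Env = Record<string, string | undefined>;

// Returns the trimmed value, or throws "Missing <NAME>" when it is unset or blank.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing ${name}`);
  }
  return value.trim();
}

function loadTtsConfig(env: Env): TtsConfig {
  const provider = requireVar(env, "TTS_PROVIDER").toLowerCase();
  if (provider === "lucylab") {
    return {
      provider: "lucylab",
      apiKey: requireVar(env, "VIETNAMESE_API_KEY"),
      voiceId: requireVar(env, "VIETNAMESE_VOICEID"),
    };
  }
  if (provider === "elevenlabs") {
    return {
      provider: "elevenlabs",
      apiKey: requireVar(env, "ELEVENLABS_API_KEY"),
      voiceId: requireVar(env, "ELEVENLABS_VOICE_ID"),
      // Assumed default, matching the example value shown above.
      modelId: env["ELEVENLABS_MODEL_ID"]?.trim() || "eleven_multilingual_v2",
    };
  }
  throw new Error(`Unsupported TTS_PROVIDER: ${provider}`);
}

const config = loadTtsConfig({
  TTS_PROVIDER: "lucylab",
  VIETNAMESE_API_KEY: "sk_live_example",
  VIETNAMESE_VOICEID: "example-voice-id",
});
console.log(config.provider); // lucylab
```

In a Node process this would typically be called as `loadTtsConfig(process.env)` after the variables from .env.local have been loaded.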

🎵 TikTok follow card configuration (outro)

Every video automatically ends with a TikTok follow card (slides up from the bottom, then plays a follow-button click animation), built from the official HyperFrames tiktok-follow block. All fields are optional; the defaults work out of the box:

TIKTOK_DISPLAY_NAME=Công nghệ 24h
TIKTOK_HANDLE=@congnghe24h
TIKTOK_FOLLOWERS=1.2M followers

# Optional: URL to your real TikTok avatar (jpg/png, square, ≥256x256)
# If not set → uses bundled default `assets/avatar.jpg`
TIKTOK_AVATAR_URL=https://example.com/your-avatar.jpg

To change the avatar:

  • Option 1 (simple): replace assets/avatar.jpg with your image (square, ~256x256+)
  • Option 2 (URL): set TIKTOK_AVATAR_URL in .env.local → the pipeline downloads it on every render

Card appears at ~1.6s into the outro scene with this animation sequence:

  1. Card slides up from bottom (0.5s)
  2. Hold ~0.9s for viewer to read
  3. "Follow" button presses in, then changes to "Following ✓" with a red → dark-gray color shift
  4. Card stays visible until end of video
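
The sequence above amounts to a few offsets from the start of the outro scene. The sketch below turns the approximate figures (1.6 s, 0.5 s, 0.9 s) into absolute keyframe times in milliseconds. `followCardTimeline` and the keyframe labels are hypothetical; in the project the real timing lives in the GSAP timeline.

```typescript
interface Keyframe { label: string; atMs: number }

// Offsets taken from the list above. The source gives them as approximate values.
const APPEAR_OFFSET_MS = 1600;
const SLIDE_MS = 500;
const HOLD_MS = 900;

// Converts the outro scene's start time into absolute keyframe times.
function followCardTimeline(outroStartMs: number): Keyframe[] {
  const slideStart = outroStartMs + APPEAR_OFFSET_MS;
  const slideEnd = slideStart + SLIDE_MS;
  const press = slideEnd + HOLD_MS;
  return [
    { label: "slide-start", atMs: slideStart },
    { label: "slide-end", atMs: slideEnd },
    { label: "follow-press", atMs: press },
  ];
}

console.log(followCardTimeline(55_000).map((k) => `${k.label}@${k.atMs}`).join(", "));
// slide-start@56600, slide-end@57100, follow-press@58000
```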

🔊 Sound Effects (SFX): auto-mixed by template

Every video automatically gets a sound-effect layer mixed under the voice (low volume, so it doesn't overpower speech). The choice is not random; the pipeline picks by template type:

| Template | Default SFX | When you hear it |
| --- | --- | --- |
| hook | transition/whoosh-soft | Start of video, dramatic entrance |
| comparison | transition/swoosh | When the 2 cards appear |
| stat-hero | emphasis/ding | When the number/% is revealed |
| feature-list | transition/pop | As each bullet appears |
| callout | alert/notification | Important statement |
| outro | outro/tada | Ending signature |

Bundled sounds in assets/sfx/ (downloaded from myinstants.com):

assets/sfx/
├── transition/  (whoosh-soft, swoosh, pop)
├── emphasis/    (ding, tick, chime)
├── alert/       (notification)
└── outro/       (tada)

Add your own SFX:

  1. Download mp3 from myinstants.com or pixabay sound effects
  2. Drop into assets/sfx/<category>/<name>.mp3
  3. Reference in script.json: "sfx": { "name": "transition/your-sound", "volume": 0.4 }

Smart override by content (Claude auto-picks when generating script):

  • "warning", "risk" → alert/notification
  • "exceed", "record", "outstanding" → emphasis/chime
  • Disable for this scene → "sfx": { "name": "none" }
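
The default table and the per-scene override can be combined in one small lookup. The sketch below is an assumption about how such a resolver might look: `resolveSfx` is a hypothetical name, and the 0.4 default volume is borrowed from the script.json example above, not from the project's code.

```typescript
type Template =
  | "hook" | "comparison" | "stat-hero" | "feature-list" | "callout" | "outro";

interface SfxOverride { name: string; volume?: number }

// Defaults copied from the template table above.
const DEFAULT_SFX: Record<Template, string> = {
  "hook": "transition/whoosh-soft",
  "comparison": "transition/swoosh",
  "stat-hero": "emphasis/ding",
  "feature-list": "transition/pop",
  "callout": "alert/notification",
  "outro": "outro/tada",
};

// Assumed default volume (the document only shows 0.4 in an example).
const DEFAULT_VOLUME = 0.4;

// Returns the file to mix in, or null when the scene disables SFX with "none".
function resolveSfx(
  template: Template,
  override?: SfxOverride,
): { file: string; volume: number } | null {
  const name = override?.name ?? DEFAULT_SFX[template];
  if (name === "none") return null;
  return {
    file: `assets/sfx/${name}.mp3`,
    volume: override?.volume ?? DEFAULT_VOLUME,
  };
}

console.log(resolveSfx("stat-hero"));              // { file: 'assets/sfx/emphasis/ding.mp3', volume: 0.4 }
console.log(resolveSfx("hook", { name: "none" })); // null
```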

🎬 Usage

Method 1: Inside Claude Code (recommended)

Open Claude Code in the project directory and type:

/create-news-video https://vnexpress.net/iphone-17-200mp

Or with a .txt file:

/create-news-video news/my-article.txt

After ~3-5 minutes (TTS + render):

✓ Video:  output/<slug>-<timestamp>/video.mp4    ← final video
✓ Audio:  output/<slug>-<timestamp>/voice.mp3    ← for CapCut
✓ Script: output/<slug>-<timestamp>/script.txt   ← for CapCut auto-caption

Method 2: Run pipeline directly (advanced)

If you already have a script.json (e.g. for debugging or hand-written script):

npm run pipeline -- output/<slug>-<timestamp>/script.json

Method 3: Re-render video without re-running TTS (saves quota)

If voice files already exist in voice/ and you only want to re-render visuals:

npm run rerender -- output/<slug>-<timestamp>

📁 Output Structure

output/<slug>-<timestamp>/
├── script.json           # Input JSON (Claude-generated or hand-written)
├── script.txt            # Plain text for CapCut auto-caption
├── images/bg.jpg         # og:image downloaded (if available)
├── voice/
│   ├── scene-hook.mp3    # TTS per scene
│   ├── scene-hook.srt    # SRT subtitles (LucyLab only)
│   └── scene-body-1.mp3
├── voice.mp3             # Concatenated voice (for CapCut)
├── index.html            # HyperFrames composition
├── styles.css            # CSS (copied from template)
├── animations.js         # GSAP timeline (copied)
├── hyperframes.json      # HyperFrames project config
├── meta.json             # HyperFrames metadata
└── video.mp4             # 🎉 Final output: 1080×1920 MP4
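
The document does not say how `<slug>` and `<timestamp>` are formed. As a sketch of one plausible scheme, the code below builds an ASCII slug from a Vietnamese title and appends a UTC timestamp. Both the slug rules and the `YYYYMMDD-HHmmss` format are assumptions, and the function names are hypothetical.

```typescript
// Lower-cases the title, strips Vietnamese diacritics, and joins words with "-".
function slugify(title: string): string {
  return title
    .normalize("NFD")                 // split letters from combining marks
    .replace(/[\u0300-\u036f]/g, "")  // drop the combining marks
    .replace(/đ/g, "d")               // "đ" has no decomposition, map it by hand
    .replace(/Đ/g, "D")
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")      // collapse anything else into one dash
    .replace(/^-+|-+$/g, "");         // trim leading and trailing dashes
}

// Assumed timestamp format: YYYYMMDD-HHmmss in UTC.
function outputDir(title: string, now: Date): string {
  const stamp = now
    .toISOString()          // e.g. 2025-01-02T03:04:05.000Z
    .replace(/[-:]/g, "")   // 20250102T030405.000Z
    .replace("T", "-")      // 20250102-030405.000Z
    .slice(0, 15);          // 20250102-030405
  return `output/${slugify(title)}-${stamp}`;
}

console.log(outputDir("Công nghệ 24h", new Date(Date.UTC(2025, 0, 2, 3, 4, 5))));
// output/cong-nghe-24h-20250102-030405
```

Including a timestamp in the folder name keeps repeated runs on the same article from overwriting each other.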

🎨 Visual System v2

Each video consists of:

  • Persistent shell throughout (header with the brand >_ icon, channel name and tag; footer with the TikTok handle; grain texture; navy gradient)
  • 5–8 scenes with templates picked by Claude based on content:

| Template | When to use | Example |
| --- | --- | --- |
| hook | First scene (3-5s) | "GPT 5.5" + "AI mạnh nhất!" over og:image with shimmer |
| comparison | When content has "X vs Y" / "exceeds" / "compared to" | 2 cards: "GPT 5.4 75.1%" cyan vs "GPT 5.5 82.7%" purple (winner) |
| stat-hero | When there's a key number/% | "1M" giant gradient + "Tokens / context window" |
| feature-list | When listing features | Card with 4 bullets, cyan glow dots |
| callout | Statement / warning / quote | Purple glow card with "Warning: agentic AI needs caution" |
| outro | Last scene (3-5s) | "Follow now" pill + "Công nghệ 24h" giant + gradient underline |

🧪 Testing

npm test                 # run 35 unit tests
npm run test:watch       # watch mode
npx tsc --noEmit         # type-check without build

🐛 Troubleshooting

| Error | Fix |
| --- | --- |
| Missing VIETNAMESE_API_KEY / Missing ELEVENLABS_API_KEY | Check that .env.local exists and that TTS_PROVIDER matches the key you set |
| hyperframes render failed | Run npx hyperframes render --help to verify the CLI; check that Chrome is installed |
| LucyLab polling timeout | Increase LUCYLAB_POLL_TIMEOUT_MS in .env.local (default 120000 ms) |
| ElevenLabs 401 Invalid API key | Verify the key on the ElevenLabs dashboard and re-paste it into .env.local |
| Total duration outside [48s, 72s] | Re-trigger the skill, or edit script.json to lengthen or shorten the text |
| ffprobe: command not found | Install FFmpeg: winget install Gyan.FFmpeg on Windows, brew install ffmpeg on macOS |

🗺️ Roadmap

  • Burned-in captions (forced alignment with Whisper)
  • Auto-select background music by mood
  • Multi-news compilation mode (digest)
  • AI-generated images (Flux/Stable Diffusion when og:image unavailable)
  • Auto-upload TikTok / YouTube Shorts / Reels via API
  • Custom logo overlay
  • Multi-language (English, Chinese)
  • Standalone Web UI (no Claude Code required)

📜 License

MIT: use freely, fork freely, PRs welcome.


🤝 Contributing

Pull requests welcome! For major changes, please open an issue first to discuss what you'd like to change.

# Fork → clone → branch
git checkout -b feature/my-improvement

# Make changes, ensure tests pass
npm test

# Commit (Conventional Commits style)
git commit -m "feat: add Google TTS provider support"

# Push and open PR
git push origin feature/my-improvement

🙏 Acknowledgements
