GIF caption generator - v0 by Vercel

Objective: Build a basic prototype where a user enters a GIF-theme prompt and either provides a YouTube link or uploads an MP4 video, and the system automatically generates captioned GIFs based on the video content and the prompt.

Flow

User Input (Frontend)

Prompt: Text input (e.g., “funny moments,” “sad quotes,” “motivational clips”)

Video:

YouTube URL

MP4 file upload

Processing (Backend Logic)

Transcription

Service: Groq Whisper API

API Key: read from the GROQ_API_KEY environment variable; do not paste the key into the prompt or the source code.

Call:

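```js
// Groq serves Whisper through its OpenAI-compatible transcription endpoint.
// The body is multipart/form-data; fetch sets the Content-Type boundary itself.
const form = new FormData();
form.append('url', uploadedOrYouTubeAudioUrl); // must point directly at an audio/video file
form.append('model', 'whisper-large-v3-turbo');
form.append('response_format', 'verbose_json'); // returns segments with timestamps

const response = await fetch('https://api.groq.com/openai/v1/audio/transcriptions', {
  method: 'POST',
  headers: { Authorization: `Bearer ${process.env.GROQ_API_KEY}` },
  body: form
});
if (!response.ok) throw new Error(`Groq transcription failed: ${response.status}`);
const transcriptData = await response.json();
```

Output: Full transcript with timestamps.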

Key-Line Extraction

Combine the transcript text + user “GIF theme” prompt.

Run a simple NLP/scoring heuristic (e.g., TF-IDF or embeddings similarity) to pick the top 2–3 lines that best match the theme.
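The scoring step above could be sketched as follows. This is a minimal TF-IDF-weighted keyword overlap, not a definitive ranking method; the function names and the assumed segment shape ({ start, end, text }, one entry per transcript line) are illustrative. Literal keyword overlap will miss lines that fit a theme such as “funny moments” without using those words, which is where embeddings similarity would likely do better.

```javascript
// Lowercase the text and split it into word tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9']+/g) || [];
}

// Score each segment against the theme prompt and return the top `k`
// matching segments in chronological order.
// Assumption: `segments` looks like [{ start, end, text }, ...].
function pickKeyLines(segments, themePrompt, k = 3) {
  const docs = segments.map((s) => tokenize(s.text));
  const themeTerms = new Set(tokenize(themePrompt));

  // Document frequency: in how many segments does each term appear?
  const df = new Map();
  for (const tokens of docs) {
    for (const term of new Set(tokens)) {
      df.set(term, (df.get(term) || 0) + 1);
    }
  }

  const scored = segments.map((segment, i) => {
    const tokens = docs[i];
    let score = 0;
    for (const term of themeTerms) {
      const tf = tokens.filter((t) => t === term).length / (tokens.length || 1);
      const idf = Math.log((1 + segments.length) / (1 + (df.get(term) || 0))) + 1;
      score += tf * idf;
    }
    return { ...segment, score };
  });

  return scored
    .filter((s) => s.score > 0)          // drop lines with no overlap at all
    .sort((a, b) => b.score - a.score)   // best matches first
    .slice(0, k)
    .sort((a, b) => a.start - b.start);  // back to playback order
}
```

Each returned segment keeps its start and end fields, so the result can be passed straight to the clip-cutting step.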

Clip Segmentation

For each selected line, pull its start/end timestamps from the transcript.

Use FFmpeg to cut those segments out of the uploaded MP4 (or the downloaded YouTube stream).
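One way to drive the cut from Node is sketched below, assuming an ffmpeg binary is available on the PATH. The helper names buildCutArgs and cutClip are hypothetical. The sketch re-encodes with libx264 so the cut lands on the requested timestamps rather than on the nearest keyframe.

```javascript
const { execFile } = require('node:child_process');

// Build FFmpeg arguments that cut [start, end] (in seconds) out of `inputPath`.
function buildCutArgs(inputPath, start, end, outputPath) {
  if (!(end > start)) throw new RangeError('end must be greater than start');
  const duration = (end - start).toFixed(3);
  return [
    '-y',                      // overwrite the output file if it exists
    '-ss', start.toFixed(3),   // seek to the start of the line (input option)
    '-i', inputPath,
    '-t', duration,            // keep only the duration of the line
    '-an',                     // GIFs carry no audio, so drop it early
    '-c:v', 'libx264',         // re-encode so the cut is frame-accurate
    outputPath,
  ];
}

// Run FFmpeg and resolve with the output path once it exits successfully.
function cutClip(inputPath, start, end, outputPath) {
  return new Promise((resolve, reject) => {
    execFile('ffmpeg', buildCutArgs(inputPath, start, end, outputPath), (err) =>
      err ? reject(err) : resolve(outputPath));
  });
}
```

Passing the arguments as an array to execFile avoids shell quoting problems with file names that contain spaces.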

Caption Overlay

Render the extracted text onto each clip (e.g., using FFmpeg’s drawtext filter).

Style captions for readability (semi-transparent background, centered, etc.).
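A possible drawtext setup for the styling described above is sketched here. It assumes an FFmpeg build with drawtext enabled and a default font that FFmpeg can locate (otherwise a fontfile option would be needed), and it assumes the caption has already been written to a text file whose path contains no quotes or colons. The helper names are hypothetical.

```javascript
// Build a drawtext filter that reads the caption from a text file.
// Reading from `textfile` avoids escaping quotes and colons in the caption,
// and `expansion=none` stops '%' sequences from being interpreted.
function buildCaptionFilter(captionFilePath, { fontSize = 28 } = {}) {
  return [
    `drawtext=textfile='${captionFilePath}'`,
    'expansion=none',
    'fontcolor=white',
    `fontsize=${fontSize}`,
    'box=1',                 // draw a background box behind the text
    'boxcolor=black@0.5',    // semi-transparent black for readability
    'boxborderw=10',         // padding around the text, in pixels
    'x=(w-text_w)/2',        // horizontally centered
    'y=h-text_h-24',         // anchored near the bottom edge
  ].join(':');
}

// Full argument list for: clip in, captioned clip out.
function buildCaptionArgs(clipPath, captionFilePath, outputPath) {
  return ['-y', '-i', clipPath, '-vf', buildCaptionFilter(captionFilePath), '-an', outputPath];
}
```

Long lines are not wrapped automatically by drawtext, so captions may need manual line breaks in the text file before rendering.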

GIF Conversion

Convert each captioned clip into a looping GIF (via FFmpeg).

Optimize size (e.g., palette generation & dithering).
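The palette-based conversion can be done in a single FFmpeg pass, as sketched below. The frame rate and width defaults are assumptions chosen to keep files small, and buildGifArgs is a hypothetical helper name.

```javascript
// Build FFmpeg arguments that turn a clip into a looping, palette-optimized GIF.
function buildGifArgs(clipPath, gifPath, { fps = 12, width = 480 } = {}) {
  const filter =
    // Reduce frame rate and size, then split the stream in two.
    `fps=${fps},scale=${width}:-1:flags=lanczos,split[a][b];` +
    // One copy is used to compute a 256-color palette for this clip.
    '[a]palettegen=stats_mode=diff[p];' +
    // The other copy is mapped onto that palette with ordered dithering.
    '[b][p]paletteuse=dither=bayer:bayer_scale=5';
  return ['-y', '-i', clipPath, '-vf', filter, '-loop', '0', gifPath]; // -loop 0 = loop forever
}
```

Lowering fps or width is usually the most effective way to shrink the output further; the dithering mode mainly trades banding against visible noise.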

Output (Frontend)

Display the resulting GIF thumbnails in a grid.

Under each GIF, provide:

Download button (GIF file)

Preview (auto-play on hover)

Notes & Next Steps

Environment Variables:

Store GROQ_API_KEY securely (e.g., in .env)

YouTube Download:

Use ytdl-core (Node) or youtube-dl (Python) to fetch MP4 streams.

Error Handling:

Gracefully fall back to YouTube’s auto-captions if the Groq API fails.
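The fallback could be structured as a small wrapper like the one below. It is a sketch only: transcribeWithFallback is a hypothetical name, and both transcribers are passed in as async functions assumed to return [{ start, end, text }, ...]. Note that auto-captions exist only for YouTube inputs, so an uploaded MP4 has no equivalent fallback.

```javascript
// Try the primary transcriber first; on any failure, log it and use the fallback.
async function transcribeWithFallback(primary, fallback, videoRef) {
  try {
    const segments = await primary(videoRef);
    // Treat an empty result as a failure so the fallback still gets a chance.
    if (!Array.isArray(segments) || segments.length === 0) {
      throw new Error('primary transcriber returned no segments');
    }
    return { source: 'primary', segments };
  } catch (err) {
    console.warn(`Primary transcription failed (${err.message}); using fallback.`);
    return { source: 'fallback', segments: await fallback(videoRef) };
  }
}
```

Returning the source alongside the segments lets the frontend show which transcript was used, which can help explain lower-quality captions.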

Gemini API key: supply it through an environment variable as well, not inline. Now create the app.