Feature

Auto captions with animated styles for gaming clips

Word-timed animated captions baked into your video. Deepgram transcription on Demucs-separated vocals, karaoke-style highlighting, and preset styles that match gaming content.

Captions built for short-form video

Not just subtitles. Animated, styled, and timed to keep viewers watching.

Word-timed accuracy

Captions are transcribed with Deepgram nova-3 and synced to individual words. Each word carries its own start and end timestamp, so it appears on screen as it’s spoken.
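To make the timing concrete, here is a minimal sketch of how word timestamps could map onto video frames. The `TimedWord` shape mirrors the per-word `word`, `start`, and `end` fields (in seconds) of a Deepgram transcript; the `toFrameRange` helper and the 30 fps value are assumptions for illustration, not Clippper's actual code.

```typescript
// Per-word timing in seconds, mirroring the word/start/end fields of a Deepgram word object.
interface TimedWord {
  word: string;
  start: number;
  end: number;
}

// Hypothetical helper: map a word's timing to a frame range [from, to).
function toFrameRange(w: TimedWord, fps: number): { from: number; to: number } {
  const from = Math.floor(w.start * fps);
  // Guarantee at least one visible frame, even for very short words.
  const to = Math.max(from + 1, Math.ceil(w.end * fps));
  return { from, to };
}

const sampleWord: TimedWord = { word: "clutch", start: 1.24, end: 1.61 };
const sampleRange = toFrameRange(sampleWord, 30);
console.log(sampleRange); // { from: 37, to: 49 }
```

At 30 fps a frame lasts about 33 ms, so on-screen timing is quantized to the nearest frame rather than to the millisecond.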

Animated entrance

Each phrase enters the screen with an animation: pop (scale up), bounce (spring), fade (opacity), or scale (grow in). Configurable per preset.

Active word highlighting

Karaoke-style highlighting colors the current word as it’s spoken. A scale bump on the active word draws the eye, so viewers follow along effortlessly.
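As a rough sketch of the idea, the active word can be found by comparing the current playback time against each word's timestamps. The function names, the highlight color, and the 1.1x scale bump below are hypothetical values chosen for illustration.

```typescript
interface SpokenWord {
  text: string;
  start: number; // seconds
  end: number;   // seconds
}

// Index of the word being spoken at time t (seconds), or -1 during silence.
function activeWordIndex(words: SpokenWord[], t: number): number {
  return words.findIndex((w) => t >= w.start && t < w.end);
}

// Hypothetical styling: the active word gets a highlight color and a slight scale bump.
function wordStyle(isActive: boolean): { color: string; scale: number } {
  return isActive
    ? { color: "#FFD400", scale: 1.1 }
    : { color: "#FFFFFF", scale: 1 };
}

const spoken: SpokenWord[] = [
  { text: "nice", start: 0.0, end: 0.3 },
  { text: "shot", start: 0.3, end: 0.7 },
  { text: "dude", start: 0.9, end: 1.2 },
];
const activeAtHalfSecond = activeWordIndex(spoken, 0.5);
console.log(spoken.map((w, i) => ({ text: w.text, ...wordStyle(i === activeAtHalfSecond) })));
```

During a pause between words the lookup returns -1, so no word is highlighted.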

Caption presets

Choose from multiple presets with different fonts, colors, stroke widths, highlight colors, and animation types. Pick a style that matches your brand.

Stroke and shadow

Text stroke ensures readability over any background. Shadow and outline settings are tuned per preset so captions look clean on bright gameplay or dark scenes.
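For illustration only, a preset could be modeled as a small settings object that is turned into text styles at render time. The `CaptionPreset` fields, the example values, and `presetToTextStyle` are all assumptions; the actual preset format is not documented here.

```typescript
// Hypothetical preset shape covering the settings described above.
interface CaptionPreset {
  fontFamily: string;
  textColor: string;
  highlightColor: string;
  strokeColor: string;
  strokeWidthPx: number;
  shadow: string; // a CSS text-shadow value
  animation: "pop" | "bounce" | "fade" | "scale";
}

// Example values are made up for this sketch.
const examplePreset: CaptionPreset = {
  fontFamily: "Montserrat",
  textColor: "#FFFFFF",
  highlightColor: "#FFD400",
  strokeColor: "#000000",
  strokeWidthPx: 6,
  shadow: "0 4px 8px rgba(0, 0, 0, 0.6)",
  animation: "pop",
};

// Convert a preset into CSS-in-JS text properties.
function presetToTextStyle(p: CaptionPreset): Record<string, string> {
  return {
    fontFamily: p.fontFamily,
    color: p.textColor,
    WebkitTextStroke: `${p.strokeWidthPx}px ${p.strokeColor}`,
    // Paint the stroke before the fill so a thick outline does not thin the letters.
    paintOrder: "stroke fill",
    textShadow: p.shadow,
  };
}

const exampleStyle = presetToTextStyle(examplePreset);
console.log(exampleStyle);
```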

16+ languages

Deepgram supports English, Spanish, French, German, Portuguese, Japanese, Korean, Chinese, and more. Captions work in any supported language you stream in.

Animation styles

Each phrase enters the screen with your chosen animation. Pick the vibe that fits your content.

Pop

Words scale up from zero with a spring ease. Punchy and attention-grabbing.

Bounce

Spring-based entrance with overshoot. Playful and energetic.

Fade

Smooth opacity transition. Clean and professional.

Scale

Gradual size increase. Subtle and modern.
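The four styles can be sketched as functions of entrance progress from 0 to 1. This is an approximation under stated assumptions: the easing curves, the overshoot constant, and the 0.5 starting size for Scale are illustrative guesses, and an ease-out-back curve stands in for a physical spring. Remotion ships its own `spring()` and `interpolate()` helpers; plain math is used here so the sketch runs without dependencies.

```typescript
type EntranceKind = "pop" | "bounce" | "fade" | "scale";

interface EntranceState {
  scale: number;
  opacity: number;
}

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// Ease-out-back curve: 0 at p=0, 1 at p=1, and above 1 near the end when overshoot > 0.
function easeOutBack(p: number, overshoot: number): number {
  const q = p - 1;
  return 1 + (overshoot + 1) * q * q * q + overshoot * q * q;
}

// Scale and opacity for a phrase at a given entrance progress (0 = start, 1 = settled).
function entranceState(kind: EntranceKind, progress: number): EntranceState {
  const p = clamp01(progress);
  switch (kind) {
    case "pop": // scale up from zero, fast then settling (no overshoot in this sketch)
      return { scale: easeOutBack(p, 0), opacity: 1 };
    case "bounce": // same shape, but overshooting past full size before settling
      return { scale: easeOutBack(p, 1.70158), opacity: 1 };
    case "fade": // opacity only
      return { scale: 1, opacity: p };
    case "scale": // gradual linear growth from half size (assumed starting size)
      return { scale: 0.5 + 0.5 * p, opacity: 1 };
  }
}

for (const kind of ["pop", "bounce", "fade", "scale"] as const) {
  console.log(kind, entranceState(kind, 0.8));
}
```

In a frame-based renderer, `progress` would typically be derived from the current frame divided by the entrance duration in frames.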

From speech to animated text

  1. Vocal separation

    Demucs isolates your voice from game audio, music, and effects. Clean audio means accurate transcription.

  2. Speech recognition

    Deepgram nova-3 transcribes the isolated voice track with word-level timestamps in 16+ languages.

  3. Phrase grouping

    Words are grouped into phrases of up to three words. A gap longer than 0.8 seconds between words starts a new phrase, so breaks fall at natural pauses.

  4. Animated rendering

    Remotion renders each phrase with entrance animation and active word highlighting, using your chosen preset style.
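The phrase grouping step can be sketched as a single pass over the timed words. This is a minimal illustration, assuming the three-word size is an upper limit and that the 0.8 second gap is measured from the end of one word to the start of the next; `groupIntoPhrases` and `CaptionWord` are hypothetical names.

```typescript
interface CaptionWord {
  text: string;
  start: number; // seconds
  end: number;   // seconds
}

// Group words into phrases of at most maxWords, breaking early on long pauses.
function groupIntoPhrases(
  words: CaptionWord[],
  maxWords: number = 3,
  maxGapSec: number = 0.8,
): CaptionWord[][] {
  const phrases: CaptionWord[][] = [];
  let current: CaptionWord[] = [];
  for (const w of words) {
    const prev = current[current.length - 1];
    const gapTooLong = prev !== undefined && w.start - prev.end > maxGapSec;
    // Close the current phrase when it is full or the speaker paused.
    if (current.length >= maxWords || gapTooLong) {
      phrases.push(current);
      current = [];
    }
    current.push(w);
  }
  if (current.length > 0) phrases.push(current);
  return phrases;
}

const transcript: CaptionWord[] = [
  { text: "that", start: 0.0, end: 0.2 },
  { text: "was", start: 0.2, end: 0.4 },
  { text: "so", start: 0.4, end: 0.6 },
  { text: "close", start: 0.6, end: 0.9 },
  { text: "wait", start: 2.0, end: 2.3 },
  { text: "what", start: 2.3, end: 2.6 },
];
const phraseTexts = groupIntoPhrases(transcript).map((p) => p.map((w) => w.text).join(" "));
console.log(phraseTexts); // [ 'that was so', 'close', 'wait what' ]
```

The 1.1 second pause after "close" forces a break even though that phrase holds only one word.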

Register and get 10 credits per week

No credit card required. Start clipping in minutes.

Get started free

Auto captions FAQ

How Clippper generates animated captions for your clips.

How are captions generated?
Clippper uses Deepgram nova-3 for speech-to-text with word-level timestamps. Audio goes through Demucs vocal separation first to isolate speech from game audio. The result is captions synced to individual words.
Are captions baked into the video?
Yes. Captions are rendered directly into the MP4 by Remotion. They’re part of the video file, so they appear on every platform without relying on platform-specific caption support.
Can I choose different caption styles?
Yes. Clippper offers multiple caption presets with different fonts, colors, stroke styles, and animation types. Each preset is designed to be readable over gaming content.
What languages are supported?
Deepgram nova-3 supports 16+ languages including English, Spanish, French, German, Portuguese, Japanese, Korean, Chinese, Dutch, Italian, and more.
How does active word highlighting work?
As each word is spoken, it gets highlighted with a different color and a slight scale bump. This karaoke-style effect helps viewers follow along, especially when watching without sound.
What about accuracy with game audio in the background?
Clippper runs Demucs vocal separation before transcription. This strips out game sounds, music, and effects, leaving clean voice audio for Deepgram to transcribe. The result is significantly more accurate than transcribing raw stream audio.
Can I disable captions?
Not in the exported clip. The first rendering pass (layout preview) doesn’t include captions, and the export pass bakes them in. If you need a version without captions, use the preview version.