ByteDance's next-generation AI video model with the revolutionary @-reference system. Combine text, images, video clips, and audio in a single prompt. Native audio-video synchronization, V2V editing, and up to 2K resolution at 30fps — all in one unified generation.
Seedance 2.0 is ByteDance's most advanced AI video generation model, unveiled in February 2026. It adopts a unified multimodal audio-video joint-generation architecture that accepts four input modalities simultaneously: text, up to 9 images, up to 3 video clips, and up to 3 audio tracks. The groundbreaking @-reference system lets you tag specific elements in your prompt and bind them to uploaded references for granular control over camera movement, character appearance, audio rhythm, and visual style. Outputs reach up to 2K resolution with native synchronized audio, including multilingual lip-sync, sound effects, and background music.
Revolutionary reference tagging using @Image, @Video, and @Audio labels in your prompt. Bind specific elements to uploaded files for precise control over camera movement, character actions, audio rhythm, and visual style.
Combine text, up to 9 images, up to 3 video clips, and up to 3 audio tracks in a single generation request. Seedance 2.0 is the first model to process all four input types simultaneously.
Joint audio-video synthesis produces lip-sync dialogue, sound effects, and background music synchronized with the visual output. Supports multilingual lip-sync with phoneme-level precision.
Edit existing videos through reference-to-video mode. Transfer motion patterns, camera paths, and pacing from uploaded clips. Change outfits, modify actions, or replace elements while preserving the original structure.
Native 2K (2048x1080) output at 30fps, with additional quality tiers at 480p, 720p, and 1080p. Video duration ranges from 4 to 15 seconds per generation.
Upload multiple reference images of the same character from different angles. Seedance 2.0 maintains consistent faces, clothing, body proportions, and accessories across multiple generated clips.
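The per-modality limits above (text plus up to 9 images, 3 video clips, and 3 audio tracks) can be checked client-side before submitting a job. Seedance 2.0's actual API is not documented on this page, so the payload shape and field names below are purely illustrative assumptions; only the modality limits come from the description above.

```python
# Hypothetical sketch of assembling a Seedance 2.0 multimodal request.
# The payload structure is an assumption; only the limits (<= 9 images,
# <= 3 video clips, <= 3 audio tracks) come from the feature description.

MAX_IMAGES, MAX_VIDEOS, MAX_AUDIO = 9, 3, 3

def build_request(prompt, images=(), videos=(), audio=()):
    """Validate modality limits and return an illustrative payload dict."""
    if len(images) > MAX_IMAGES:
        raise ValueError(f"at most {MAX_IMAGES} reference images allowed")
    if len(videos) > MAX_VIDEOS:
        raise ValueError(f"at most {MAX_VIDEOS} reference video clips allowed")
    if len(audio) > MAX_AUDIO:
        raise ValueError(f"at most {MAX_AUDIO} reference audio tracks allowed")
    return {
        "prompt": prompt,
        # References are indexed from 1 to match @Image1, @Video1, @Audio1 tags.
        "images": {f"Image{i}": path for i, path in enumerate(images, 1)},
        "videos": {f"Video{i}": path for i, path in enumerate(videos, 1)},
        "audio": {f"Audio{i}": path for i, path in enumerate(audio, 1)},
    }

req = build_request(
    "@Image1 walks through @Image2 with camera movement from @Video1",
    images=["hero.png", "alley.png"],
    videos=["dolly_shot.mp4"],
)
```

Indexing the uploads from 1 keeps the keys aligned with the @-tags used in the prompt, so each tag resolves to exactly one uploaded file.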
Explore Seedance 2.0's capabilities in multimodal reference control, native audio generation, and video editing

“@Image1 walks through @Image2 with camera movement from @Video1 and background music from @Audio1”
Multi-reference prompt combining all modalities

“@Image1 character dances with rhythm from @Audio1 in @Image3 environment”
Character motion guided by audio beat reference

“A person giving a presentation with synchronized English speech and slide transitions”
Lip-sync dialogue with visual content

“Cooking tutorial with step-by-step narration and ambient kitchen sounds”
Narration synchronized with cooking actions
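One way to see how prompts like the ones above bind to uploads is to extract the @-tags before submission, so each tag can be matched against a reference file. The tag pattern (@Image, @Video, or @Audio followed by an index) is inferred from the example prompts; the helper itself is an illustrative sketch, not an official Seedance 2.0 parser.

```python
import re

# Illustrative sketch: pull @-reference tags (e.g. @Image1, @Video1,
# @Audio1) out of a prompt. The tag grammar is inferred from the
# example prompts shown above.

TAG_RE = re.compile(r"@(Image|Video|Audio)(\d+)")

def extract_references(prompt):
    """Return @-reference tags in order of first appearance, deduplicated."""
    seen, tags = set(), []
    for kind, idx in TAG_RE.findall(prompt):
        tag = f"{kind}{idx}"
        if tag not in seen:
            seen.add(tag)
            tags.append(tag)
    return tags

refs = extract_references(
    "@Image1 walks through @Image2 with camera movement from @Video1 "
    "and background music from @Audio1"
)
```

Running the check client-side makes it easy to warn the user when a prompt mentions a tag with no corresponding upload.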
Alex Kim: “The @-reference system is genuinely revolutionary. I can extract camera movements from a reference clip and apply them instantly — it's a completely new creative workflow.”
Priya Sharma: “Native audio sync saves hours of post-production. The lip-sync quality is surprisingly precise even with non-English dialogue.”
Lucas Müller: “V2V editing lets me enhance existing footage without reshooting. Seedance 2.0 is now a core tool in our production pipeline.”
Yuki Tanaka: “The 4-modality input is a game-changer. I can bring a character design, a camera movement reference, and background music all into one prompt and get exactly what I envisioned.”
Experience Seedance 2.0 — the most advanced video generator from ByteDance, free online
10,000+ users