Seedance 1.5

ByteDance · AI video

Seedance 1.5 Pro is ByteDance's advanced AI video generation model, officially released in late 2025. It uses a dual-branch diffusion transformer (DB-DiT) to generate synchronized audio and video in a single pass, with strong lip-sync and professional camera control. The model targets creators, marketers, and production teams who need native audio-visual output without separate dubbing or Foley.

It produces short-form clips with integrated dialogue and sound, cinematic camera moves, and flexible input modes (text-to-video, image-to-video, first-frame, first-last-frame). It is a strong fit for narrative and dialogue-heavy content, ads, and social shorts where audio-visual alignment matters.

ByteDance has emphasized speed improvements over earlier versions: inference is reported to be significantly faster, with 720p to 1080p output delivered in seconds. That makes Seedance 1.5 Pro viable for both one-off hero clips and higher-volume production.

Key features and benefits

Native audio-visual generation

Seedance 1.5 Pro generates video and audio together in one pass, with millisecond-precision alignment and accurate lip-sync across multiple languages and dialects. The dual-branch architecture is designed so that audio and video are coherent from the start: mouth movements match speech, and sound effects align with action.

Speed and resolution

The model offers much faster inference than earlier versions, producing 720p to 1080p video in seconds. It supports five aspect ratios (16:9, 9:16, 1:1, 4:3, 21:9) and durations from about 2 to 12 seconds. These options cover common short-form formats, such as 9:16 for vertical social video and 16:9 for landscape, and the short generation time makes quick iteration practical.
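To make the resolution and aspect-ratio options concrete, here is a minimal Python sketch that estimates frame dimensions in pixels. It rests on two assumptions that this page does not confirm: that "720p" and "1080p" name the shorter side of the frame, and that the longer side is rounded to an even number. The function name `frame_size` is hypothetical, and the sizes the model actually returns may differ.

```python
from fractions import Fraction

# Aspect ratios listed on this page.
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "21:9"}


def frame_size(aspect_ratio: str, short_side: int) -> tuple[int, int]:
    """Return an estimated (width, height) in pixels.

    Assumption: short_side is the shorter edge (720 for "720p").
    The longer edge is rounded to the nearest even number, since many
    video encoders expect even dimensions.
    """
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio}")
    w, h = (int(part) for part in aspect_ratio.split(":"))
    ratio = Fraction(max(w, h), min(w, h))
    long_side = 2 * round(ratio * short_side / 2)
    # Landscape and square keep the long edge as width; portrait swaps.
    return (long_side, short_side) if w >= h else (short_side, long_side)


if __name__ == "__main__":
    for ar in sorted(ASPECT_RATIOS):
        print(ar, frame_size(ar, 720), frame_size(ar, 1080))
```

Under these assumptions, a 16:9 clip at 720p works out to 1280 by 720 pixels and a 9:16 clip at 1080p to 1080 by 1920. Treat the numbers as planning estimates and check the provider's documentation for exact output sizes.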

Cinematic camera control

Professional camera moves (pan, tilt, zoom, dolly zoom, tracking) can be executed within a single generation. The model is described as having strong semantic understanding of narrative and color grading, so you can request specific camera language in the prompt and expect consistent, cinematic output.

Multiple input modes

Seedance 1.5 Pro supports text-to-video, image-to-video, first-frame, and first-last-frame input modes, giving you flexibility for different workflows and story structures. First-last-frame is useful when you know the start and end of a shot and want the model to fill the motion in between.
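As a rough illustration of how a workflow might choose among these modes, the Python sketch below picks a mode from which frame images are available. The function name `pick_input_mode` and its parameters are hypothetical and do not come from any Seedance API. Image-to-video is also driven by a single image; this page does not say how it differs from first-frame, so the sketch does not try to distinguish the two.

```python
from typing import Optional


def pick_input_mode(first_frame: Optional[str] = None,
                    last_frame: Optional[str] = None) -> str:
    """Choose an input mode name from the frame images supplied.

    Arguments are file paths or URLs (hypothetical; any non-empty
    string counts as "supplied"). Mode names follow this page.
    """
    if first_frame and last_frame:
        # Start and end of the shot are known; the model fills the motion.
        return "first-last-frame"
    if first_frame:
        # A single image; image-to-video would also apply here.
        return "first-frame"
    if last_frame:
        raise ValueError("a last frame needs a first frame to pair with")
    # No images at all: the prompt alone drives the clip.
    return "text-to-video"


if __name__ == "__main__":
    print(pick_input_mode())
    print(pick_input_mode("start.png"))
    print(pick_input_mode("start.png", "end.png"))
```

Rejecting a lone last frame is a design choice of this sketch, not a documented rule; a real integration should follow whatever the provider's API accepts.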

Technical specifications

Resolution: 720p to 1080p

Aspect ratios: 16:9, 9:16, 1:1, 4:3, 21:9

Duration: 2–12 seconds

Audio: Native sync, multi-language lip-sync

Input modes: Text-to-video, image-to-video, first-frame, first-last-frame
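The specifications above can be turned into a simple pre-flight check before a generation request is sent. The following Python sketch is only an illustration: the `SPECS` dictionary and `check_settings` function are hypothetical names, and it assumes that "720p to 1080p" means exactly those two resolution labels, which this page does not confirm.

```python
# Limits copied from the specifications on this page.
# Assumption: only the two endpoint labels, "720p" and "1080p", are valid.
SPECS = {
    "resolutions": {"720p", "1080p"},
    "aspect_ratios": {"16:9", "9:16", "1:1", "4:3", "21:9"},
    "duration_s": (2, 12),
}


def check_settings(resolution: str, aspect_ratio: str,
                   duration_s: float) -> list[str]:
    """Return a list of problems; an empty list means the settings fit."""
    problems = []
    if resolution not in SPECS["resolutions"]:
        problems.append(f"resolution {resolution!r} is not 720p or 1080p")
    if aspect_ratio not in SPECS["aspect_ratios"]:
        problems.append(f"aspect ratio {aspect_ratio!r} is not listed")
    low, high = SPECS["duration_s"]
    if not low <= duration_s <= high:
        problems.append(f"duration {duration_s}s is outside {low}-{high}s")
    return problems


if __name__ == "__main__":
    print(check_settings("1080p", "9:16", 8))
    print(check_settings("4k", "3:2", 30))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once. Authoritative limits may vary by provider and region.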

Use cases and applications

Seedance 1.5 Pro targets creators, marketers, production teams, and agencies who need professional short-form video with dialogue and cinematic motion: film and TV pre-vis, ads, social content, and narrative pieces. Use it when native lip-sync and integrated audio are requirements, as in explainers, testimonials, character-driven shorts, and branded stories.

The camera control and multi-mode support make it a good fit for serial content where you want consistent style and professional movement.

Why this model

Seedance 1.5 Pro stands out for integrated audio-visual generation and camera control. Consider it when dialogue and lip-sync are central to your content. For purely visual or abstract clips, other models may be sufficient; for talking heads and character speech, Seedance 1.5 Pro's native audio is a differentiator.


What you should know

Does Seedance 1.5 Pro generate audio?

Yes. It generates synchronized audio and video in one pass with accurate lip-sync.

What aspect ratios are supported?

Seedance 1.5 Pro supports 16:9, 9:16, 1:1, 4:3, and 21:9.

Where is Seedance documented?

See ByteDance and partner API documentation for authoritative limits, regions, and pricing.
