Getting Started

What is Video Avatar?

Video Avatar takes a video featuring an avatar and a script (text or audio) and generates a new avatar video whose lip movements are synced to the script.

Our avatar uses zero-shot technology, so no lengthy cloning or training is required. Just provide a video for instant lip-syncing.

How It Works

Use an existing avatar

An "existing avatar" is either an avatar template from the public library or an avatar you have previously created and saved. To use an existing avatar, follow the steps below:

1. Use List All Avatars to obtain the ID of the avatar you want.
2. Use Submit Video Avatar Task to create a video avatar synthesis task.
3. Use Query Video Avatar Task to check the task status and obtain the synthesis result.
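The three steps above can be sketched as a submit-then-poll loop. This is only an illustration under stated assumptions: the three callables stand in for List All Avatars, Submit Video Avatar Task, and Query Video Avatar Task, and their signatures, the "status" key, and the status values are hypothetical placeholders, not the documented request or response format. Consult each API's reference page for the real fields.

```python
import time
from typing import Any, Callable, Dict, Iterable, List


def run_existing_avatar_flow(
    list_avatar_ids: Callable[[], List[str]],      # stand-in for List All Avatars
    submit_task: Callable[[str], str],             # stand-in for Submit Video Avatar Task
    query_task: Callable[[str], Dict[str, Any]],   # stand-in for Query Video Avatar Task
    done_statuses: Iterable[str] = ("success",),   # hypothetical status values
    failed_statuses: Iterable[str] = ("fail",),    # hypothetical status values
    poll_interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Pick an avatar, submit a task, and poll until it finishes."""
    done = set(done_statuses)
    failed = set(failed_statuses)

    # Step 1: obtain an avatar ID (the first one is chosen for illustration).
    avatar_ids = list_avatar_ids()
    if not avatar_ids:
        raise LookupError("no avatars available")
    avatar_id = avatar_ids[0]

    # Step 2: create the synthesis task.
    task_id = submit_task(avatar_id)

    # Step 3: poll the task until it succeeds, fails, or times out.
    deadline = time.monotonic() + timeout
    while True:
        result = query_task(task_id)
        status = result.get("status")  # "status" is an assumed key name
        if status in done:
            return result
        if status in failed:
            raise RuntimeError(f"task {task_id} failed: {result}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish in {timeout}s")
        sleep(poll_interval)
```

Passing the API calls in as functions keeps the sketch independent of any particular HTTP client or authentication scheme; in practice each callable would wrap the corresponding request.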

Create and use your own avatar

To use your own video as the avatar template, pass the video's videoFileId parameter directly to Submit Video Avatar Task. No separate avatar creation step is needed.

Related APIs

Submit Video Avatar Task – Create a video avatar synthesis task
Query Video Avatar Task – Query task status and get the preview video
List All Avatars – Get the list of available avatars
List All Voices – Get the list of available voiceovers
List All Ethnicities – Get the list of ethnicities

Billing Rules

The price is the same whether you use a public avatar template or upload your own video.

Audio-Driven Mode Fees

Charged at 0.02 credits per second based on the length of the generated video.
Billing is calculated in whole seconds; any partial second is rounded up to the next full second.

Text-Driven Mode Fees

In addition to the audio-driven mode fees, a TTS (Text-to-Speech) fee applies: 0.1 credits for every 100 characters of the script, with the character count rounded up to the next multiple of 100. For example, a 250-character script is billed as 300 characters, or 0.3 credits of TTS fees.
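The billing rules above can be sketched as a small estimator. The rates and rounding come from this section; the function names are illustrative, and how the service counts characters (for example whitespace or punctuation) is not specified here, so the character count is taken as an input rather than computed.

```python
import math
from decimal import Decimal

AUDIO_RATE_PER_SECOND = Decimal("0.02")   # credits per second of generated video
TTS_RATE_PER_100_CHARS = Decimal("0.1")   # credits per started block of 100 characters


def audio_driven_credits(video_seconds: float) -> Decimal:
    """Audio-driven fee: partial seconds are rounded up to a whole second."""
    billed_seconds = math.ceil(video_seconds)
    return AUDIO_RATE_PER_SECOND * billed_seconds


def text_driven_credits(video_seconds: float, script_chars: int) -> Decimal:
    """Text-driven fee: the audio-driven fee plus the TTS fee."""
    # Integer ceiling division: every started block of 100 characters is billed.
    billed_blocks = -(-script_chars // 100)
    return audio_driven_credits(video_seconds) + TTS_RATE_PER_100_CHARS * billed_blocks
```

Under these rules, a 12.3-second video is billed as 13 seconds (0.26 credits) in audio-driven mode, and the same video generated from a 250-character script costs 0.26 + 0.3 = 0.56 credits in text-driven mode. Decimal is used instead of float so that the credit amounts stay exact.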