How to Make Synthesia Videos: A Step-by-Step Guide

Written by

Kevin Alster

September 26, 2025

I've helped thousands of users create their first Synthesia videos, and I've noticed the same pattern: most people jump straight into the platform without proper planning, then struggle with script timing, avatar selection, or achieving their desired outcome.

In this guide, I'll walk you through the complete process I recommend—from initial concept to final video—based on what actually works for different use cases.

Whether you're creating training content, marketing videos, or internal communications, the key to successful Synthesia videos isn't just knowing which buttons to click. It's understanding how to plan, structure, and optimize your content for maximum impact.

Recent research from University College London found that Synthesia-generated videos are as effective as traditional presenter-led videos for adult learning outcomes. Over 50,000 businesses, including more than half of the Fortune 100, have created millions of videos using this process. I've seen organizations reduce video production time by up to 75% using this workflow.

🚀 Quick start: Synthesia video in 35 minutes

Define goal + audience + single CTA (5 minutes)
Write 45-60 second script using FOCA framework (10 minutes)
Select template + apply Brand Kit + choose expressive avatar (5 minutes)
Add one visual emphasis per scene + one interaction total (10 minutes)
Quick QA check → Generate → Share link for feedback (5 minutes)

Total time: 35 minutes from concept to shareable video.

Step 0: Plan your video strategy

Before touching Synthesia, spend 10-15 minutes on strategic planning. This upfront investment saves hours of rework later.

Define your video's purpose and audience

Are you creating compliance training for new hires? A product demo for prospects? An executive update for global teams? Each requires a different approach.

Training videos need clear learning objectives and knowledge checks. Product demos combine avatar narration with screen recordings. Internal communications prioritize consistency and quick updates over complex visuals.

Set success metrics upfront

Track completion rates (aim for 80%+), engagement points, or knowledge retention scores. For marketing videos, measure click-through rates and conversions. For training, track assessment scores before and after viewing.

Determine optimal length

Based on customer data, aim for 45-90 seconds for explainers, 2-4 minutes for tutorials, and 5-7 minutes for detailed training. Shorter videos get higher completion rates, but make sure you still cover the essential information.
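As a rough planning aid, the length targets above can be turned into a word budget before you start writing. The sketch below is illustrative only: the 150 words-per-minute speaking rate is an assumption (real AI voices vary by voice and language), and `estimate_seconds`, `fits_target`, and `word_budget` are hypothetical helper names, not part of Synthesia.

```python
# Rough narration-length check.
# WORDS_PER_MINUTE is an assumed average speaking rate, not a Synthesia figure;
# adjust it after previewing a scene with the voice you actually plan to use.
WORDS_PER_MINUTE = 150

# Target ranges in seconds, taken from the length guidance above.
TARGET_SECONDS = {
    "explainer": (45, 90),
    "tutorial": (120, 240),
    "detailed_training": (300, 420),
}


def estimate_seconds(script: str, wpm: int = WORDS_PER_MINUTE) -> float:
    """Estimate spoken duration in seconds from the script's word count."""
    return len(script.split()) / wpm * 60


def fits_target(script: str, video_type: str) -> bool:
    """Return True if the estimated duration falls inside the target range."""
    low, high = TARGET_SECONDS[video_type]
    return low <= estimate_seconds(script) <= high


def word_budget(video_type: str, wpm: int = WORDS_PER_MINUTE) -> tuple[int, int]:
    """Convert a target range in seconds into a (min, max) word count."""
    low, high = TARGET_SECONDS[video_type]
    return low * wpm // 60, high * wpm // 60
```

For example, `word_budget("explainer")` gives the approximate number of words to aim for in a 45-90 second explainer at the assumed rate. Treat the result as a starting point and confirm the real duration in the preview.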

Plan for localization needs

If you'll need multiple language versions, structure your content for easy translation from the start. Use simple, clear language that translates well across cultures. Avoid idioms and culturally specific references.

Step 1: Write a strategic video script

Your script determines 80% of your video's success. I recommend following the FOCA framework: Focus (hook), Outcome (what they'll learn), Content (main message), Action (clear CTA).

📝 FOCA framework for better scripts

Focus – Start with a hook to grab attention
Outcome – Clearly state what the viewer will learn
Content – Deliver your main message concisely
Action – End with a clear call-to-action (CTA)

Tip: Aim for 2–4 short sentences per scene, and keep your tone conversational for the best results.

Structure for success

Aim for 2-4 short sentences per scene, with 12-23 scenes total for optimal pacing. Start with a strong hook in your first 5 seconds: pose a question, share a surprising statistic, or address a pain point directly.

Common script mistakes I see

The most common are long lists in narration (use on-screen text for scannable information instead), technical jargon without context, and missing or unclear CTAs.

Many users find Synthesia's AI-generated scripts helpful as starting points, but always refine them for accuracy and brand tone, especially for technical or financial content.

I recommend writing your script in a conversational tone, as if explaining to a colleague. Read it aloud before importing—if it sounds stiff spoken, it will sound worse with an AI voice.
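The structural targets from this step (2-4 short sentences per scene, 12-23 scenes in total) are easy to check before you import anything. Here is a minimal sketch, assuming your script is already split into one string per scene. `count_sentences` and `check_structure` are hypothetical helpers, and the sentence splitter is deliberately naive: abbreviations such as "e.g." will be counted as sentence breaks.

```python
import re

# Limits taken from the guidance in this step; change them to suit your format.
MIN_SENTENCES, MAX_SENTENCES = 2, 4
MIN_SCENES, MAX_SCENES = 12, 23


def count_sentences(text: str) -> int:
    """Count sentences by splitting on ., ! or ? followed by whitespace or end of text."""
    parts = re.split(r"[.!?]+(?:\s+|$)", text.strip())
    return len([p for p in parts if p])


def check_structure(scenes: list[str]) -> list[str]:
    """Return human-readable warnings; an empty list means nothing was flagged."""
    warnings = []
    if not MIN_SCENES <= len(scenes) <= MAX_SCENES:
        warnings.append(f"{len(scenes)} scenes (target {MIN_SCENES}-{MAX_SCENES})")
    for number, scene in enumerate(scenes, start=1):
        n = count_sentences(scene)
        if not MIN_SENTENCES <= n <= MAX_SENTENCES:
            warnings.append(
                f"scene {number}: {n} sentences (target {MIN_SENTENCES}-{MAX_SENTENCES})"
            )
    return warnings
```

Calling `check_structure(scenes)` on a draft returns one warning per scene that is too short or too long, plus one warning if the overall scene count is outside the range. It does not replace reading the script aloud.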

Step 2: Select your starting point (template vs. file import vs. custom)

Once your script is ready, log in to Synthesia. You have three main paths to start your video, each suited to different needs.

Option 1: Start from a template

Click 'New video' in the top right corner, then browse the 55+ pre-designed templates. Templates work best when you need professional design quickly or want inspiration for layout and transitions.

You can design your own video template and save it for future use. Set up your Brand Kit first with colors, fonts, and logos, then create a master template. This feature is particularly useful for teams creating regular video series.

If using a template, add scenes by clicking the + button on the right side of the video canvas. Select scenes that match your content flow—don't feel obligated to use every available scene type.

Synthesia dashboard showing new video creation options

Option 2: Start from a file (PDF, Word, PowerPoint, or plain text)

With Synthesia’s AI video assistant, you can import files like slides, docs, scripts, or outlines and convert them into engaging videos easily.

The AI video assistant will parse the content, turn text into natural-sounding narration, and auto-generate on-brand scenes with pacing, transitions, and relevant visuals. You can then fine-tune scripts, timing, voices, avatars, and layout section-by-section, and instantly regenerate to produce a polished video without manual editing.

Option 3: Start from scratch

Choose a blank canvas when you need complete creative control or have specific brand requirements. While this takes longer, it gives you full flexibility over every element.

Adding scenes from templates in Synthesia

Step 3: Choose the right AI avatar for your content

Avatar selection impacts viewer engagement more than you might think. Click 'Avatar' above the video canvas to browse options.

Match avatar to content type

Expressive AI avatars work best for engaging presentations and marketing content because they use natural gestures and varied facial expressions. Professional avatars suit formal training content and compliance videos where authority matters more than entertainment.

Use framing strategically

Most avatars offer waist-up and chest-up options. Use waist-up for introductions and conclusions where full presence matters. Switch to chest-up for detailed explanations where facial expressions convey important information. Mix framings between scenes to add visual variety.

Click the avatar in your canvas to access layout options. You can display the full avatar, switch to a circle view for a modern look, or remove the avatar entirely for voiceover-only scenes.

Custom avatars

For executive communications or brand consistency, create a custom avatar of yourself or your spokesperson. Go to 'Avatars' → 'Create your own avatar'. Choose between a 5-minute web avatar or studio-quality custom avatar based on your needs and budget.

Step 4: Paste script and optimize voiceover

Copy your script and paste it into the script box scene by scene. Don't dump everything into one scene—this creates pacing issues and limits your editing options.
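If your script lives in a single document, a small helper can pre-split it into scene-sized chunks before you paste. This is a sketch under simple assumptions: sentences end in `.`, `!` or `?`, and four sentences per scene is the upper limit suggested earlier. `split_into_scenes` is a hypothetical name, and the grouping is purely mechanical, so review the breaks for meaning.

```python
import re


def split_into_scenes(script: str, max_sentences: int = 4) -> list[str]:
    """Group consecutive sentences into scenes of at most max_sentences each."""
    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    sentences = [
        s.strip()
        for s in re.split(r"(?<=[.!?])\s+", script.strip())
        if s.strip()
    ]
    return [
        " ".join(sentences[i:i + max_sentences])
        for i in range(0, len(sentences), max_sentences)
    ]
```

The last scene may end up with a single leftover sentence. In that case, move a sentence across from the previous scene or merge the two by hand.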

Synthesia automatically detects your script language and suggests a matching AI voice. But don't accept the default—test 2-3 voice options with your first scene. Click the voice selector in the top right corner of the script box to browse alternatives.

Voice selection strategy

Consider your audience's preferences and content tone. Professional voices work for formal training. Conversational voices suit marketing content. For global audiences, choose voices with neutral accents.

🔧 Fix mispronunciations easily

If your avatar mispronounces technical terms or brand names, use Synthesia's pronunciation guide feature:

Type the word phonetically to guide pronunciation
Or use the IPA (International Phonetic Alphabet) tool for precise control

Access this feature from the script editing panel for flawless narration every time.

Step 5: Build engaging visuals and interactions

This step transforms your video from a talking head into an engaging experience. Focus on visual hierarchy and purposeful animations rather than adding elements for decoration.

Establish visual hierarchy

Each scene needs a clear focal point. Start with your title, add supporting text below, then include one visual element that reinforces your message. Avoid cluttering scenes—if viewers don't know where to look, they'll stop watching.

Text: Click 'Text' above the canvas. Use Title for main points, Subtitle for supporting information, Paragraph for detailed explanations. Maintain consistent positioning across scenes.

Adjust text properties in the right panel. Keep fonts consistent—use no more than two font families throughout your video. Ensure sufficient contrast for accessibility.
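"Sufficient contrast" can be checked numerically. The sketch below uses the contrast-ratio formula from the Web Content Accessibility Guidelines (WCAG 2.x), which is a general web standard rather than a Synthesia feature. Under WCAG level AA, normal-size text needs a ratio of at least 4.5:1 and large text at least 3:1. The function names are illustrative.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a colour given as '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio, from 1.0 (identical colours) to 21.0 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)
```

Pass your Brand Kit text and background colours to `contrast_ratio` and compare the result with the 4.5 threshold before you commit to a palette.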

Add supporting visuals strategically

Shapes: Click 'Shape' to add geometric elements. Use shapes to create visual containers for text or highlight important information. Adjust color, opacity, and shadows in the right panel.

Media: Click 'Media' to add images, videos, or icons. Search the stock library, generate new visuals, or upload your own. For software tutorials, combine avatar narration with screen recordings rather than static screenshots.

Screen recordings: Click 'Record' to capture your screen directly. Choose specific tabs, windows, or full screen based on your needs. Trim and loop recordings to match script timing perfectly.

Time animations with precision

Animations guide viewer attention when used purposefully. Click any element and scroll to 'Animation' in the right panel. Add enter animations to introduce key points. Use exit animations to clear space for new information.

You can synchronize animations with your script using trigger markers. Add a Marker in your script where you mention a key point, then set that marker as the animation trigger. This creates perfect audio-visual synchronization.

✨ Pro tip: synchronize animations with script

Use trigger markers in your script to perfectly sync key points with on-screen animations:

Add a Marker in your script where you mention a key idea
Set that marker as the animation trigger in the animation settings

This creates smooth, impactful audio-visual experiences for your viewers.

Add meaningful interactivity

Interactive elements boost engagement and knowledge retention. Use them sparingly but strategically—one or two purposeful interactions outperform constant clicking.

Add 'wait for click' pauses at decision points to let viewers control pacing. Create clickable hotspots for knowledge checks or branching scenarios. Note: only shapes and text can be interactive directly. To make images clickable, place a transparent shape over them and set opacity to 0%.

Scene transitions and music

Click a scene thumbnail on the left, then enable 'Scene transition' in the right panel. Choose transitions that match your content tone—professional content uses subtle fades, while marketing videos can handle dynamic transitions.

For background music, enable the 'Music' toggle in the scene settings. Choose from stock options or upload your own. Keep volume low—music should enhance, not compete with narration.

Use the 'Change all' feature in the color selector to update colors across all slides instantly. This maintains brand consistency without manual updates to every element.

Step 6: Quality check before generation

Before generating, run through this checklist. Five minutes of review saves regeneration time and credits.

Script review: One clear idea per scene? Strong hook in first 5 seconds? Explicit CTA near the end? Natural conversation flow?
Visual consistency: Consistent typography throughout? Sufficient contrast for readability? Cohesive color scheme? Aligned elements across scenes?
Timing check: Preview your video using the Play button. Do transitions feel natural? Does pacing match content complexity? Are animations synchronized with narration?
Accessibility: Clear fonts at readable sizes? High contrast between text and background? Captions enabled for hearing-impaired viewers?

Step 7: Generate and distribute your video

Click 'Generate' in the top right corner. Add a descriptive title and include automatic captions for accessibility and SEO benefits.

Generation typically takes 3-10 minutes depending on video length. You'll receive an email notification when complete.

Once generated, you have multiple distribution options. Download as MP4 for LMS upload or email attachment. Enable video sharing for a direct link—perfect for quick stakeholder reviews. Embed directly into your website or learning platform.

To share, turn on the 'Enable video sharing' option and copy the link. You can also duplicate the video to create variations for different audiences or A/B testing.

🌍 Share your video your way

Download as MP4 for LMS or email use
Enable video sharing for a direct review link
Embed videos directly on your website or learning portal
Duplicate videos to create variations for A/B testing or different audiences

Choose the option that fits your workflow for fast, effective video distribution.

Scaling your video production

Once you've mastered single video creation, scale your production efficiently:

Create reusable templates: After perfecting a video format, save it as a template. Your team can then create consistent videos 10x faster. Set up Brand Kits first to ensure all videos maintain visual consistency.
Leverage bulk features: Use template variables for personalized video series. Need 50 onboarding videos with different names? Create one template with variables, then bulk generate. For enterprise needs, explore the API for programmatic video creation.
Establish collaboration workflows: Use Synthesia Spaces for team projects. Set up approval workflows so stakeholders review before final generation. Create different workspaces for different departments or video types.
Plan for localization: Structure content for easy translation from the start. Synthesia supports 140+ languages, so take advantage of this for global reach. Create one master video, then generate versions in multiple languages efficiently.
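For the personalized-series case above, the preparation work is mostly data wrangling: one row of values per video. The sketch below turns a CSV roster into a list of jobs, each pairing a template with its variable values. Everything here is an assumption for illustration: the column names, the `template_id` placeholder, and the dictionary layout are hypothetical and are not the request format of Synthesia's API, so check the official API documentation before sending anything.

```python
import csv
import io

# Hypothetical data: in practice this would come from an HR export or a CRM.
ROSTER_CSV = """first_name,team,start_date
Ana,Support,2025-10-06
Ben,Sales,2025-10-13
"""


def build_jobs(csv_text: str, template_id: str) -> list[dict]:
    """Turn each CSV row into one job: a template reference plus variable values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {
            "template_id": template_id,  # placeholder value, not a real ID
            "title": f"Onboarding - {row['first_name']}",
            "variables": dict(row),  # one value per template variable
        }
        for row in reader
    ]


jobs = build_jobs(ROSTER_CSV, template_id="TEMPLATE_ID_PLACEHOLDER")
```

Keeping the roster in a spreadsheet and generating the job list from it means a new hire only requires a new row, not a new video project.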

Measuring success and iterating

Track these metrics to optimize future videos:

Completion rates: Aim for 80%+ for training videos, 60%+ for marketing content. If rates drop at specific points, that scene needs revision.
Engagement metrics: Monitor where viewers pause, replay, or drop off. Use this data to adjust pacing and content density.
Learning outcomes: For training videos, compare pre and post-assessment scores. Strong videos show measurable knowledge improvement.
Time to value: Track how quickly you can update videos versus traditional methods. Most users report 75% time savings, which you can use to justify expanded video programs.
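The metrics above reduce to a few simple calculations once you have the raw numbers. Here is a minimal sketch, assuming you can export view counts and assessment scores from your analytics or LMS. The function names and input shapes are hypothetical, and the targets are the ones listed above.

```python
# Completion targets from the list above.
TARGETS = {"training": 0.80, "marketing": 0.60}


def completion_rate(completed_views: int, total_views: int) -> float:
    """Share of views that reached the end of the video."""
    return completed_views / total_views if total_views else 0.0


def meets_target(rate: float, video_type: str) -> bool:
    """Compare a completion rate with the target for this video type."""
    return rate >= TARGETS[video_type]


def biggest_drop(viewers_per_scene: list[int]) -> int:
    """Return the 1-based number of the scene during which the most viewers left.

    viewers_per_scene[i] is the number of viewers who started scene i + 1.
    """
    if len(viewers_per_scene) < 2:
        raise ValueError("need viewer counts for at least two scenes")
    drops = [
        before - after
        for before, after in zip(viewers_per_scene, viewers_per_scene[1:])
    ]
    return drops.index(max(drops)) + 1


def score_gain(pre: list[float], post: list[float]) -> float:
    """Average post-assessment score minus average pre-assessment score."""
    return sum(post) / len(post) - sum(pre) / len(pre)
```

For example, if 100 viewers start scene 1, 95 start scene 2, 60 start scene 3, and 58 start scene 4, `biggest_drop` points to scene 2 as the first candidate for revision.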

One key advantage of Synthesia: easy content updates. When information changes, update the script and regenerate in 30 minutes rather than reshooting. This agility enables you to keep content current and relevant.

Ready to make Synthesia videos that deliver results?

You now have the complete framework for creating effective Synthesia videos. The key isn't perfection on your first attempt—it's starting with a clear plan and improving based on viewer feedback.

Remember: successful videos solve specific problems for specific audiences. Focus on your viewer's needs, keep your message clear, and let Synthesia handle the technical complexity.

Want to dive deeper into video strategy? Check out our FOCA video framework for advanced techniques. And if you're ready to start creating but don't have an account yet, explore our plans to find the right fit for your needs.

About the author

Strategic Advisor

Kevin Alster

Kevin Alster heads up the learning team at Synthesia. He is focused on building Synthesia Academy and helping people figure out how to use generative AI video in the enterprise. His work in the tech industry draws on a decade of experience in the education sector and on various roles in which he has used emerging technology to augment communication and creativity through video. He has developed enterprise and branded learning solutions at organizations such as General Assembly, The School of The New York Times, and Sotheby's Institute of Art.

Frequently asked questions

How do I use Synthesia to create a video from start to finish?

Creating a video in Synthesia follows a simple workflow that takes about 35 minutes from concept to shareable video. Start by defining your goal, audience, and call-to-action, then write a 45-60 second script using the FOCA framework (Focus, Outcome, Content, Action). Next, log in to Synthesia and choose whether to start from a template, import a file, or create from scratch. Select an AI avatar that matches your content tone, paste your script scene by scene, and add visual elements like text, shapes, or screen recordings to support your message.

Once your content is ready, run a quick quality check for script flow, visual consistency, and timing before clicking 'Generate' in the top right corner. The platform will process your video in 3-10 minutes, after which you can download it as an MP4, share via direct link, or embed it on your website. This streamlined process eliminates the need for cameras, microphones, or video editing skills while delivering professional results that engage your audience.

Can I import a PowerPoint, PDF, or Word file to turn it into a Synthesia video?

Yes, Synthesia's AI video assistant can transform your existing PowerPoint presentations, PDFs, Word documents, and plain text files directly into engaging videos. Simply click 'New video' and select the file import option, then upload your document. The AI assistant automatically parses your content, converts text into natural-sounding narration, and generates on-brand scenes with appropriate pacing, transitions, and relevant visuals based on your content.

After the initial generation, you have full control to fine-tune every aspect of your video. Adjust scripts, timing, voices, avatars, and layout for each scene individually, then regenerate instantly to see your changes. This feature is particularly valuable for teams who already have training materials, presentations, or documentation they want to transform into more engaging video content without starting from scratch.

How do I choose the right AI avatar for my video, and can I create a custom avatar of myself?

Selecting the right avatar depends on your content type and audience expectations. For engaging presentations and marketing content, choose expressive AI avatars that use natural gestures and varied facial expressions to maintain viewer interest. For formal training or compliance videos where authority matters more than entertainment, professional avatars work best. You can also vary avatar framing between waist-up for introductions and chest-up for detailed explanations to add visual variety and emphasize different types of content.

If you need consistent brand representation or want executives to deliver messages personally, you can create custom avatars. Navigate to 'Avatars' then 'Create your own avatar' to choose between a 5-minute web avatar for quick needs or a studio-quality custom avatar for premium results. Custom avatars are particularly valuable for executive communications, brand consistency across video series, or when you need the same spokesperson to deliver regular updates without scheduling repeated recording sessions.

Does Synthesia support multiple languages, and how should I plan for localization?

Synthesia supports over 140 languages with AI voices that automatically match your script language, making it ideal for global teams and international audiences. When planning for localization, structure your content from the start with translation in mind by using simple, clear language that translates well across cultures and avoiding idioms or culturally specific references. The platform automatically detects your script language and suggests appropriate voices, though you should test 2-3 voice options to find the best match for your audience's preferences.

To create multilingual versions efficiently, develop one master video with your primary language, then duplicate it and replace the script with translations. This approach maintains consistent visuals and timing while allowing you to generate versions in multiple languages within minutes rather than hours. Many organizations use this capability to ensure training materials, product updates, and company communications reach their entire global workforce in their preferred language, significantly improving engagement and comprehension.

Can I try Synthesia for free before choosing a plan?

Yes, Synthesia offers a free AI video generator that lets you create and experience the platform before committing to a paid plan. You can access this by clicking 'Create free AI video' on the website, which allows you to test core features like avatar selection, script input, and basic video generation without providing credit card information. This gives you hands-on experience with the interface and helps you understand how Synthesia can meet your specific video creation needs.

The free trial is particularly useful for evaluating video quality, testing different avatars and voices, and understanding the workflow before making a purchase decision. You can create a complete video to share with stakeholders and get buy-in for larger video initiatives. Once you're ready to scale your video production with additional features like custom avatars, advanced templates, and team collaboration tools, you can explore the various pricing plans that match your organization's needs and video volume requirements.
