How much energy does it take to make a corporate video with AI?



When people think of generative AI, energy efficiency is not the first tagline that comes to mind. While the rising energy consumption of LLMs has given the industry a bad reputation when it comes to sustainability, we wanted to see if the data told a different story for AI-generated video.
Our findings presented below show that AI video emits much less carbon than producing a video the traditional way, with cameras, lights, microphones and all the other conventional requirements associated with content made in a studio or on location. Recent advances in video generation such as diffusion have transformed the production of a video into an entirely digital process that produces similar results to recordings of the physical world filmed on a camera. As a consequence, the carbon-intensive physical shoots, which include travel to locations or setting up a studio, alongside the usage of energy-hungry recording and post-processing equipment, can be swapped for energy-efficient software running in data centers that are powered mostly by renewable energy.
Therefore, for companies that aim to reduce their carbon footprint and communicate more effectively with their employees or customers using video, generative AI now provides a way to achieve both.
To show how this works in practice, we have analyzed Synthesia’s carbon footprint - including available scope 1, 2 and 3 emissions data - and are releasing a benchmark to help companies compare the emissions related to making a video with AI and using traditional means of production.
The energy demands of AI models
Over the past year, public perception about the environmental impact of generative AI has really soured. One of the most quoted statistics came from a Goldman Sachs report which found that a ChatGPT prompt consumes 10x more energy than a Google search.
Alarm bells rang, rightly so, among business leaders. With ambitious sustainability targets high on board agendas, companies became concerned about generative AI’s energy demands and related carbon footprint. The consensus narrative quickly became: all generative AI is bad for the environment.
That narrative, however, lacks nuance. While data from the International Energy Agency did suggest that LLM queries consume much more energy than an average Google search, this finding should not be generalised to formats beyond generative text.
When assessing the environmental impact of other AI-generated formats, we need to consider the emissions impact of the alternative means of production being replaced. In the case of AI video, we can show there is a clear environmental benefit compared to a traditional video production process.
Carbon-intensive physical production
To fully understand generative AI’s footprint-reducing potential for video, let’s first look at the status quo: video production in the real world. Between lighting, catering, sets, location scouting and travel, these productions consume enormous resources.
According to a report from the Sustainable Production Alliance, a single big budget movie emits 3370 metric tons of CO2 (MTCO2e), about the same as driving 8,581,906 miles with a gasoline-powered car.
And many more than one are produced each year. In London alone, the screen production industry produces approximately the same amount of emissions as 24,000 London homes in a year.
Generating a minute of AI video: less than boiling a kettle of water?
In contrast, AI video generation happens in the cloud; there are no big filming sets, no global flights for location shoots, no weeks of on-location electricity usage. By generating video with computers rather than relying on physical shoots, videos generated with Synthesia have a much smaller footprint.
By analysing our available emissions data from the cloud provider that Synthesia uses to host its platform and the travel management software we use to book corporate travel, we were able to compile our total emissions for the past two years:
- Total 2023 emissions: 0.7 MTCO2e
- Total 2024 emissions: 0.759 MTCO2e
In these two years, Synthesia was used to create millions of videos, adding up to a total of 136,120 hours of video generated. Using these data points, we were then able to estimate the average carbon emitted per minute of video generated.
For a sense of scale, let’s compare with the most British household activity of them all, boiling a kettle of water to make tea:
- Generating the energy required to boil a kettle results in around 0.05 kg of CO2e
- Generating a minute of video using Synthesia averages around 0.00025 kg of CO2e
This makes video generation in Synthesia about 200 times more carbon efficient than putting the kettle on, and - as we’ll see below - 160,000 times more carbon efficient than filming a video the traditional way.
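As a rough sanity check, the arithmetic behind these comparisons can be reproduced from the figures quoted in this article. The sketch below is not Synthesia's actual methodology, which is not spelled out here. It assumes that MTCO2e means metric tons of CO2e, that the 136,120 hours cover both years combined, and it borrows the low budget benchmark of 2.4 MTCO2e/hour given later in the article. All variable and function names are illustrative.

```python
# Figures quoted in the article (MTCO2e read as metric tons of CO2e).
TOTAL_EMISSIONS_T = {"2023": 0.7, "2024": 0.759}
HOURS_GENERATED = 136_120        # hours of video over both years
KETTLE_KG = 0.05                 # kg CO2e to boil a kettle
AI_KG_PER_MIN = 0.00025          # article's rounded per-minute figure
LOW_BUDGET_T_PER_HOUR = 2.4      # low budget traditional benchmark

KG_PER_TONNE = 1000
MIN_PER_HOUR = 60


def naive_kg_per_minute(total_tonnes: float, hours: float) -> float:
    """Straight division of total emissions by minutes of video.

    Assumption: the article may apply adjustments that are not described,
    so this is only an order-of-magnitude check.
    """
    return total_tonnes * KG_PER_TONNE / (hours * MIN_PER_HOUR)


naive = naive_kg_per_minute(sum(TOTAL_EMISSIONS_T.values()), HOURS_GENERATED)

# Convert the low budget benchmark from tonnes/hour to kg/minute.
low_budget_kg_per_min = LOW_BUDGET_T_PER_HOUR * KG_PER_TONNE / MIN_PER_HOUR

# How many times more CO2e each alternative emits than a minute of AI video.
kettle_ratio = KETTLE_KG / AI_KG_PER_MIN
traditional_ratio = low_budget_kg_per_min / AI_KG_PER_MIN

print(f"naive per-minute estimate: {naive:.6f} kg CO2e")
print(f"low budget traditional:    {low_budget_kg_per_min:.1f} kg CO2e per minute")
print(f"kettle vs AI minute:       {kettle_ratio:,.0f}x")
print(f"traditional vs AI minute:  {traditional_ratio:,.0f}x")
```

A straight division of the two-year total by the minutes generated lands at roughly 0.00018 kg of CO2e per minute, the same order of magnitude as the rounded 0.00025 kg figure quoted above. The 200x and 160,000x ratios follow directly from the quoted per-minute numbers.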
Comparing AI video and traditional production
To help businesses estimate the carbon impact of their video needs, we have compiled industry benchmarks on traditional production into the following three categories:
- High budget: complex, large scale productions, such as a Super Bowl ad or a Hollywood movie. Based on industry benchmarks for entertainment and drama productions, we estimate average emissions of 30.5 MTCO2e/hour
- Medium budget: complex productions that take less time to complete, for example a product marketing video. We estimate average emissions of 13.1 MTCO2e/hour
- Low budget: productions that are less resource intensive, such as a compliance training video. We estimate average emissions of 2.4 MTCO2e/hour which means about 40 kg of CO2e per minute.
For each category, we assumed the most conservative industry benchmark available for the types of productions that are comparable in terms of resource intensity. For example, for low budget productions we used the lowest available benchmark from genres such as short documentaries.
Assuming the low budget emissions benchmark, if Synthesia clients had used traditional means to produce the videos generated in 2024, it would have led to an additional 215,712 metric tons of CO2 being released into the atmosphere. That’s about the same as saving the emissions of 42,086 UK homes in 2024 alone.
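For readers who want to run the same kind of estimate on their own video volumes, here is a minimal sketch based on the three benchmarks above. The function name and the example input are hypothetical. The implied 2024 video hours and the per-home figure are back-calculated from the numbers in this article rather than stated in it, so treat them as derived assumptions.

```python
# Traditional production benchmarks from the article, in metric tons CO2e per hour.
BENCHMARKS_T_PER_HOUR = {"high": 30.5, "medium": 13.1, "low": 2.4}


def traditional_emissions_tonnes(video_hours: float, budget: str = "low") -> float:
    """Estimate emissions if the same hours were produced traditionally."""
    return video_hours * BENCHMARKS_T_PER_HOUR[budget]


# Figures quoted in the article for 2024.
AVOIDED_TONNES_2024 = 215_712
UK_HOMES_EQUIVALENT = 42_086

# Derived, not stated in the article: hours of video implied by the low
# budget benchmark, and the per-home emissions implied by the comparison.
implied_hours_2024 = AVOIDED_TONNES_2024 / BENCHMARKS_T_PER_HOUR["low"]
tonnes_per_home = AVOIDED_TONNES_2024 / UK_HOMES_EQUIVALENT

print(f"implied 2024 video hours: {implied_hours_2024:,.0f}")
print(f"implied tonnes per UK home: {tonnes_per_home:.2f}")

# Hypothetical example: 10 hours of medium budget content.
print(f"10 h, medium budget: {traditional_emissions_tonnes(10, 'medium'):.1f} t CO2e")
```

Under these assumptions, the 215,712 ton figure corresponds to roughly 89,880 hours of video at the low budget rate, and the home comparison implies a little over 5 tons of CO2e per UK home per year.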
It’s important to mention, however, that traditional production isn’t sitting still. From LED lighting to virtual production stages, the industry is finding ways to reduce its impact. But even with these innovations, physical productions will struggle to match the carbon profile of AI-generated video.
Towards more inclusive and sustainable video production
AI video doesn’t just save carbon; it also lowers barriers to entry, allowing marketers, educators, and organisations to communicate their ideas with video. And while no technology is entirely carbon-free, the trajectory is clear: as cloud providers continue to invest in efficiency, circular economy practices and low-carbon energy, the emissions gap between AI video and traditional production will likely grow.
Appendix: key benchmarks on production emissions
- Emissions from a single 1hr scripted TV episode: 77 MTCO2e (Sustainable Production Alliance 2021 report)
- Industry headline average emissions: 16.6 tCO2e/hr (albert 2023 Annual Review)
- Average emissions from news productions: 3.27 tCO2e/hr (albert 2023 Annual Review)
- Average emissions from drama productions: 48.7 tCO2e/hr (albert 2023 Annual Review)
- Emissions from a documentary film: 2.4 tCO2e/hr (Oeko Institute, 2022)
About the author
Daniel Verten, Strategy Partner at Synthesia, the largest AI video platform for the enterprise.