Remotion Internal Architecture — The Pipeline from Frames to Video
Dissecting the rendering pipeline: Headless Chrome → screenshots → FFmpeg encoding
Remotion's approach to converting React code into video is a creative combination of standard web technologies.
First, Webpack or esbuild builds React components into a browser-executable bundle. This bundle includes video metadata (fps, resolution, total frame count).
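The metadata carried in the bundle can be pictured as a plain object. The field names below mirror Remotion's `<Composition>` props; the interface itself is an illustrative sketch, not Remotion's internal type:

```typescript
// Illustrative shape of the video metadata baked into the bundle.
// Field names mirror Remotion's <Composition> props; the interface
// itself is a sketch, not Remotion's internal type.
interface CompositionMetadata {
  fps: number;
  width: number;
  height: number;
  durationInFrames: number;
}

const meta: CompositionMetadata = {
  fps: 30,
  width: 1920,
  height: 1080,
  durationInFrames: 150, // 5 seconds at 30 fps
};

// The video's duration in seconds follows directly from the metadata.
const durationInSeconds = meta.durationInFrames / meta.fps; // 5
```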
Next, Puppeteer launches Headless Chrome, loads the bundled page, and steps the frame number from 0 to the last frame, screenshotting each frame as PNG/JPEG. The delayRender()/continueRender() API can hold off a screenshot until async work (API calls, font loading, etc.) completes.
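The mechanism behind delayRender()/continueRender() can be sketched as a handle counter: the renderer only screenshots once every outstanding handle has been released. This is a minimal re-implementation for illustration, not Remotion's actual source:

```typescript
// Minimal sketch of the delayRender/continueRender handshake.
// Each delayRender() call returns a handle; the page counts as
// "ready to screenshot" only once every handle has been continued.
const pending = new Set<number>();
let nextHandle = 0;

function delayRender(): number {
  const handle = nextHandle++;
  pending.add(handle);
  return handle;
}

function continueRender(handle: number): void {
  pending.delete(handle);
}

function readyToScreenshot(): boolean {
  return pending.size === 0;
}

// Usage: block the screenshot until async data has arrived.
const handle = delayRender();
// ... fetch data, load fonts, etc. ...
continueRender(handle);
```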
Finally, FFmpeg takes the image sequence and encodes it to H.264 (MP4) or VP8/VP9 (WebM). If there's an audio track, it's muxed together.
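The encoding step boils down to one FFmpeg invocation over the numbered image sequence. Assembling the arguments in Node might look like the sketch below; these are standard FFmpeg flags, and the exact flags Remotion passes may differ:

```typescript
// Sketch of FFmpeg arguments for turning a numbered PNG sequence plus
// an audio track into an H.264 MP4. Standard FFmpeg flags; Remotion's
// actual invocation may differ.
function buildFfmpegArgs(
  fps: number,
  framePattern: string, // e.g. "frame-%05d.png"
  audioPath: string,
  outPath: string
): string[] {
  return [
    "-framerate", String(fps), // input frame rate of the image sequence
    "-i", framePattern,        // the screenshot sequence
    "-i", audioPath,           // audio track to mux in
    "-c:v", "libx264",         // H.264 video codec
    "-pix_fmt", "yuv420p",     // widely compatible pixel format
    "-c:a", "aac",             // encode audio as AAC
    outPath,
  ];
}

const args = buildFfmpegArgs(30, "frame-%05d.png", "audio.wav", "out.mp4");
// Would be executed with e.g. spawn("ffmpeg", args)
```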
Parallel rendering is also supported. The --concurrency option opens multiple Chrome tabs at once to distribute frame processing across them.
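Splitting the frame range across workers is straightforward. Below is a sketch of one chunking scheme (not necessarily Remotion's exact partitioning strategy):

```typescript
// Split [0, totalFrames) into contiguous inclusive ranges, one per
// worker/tab. A sketch of how frames might be distributed; Remotion's
// actual partitioning strategy may differ.
function splitFrames(
  totalFrames: number,
  concurrency: number
): Array<[number, number]> {
  const chunks: Array<[number, number]> = [];
  const base = Math.floor(totalFrames / concurrency);
  let start = 0;
  for (let i = 0; i < concurrency; i++) {
    // Spread the remainder across the first few chunks.
    const size = base + (i < totalFrames % concurrency ? 1 : 0);
    if (size > 0) chunks.push([start, start + size - 1]);
    start += size;
  }
  return chunks;
}

// 150 frames over 4 workers → [0,37], [38,75], [76,112], [113,149]
```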
How It Works
1. Webpack/esbuild bundles React components + metadata (fps, width, height, durationInFrames)
2. Puppeteer launches a Headless Chrome instance and loads the bundled HTML page
3. Iterate from frame 0 to N: inject currentFrame → React re-render → screenshot (PNG/JPEG)
4. If delayRender() is active, wait for continueRender() before screenshotting (async safety)
5. FFmpeg encodes the image sequence + audio track to H.264/VP9 for final MP4/WebM output
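The middle of that pipeline amounts to a frame loop. Here is a schematic version against a minimal page interface; the method names are illustrative stand-ins, not Puppeteer's or Remotion's actual API:

```typescript
// Schematic frame loop for the screenshot phase. `Page` is a stand-in
// for a Puppeteer page; the method names are illustrative.
interface Page {
  setFrame(frame: number): Promise<void>; // inject currentFrame → React re-renders
  isReady(): Promise<boolean>;            // false while delayRender() handles are open
  screenshot(): Promise<Uint8Array>;      // capture the rendered frame as PNG/JPEG bytes
}

async function renderFrames(
  page: Page,
  totalFrames: number
): Promise<Uint8Array[]> {
  const frames: Uint8Array[] = [];
  for (let frame = 0; frame < totalFrames; frame++) {
    await page.setFrame(frame);
    // Wait out any pending delayRender() handles before capturing.
    while (!(await page.isReady())) {
      await new Promise((resolve) => setTimeout(resolve, 10));
    }
    frames.push(await page.screenshot());
  }
  return frames; // handed to FFmpeg as an image sequence
}
```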
Pros
- ✓ Uses only web standard tech: proven combination of Chrome + FFmpeg
- ✓ Everything the browser can render (CSS/SVG/Canvas/WebGL) becomes a video frame
- ✓ delayRender guarantees async data completion → prevents blank frames in data-driven videos
Cons
- ✗ Chrome dependency: cannot render in environments without Headless Chrome
- ✗ Memory usage: holding full-HD frames in memory means high RAM consumption for long videos
- ✗ Complex debugging: hard to pinpoint whether rendering errors occur in Puppeteer/Chrome/FFmpeg
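The memory concern can be quantified with a back-of-the-envelope calculation, assuming uncompressed RGBA frames; actual usage depends on image compression and on how many frames are held in memory at once:

```typescript
// Back-of-the-envelope RAM estimate for holding uncompressed RGBA frames.
// Real usage differs: PNG/JPEG frames are compressed, and renderers
// typically stream frames to disk rather than keep them all in memory.
const width = 1920;
const height = 1080;
const bytesPerPixel = 4; // RGBA
const bytesPerFrame = width * height * bytesPerPixel; // 8,294,400 ≈ 7.9 MiB

const fps = 30;
const seconds = 60;
const totalBytes = bytesPerFrame * fps * seconds;
const totalGiB = totalBytes / 1024 ** 3; // ≈ 13.9 GiB for one minute of footage
```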