Fully Autonomous AI Animation Studio (Solving Character Consistency & FFmpeg Jitter)

Hey everyone,

I recently built a completely zero-touch, autonomous media pipeline inside n8n and wanted to share the architecture with the community.
The goal was to build a faceless AI YouTube channel that actually maintains 100% character visual consistency (which is usually the biggest failure point in AI video generation).

:gear: The Stack:

  • n8n (Core orchestration)
  • Telegram Bot (Trigger & Final Delivery)
  • Gemini 3.1 Multimodal (Director & Image Translation)
  • OpenRouter (Unified API billing for generation & TTS)
  • Python / FFmpeg (Video Compiler)

:brain: How the “Brain” works:
Instead of using text prompts to generate characters (which always hallucinates), the n8n Agent receives a base static image of the character. It forces Gemini 3.1 to act strictly as an image-translator, altering only the facial expressions and environment based on scraped news data, leaving the core 3D geometry untouched.

:flexed_biceps: The “Muscle” (Bypassing FFmpeg Jitter):
If you’ve automated video with FFmpeg, you know the zoompan filter creates a terrible, jittery sub-pixel mess. I built a Python node to execute an Oversampling bypass: It scales the raw generated image up to 4K, runs the camera movement on the massive pixel density, and downscales it back to 1080p. The result is buttery smooth.

:television: I recorded a full teardown of the nodes and the final rendered output here:(https://www.youtube.com/watch?v=qRJN9VVqy0g)

:package: Here is the raw AI Director. If you want the complete, connected workspace with all the sub-agents and routing, I’ve put the full JSON file here for $0:
( Project Stickman: Autonomous AI Studio (n8n Workflow) )

Let me know if you have any questions about the data synthesizer agent or the FFmpeg logic. Would love to hear how you guys handle complex media rendering in n8n!

1 Like