2026 · Gen AI · Pipeline architecture · Tessact AI

A multi-stage avatar pipeline, behind a dedicated service.

Inputs in, reusable avatar identity and finished videos out. The hard part was not calling any single model. It was coordinating a long-running, multi-provider media pipeline where each stage has different latency, cost, failure behavior, and review semantics.

Role: Engineer · pipeline + service boundary
Team: Gen AI service
Stack: FastAPI, Python, Postgres, RabbitMQ
Surface: Internal job APIs, callbacks
Status: Shipping
Sanitized: Yes
01 Overview

A small set of inputs becomes a reusable avatar identity and generated video output.

The workflow validates input media, builds a consistent character identity, generates speech, plans visuals, renders scene assets, aligns audio and video, and produces a stitched final clip. I designed it around explicit contracts, durable stage state, resumable execution, and provider adapters, so the product could support creative iteration without coupling the core backend to AI execution.

Workflow capabilities (9)
  • Avatar creation
  • Voice cloning
  • Script-driven generation
  • Storyboard planning
  • Background generation
  • Pose & b-roll branches
  • Lip sync
  • Stitching
  • Final enhancement
02 Product problem

Avatar video combines several expensive, failure-prone steps.

Each video involves validating media, building character identity, synthesizing speech, planning visuals, rendering scene assets, aligning audio and video, and stitching the final output. Originally, much of this lived inside the product backend.

Before

AI execution lived inside the product backend.

  • Pipeline iteration coupled to product release.
  • Provider swaps touched product models.
  • Retries were one-off background tasks.
  • Failure recovery meant rerunning the whole pipeline.
After

A dedicated Gen AI service owns execution.

  • Product backend submits validated snapshots.
  • Service runs the media pipeline.
  • No imports of product models, no direct DB writes.
  • Failure localized to a single stage.
03 My role
01·

Versionable job contracts

For avatar creation, output generation, retries, approvals, and voice cloning.

02·

Pipeline as persisted state

Stages modeled as state transitions instead of one-off background tasks.

03·

Provider adapters

Image, video, audio, storyboard, background, b-roll, lip sync, and enhancement behind capability interfaces.

04·

Lifecycle events

Structured accepted, started, progress, completed, and failed events back to the product backend.

05·

Resumable execution

Targeted resume from a named failed or user-approved stage, without rerunning upstream work.

06·

Contract & pipeline tests

Around routing, idempotency, persistence, callback delivery, and output shape.

04 System design

A standalone service, durable state, and queue-backed workers.

Requests enter through internal job APIs, normalize into payload snapshots, persist with idempotency guarantees, and dispatch to background workers. The product backend stays the source of truth for users, workspaces, and permissions. The Gen AI service operates on immutable snapshots.

Source of truth: Product backend
Owns orchestration: Gen AI service
Boundary: Internal job contracts
Service boundary: internal only
Service A · API

Product backend

  • Users & workspaces
  • Permissions
  • Public API
  • Job records
  • Response shapes
Service B · This service

Gen AI service

  • Job APIs (idempotent)
  • Persistence + snapshots
  • Worker queues
  • Pipeline modules
  • Provider adapters
Job contract: Validated snapshot, idempotency key, correlation metadata.
Lifecycle events: accepted · started · progress · completed · failed.
Inside Service B: workers · adapters
  • Avatar jobs
  • Marketing
  • Callbacks
  • Maintenance
  • Retry
05 Avatar creation flow

The identity is an asset graph, not a single file.

Later outputs reference a stable neutral image, a character sheet, and an optional voice profile. Each stage records its inputs and outputs, making retries deterministic for operations while still allowing creative regeneration where appropriate.

  1. Validate: User-provided input images checked.
  2. Neutral image: Consistent avatar baseline generated.
  3. Character sheet: Visual consistency reference built.
  4. Persist: Assets written through storage adapter.
  5. Emit: Progress and completion to product backend.
Targeted retry: A failed neutral image stage can rerun without repeating validation. A character sheet can regenerate independently when the upstream avatar identity is already available.
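A minimal sketch of how a targeted retry over these five stages could work, assuming each stage's output is stored under its name. The stage names come from the flow above; `run_pipeline`, the handler signature, and the in-memory `outputs` dict (standing in for durable stage records) are hypothetical.

```python
from typing import Callable, Dict, List, Optional

# Stage names follow the avatar creation flow; the handler signature is hypothetical.
STAGES: List[str] = ["validate", "neutral_image", "character_sheet", "persist", "emit"]
StageFn = Callable[[dict], dict]


def run_pipeline(
    handlers: Dict[str, StageFn],
    outputs: Dict[str, dict],
    resume_from: Optional[str] = None,
) -> Dict[str, dict]:
    """Run stages in order, reusing stored outputs for everything upstream of resume_from."""
    start = STAGES.index(resume_from) if resume_from else 0

    # Upstream stages are not rerun, so their stored outputs must already exist.
    context: dict = {}
    for name in STAGES[:start]:
        if name not in outputs:
            raise RuntimeError(
                f"cannot resume from {resume_from!r}: no stored output for {name!r}"
            )
        context.update(outputs[name])

    for name in STAGES[start:]:
        result = handlers[name](dict(context))  # each stage sees a copy of upstream outputs
        outputs[name] = result                  # stands in for a durable write
        context.update(result)
    return outputs
```

Calling `run_pipeline(handlers, outputs, resume_from="character_sheet")` would rerun only the character sheet, persist, and emit stages, and would refuse to start if validation or the neutral image had never completed.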
06 Avatar output flow

A branching DAG with shared stage primitives.

Single-scene, multi-scene, b-roll, pose-driven, and approval-driven paths share common stages but route differently based on payload flags and existing approved assets. Each stage exposes useful state to the frontend, such as generating audio, building storyboard, or stitching video, and gives support a precise stage to retry from.

Five routes share these primitives:

  • single-scene
  • multi-scene
  • b-roll
  • pose-based
  • approval-gated
Output pipeline (diagram legend: critical join, stage)
Payload (snapshot in) · Script context · Storyboard plan · Speech + timing · Backgrounds · Scene assets · B-roll / pose · Stitch · Lip sync · Enhance · Output (media + scenes)
Stages: 11
Branches: 5
Resume points: any failed stage
07 Technical highlights
contract v3
job_type: "avatar_output"
idempotency_key: "af3c…"
avatar_id: "av_817"
script: "…"
mode: "multi_scene"
approved: { bg, story }
retry: null

Contract-first execution

Jobs carry strict payload snapshots: avatar identity, script, output settings, approved assets, storyboard context, retry context.
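As an illustration only, a strict job contract like the one shown above could be modeled as a frozen dataclass that rejects malformed payloads at the boundary. The field names mirror the contract card; `RetryContext`, its fields, `ALLOWED_MODES`, and `parse_job` are hypothetical names, and `"single_scene"` is an assumed mode value.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# "multi_scene" appears in the contract card; "single_scene" is an assumed value.
ALLOWED_MODES = {"single_scene", "multi_scene"}


@dataclass(frozen=True)
class RetryContext:
    # Hypothetical shape: where to resume, and whether to rerun later stages.
    resume_from: str
    rerun_downstream: bool = True


@dataclass(frozen=True)
class AvatarOutputJob:
    idempotency_key: str
    avatar_id: str
    script: str
    mode: str
    approved: FrozenSet[str] = frozenset()
    retry: Optional[RetryContext] = None
    job_type: str = "avatar_output"
    contract_version: int = 3

    def __post_init__(self) -> None:
        # Validate at construction so workers never see a malformed job.
        if not self.idempotency_key:
            raise ValueError("idempotency_key is required")
        if self.mode not in ALLOWED_MODES:
            raise ValueError(f"unknown mode: {self.mode!r}")


def parse_job(payload: dict) -> AvatarOutputJob:
    """Build an immutable job from an incoming payload dict."""
    retry = payload.get("retry")
    return AvatarOutputJob(
        idempotency_key=payload["idempotency_key"],
        avatar_id=payload["avatar_id"],
        script=payload["script"],
        mode=payload["mode"],
        approved=frozenset(payload.get("approved", ())),
        retry=RetryContext(**retry) if retry else None,
    )
```

Freezing the dataclass means a worker cannot mutate the job mid-run, which matches the idea of executing from a snapshot.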

validate: ok ×1
neutral_image: ok ×1
character_sheet: running ×1
persist: pending ×0
emit: pending ×0

Stage-level observability

Each stage persists its input, output, status, attempts, and errors, giving an audit trail without exposing provider internals.
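One way to picture a per-stage record is a small state machine that counts attempts and keeps a structured error. This is a sketch, not the production model: the `pending`, `running`, and `ok` statuses appear in the status card above, while `failed`, the transition table, and the method names are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class StageStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    OK = "ok"
    FAILED = "failed"


# Allowed transitions; OK -> RUNNING models creative regeneration (an assumption).
TRANSITIONS = {
    StageStatus.PENDING: {StageStatus.RUNNING},
    StageStatus.RUNNING: {StageStatus.OK, StageStatus.FAILED},
    StageStatus.FAILED: {StageStatus.RUNNING},
    StageStatus.OK: {StageStatus.RUNNING},
}


@dataclass
class StageRecord:
    name: str
    status: StageStatus = StageStatus.PENDING
    attempts: int = 0
    input: Optional[dict] = None
    output: Optional[dict] = None
    error: Optional[dict] = None

    def _move(self, new: StageStatus) -> None:
        if new not in TRANSITIONS[self.status]:
            raise ValueError(f"{self.name}: {self.status.value} -> {new.value} not allowed")
        self.status = new

    def start(self, stage_input: dict) -> None:
        self._move(StageStatus.RUNNING)
        self.attempts += 1
        self.input, self.output, self.error = stage_input, None, None

    def succeed(self, output: dict) -> None:
        self._move(StageStatus.OK)
        self.output = output

    def fail(self, code: str, message: str) -> None:
        self._move(StageStatus.FAILED)
        # Structured error only; raw provider responses stay out of the record.
        self.error = {"code": code, "message": message}
```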

Branch-aware orchestration

The same output API routes to script generation, multi-scene storyboarding, background regeneration, or approval flows.
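A rough sketch of branch-aware planning: one function turns payload flags and approved assets into the list of stages to run. The stage identifiers, the `use_broll` flag, the stage order, and the rule that approved assets skip their generating stage are all assumptions made for illustration.

```python
from typing import FrozenSet, List


def plan_stages(
    mode: str,
    approved: FrozenSet[str] = frozenset(),
    use_broll: bool = False,
) -> List[str]:
    """Pick the stages for one output job from its flags and approved assets."""
    stages = ["script_context"]
    # Storyboarding only applies to multi-scene jobs without an approved storyboard.
    if mode == "multi_scene" and "story" not in approved:
        stages.append("storyboard_plan")
    stages.append("speech_timing")
    # Approved backgrounds are reused rather than regenerated.
    if "bg" not in approved:
        stages.append("backgrounds")
    stages.append("scene_assets")
    if use_broll:
        stages.append("broll")
    stages += ["lip_sync", "stitch", "enhance"]
    return stages
```

Keeping the routing in one pure function makes each branch easy to cover in a table-driven test, since no provider call is involved.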

orchestration
image · video · speech · lip-sync · storyboard · enhance
provider adapter interface

Provider portability

Orchestration calls capability-oriented adapters. Model and provider changes happen behind stable interfaces.
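To make the adapter idea concrete, here is a small sketch using `typing.Protocol` for a single capability. `SpeechAdapter`, `ProviderA`, `ProviderB`, `Adapters`, and the `asset://` references are hypothetical; no real provider API is represented.

```python
from typing import Protocol


class SpeechAdapter(Protocol):
    """Capability interface: orchestration depends on this, not on a provider SDK."""

    def synthesize(self, text: str, voice_id: str) -> str:
        """Return an asset reference for the generated audio."""
        ...


class ProviderA:
    def synthesize(self, text: str, voice_id: str) -> str:
        return f"asset://provider-a/{voice_id}/{len(text)}"


class ProviderB:
    def synthesize(self, text: str, voice_id: str) -> str:
        return f"asset://provider-b/{voice_id}/{len(text)}"


class Adapters:
    """Holds one adapter per capability; only speech is shown here."""

    def __init__(self, speech: SpeechAdapter) -> None:
        self.speech = speech


def generate_speech(adapters: Adapters, script: str, voice_id: str) -> str:
    # Orchestration code is identical regardless of which provider is wired in.
    return adapters.speech.synthesize(script, voice_id)
```

Swapping `Adapters(ProviderA())` for `Adapters(ProviderB())` changes the provider without touching `generate_speech`.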

[Timeline figure: five scenes, s1 to s5, starting at 0:00]

Media timeline handling

Storyboard scenes carry timing, transcript, pose, and visual metadata so downstream assembly can reason about continuity.
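A sketch of how scene timing metadata might be laid out for assembly, assuming durations are stored as integer milliseconds. `Scene`, its fields, and `layout` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Scene:
    scene_id: str
    duration_ms: int
    transcript: str
    pose: Optional[str] = None


def layout(scenes: List[Scene]) -> List[Tuple[str, int, int]]:
    """Return (scene_id, start_ms, end_ms) for each scene, back to back with no gaps."""
    timeline = []
    cursor = 0
    for scene in scenes:
        if scene.duration_ms <= 0:
            raise ValueError(f"{scene.scene_id}: duration must be positive")
        timeline.append((scene.scene_id, cursor, cursor + scene.duration_ms))
        cursor += scene.duration_ms
    return timeline
```

Integer milliseconds avoid floating point drift when many scene durations are summed, so each scene's end equals the next scene's start exactly.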

snapshot (frozen)
id: job_47a1c…
avatar: av_817@v2
script: sha:91fe…
approved: bg, story

Snapshot payloads

Jobs run from captured product state. The service does not need access to the product DB to execute generation.
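One simple way to capture product state so later changes cannot leak into a running job is to serialize it canonically and keep a content digest. This is an assumption about approach, not a description of the service; `freeze_snapshot` and `load_snapshot` are hypothetical helpers.

```python
import hashlib
import json
from typing import Tuple


def freeze_snapshot(product_state: dict) -> Tuple[str, str]:
    """Serialize state to canonical JSON and return (snapshot, sha256 hex digest)."""
    # Sorted keys and fixed separators make equal states produce equal bytes.
    canonical = json.dumps(product_state, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return canonical, digest


def load_snapshot(canonical: str) -> dict:
    """Rebuild a fresh dict for a worker; edits to it never touch the stored snapshot."""
    return json.loads(canonical)
```

Because the snapshot is a string, mutating the original dict after submission has no effect on what the worker sees.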

08 Retry & approval

Long media pipelines need human review and selective regeneration.

Each retry carries context describing where to resume and whether the system should rerun the downstream pipeline or only regenerate a single step. This avoids repeating expensive work and preserves approved assets.
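The two retry modes described above can be sketched as a single selection function. The parameter names, the stage identifiers in the usage, and the rule that approved stages are skipped on a downstream rerun are assumptions for illustration.

```python
from typing import FrozenSet, List


def stages_to_rerun(
    pipeline: List[str],
    resume_from: str,
    rerun_downstream: bool,
    approved: FrozenSet[str] = frozenset(),
) -> List[str]:
    """Choose which stages a retry should execute."""
    if resume_from not in pipeline:
        raise ValueError(f"unknown stage: {resume_from!r}")
    if not rerun_downstream:
        # Single-step regeneration: touch only the named stage.
        return [resume_from]
    start = pipeline.index(resume_from)
    # Downstream rerun: keep outputs of stages a reviewer has already approved.
    return [resume_from] + [s for s in pipeline[start + 1:] if s not in approved]
```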

7 job types
  • 01 Full output generation
  • 02 Retry from stage
  • 03 Script approval
  • 04 Background regeneration
  • 05 Background approval
  • 06 Storyboard approval
  • 07 Video asset approval
09 Reliability patterns
01

Idempotent job creation

Repeated submissions with the same idempotency key map to the same job.
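A minimal sketch of idempotent job creation backed by a unique constraint. It uses an in-memory SQLite database so the example runs on its own; the service's stack lists Postgres, where `INSERT ... ON CONFLICT (idempotency_key) DO NOTHING` would play the same role. The table layout and `create_job` are hypothetical.

```python
import json
import sqlite3
import uuid
from typing import Tuple


def init_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id TEXT PRIMARY KEY,"
        " idempotency_key TEXT NOT NULL UNIQUE,"
        " snapshot TEXT NOT NULL)"
    )


def create_job(conn: sqlite3.Connection, idempotency_key: str, snapshot: dict) -> Tuple[str, bool]:
    """Return (job_id, created). A repeated key returns the existing job untouched."""
    candidate_id = f"job_{uuid.uuid4().hex}"
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO jobs (id, idempotency_key, snapshot) VALUES (?, ?, ?)",
            (candidate_id, idempotency_key, json.dumps(snapshot, sort_keys=True)),
        )
        created = cur.rowcount == 1  # 0 when the unique key already existed
        row = conn.execute(
            "SELECT id FROM jobs WHERE idempotency_key = ?", (idempotency_key,)
        ).fetchone()
    return row[0], created
```

Letting the database enforce uniqueness means two concurrent submissions with the same key cannot both create a job, which an application-level "check then insert" would not guarantee.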

02

Durable step records

Every major stage has persisted input, output, status, and error state.

03

Structured outbound events

Callbacks are stored before delivery, allowing delivery retries and audit.
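Storing events before sending them resembles the outbox pattern. The sketch below keeps the outbox in memory for brevity, where a real service would use a database table; `Outbox`, `OutboundEvent`, and the `send` callable are hypothetical, and the event kinds are the five lifecycle events named earlier.

```python
from dataclasses import dataclass
from typing import Callable, List

EVENT_KINDS = {"accepted", "started", "progress", "completed", "failed"}


@dataclass
class OutboundEvent:
    event_id: int
    job_id: str
    kind: str
    payload: dict
    delivered: bool = False
    attempts: int = 0


class Outbox:
    def __init__(self) -> None:
        self._events: List[OutboundEvent] = []

    def record(self, job_id: str, kind: str, payload: dict) -> OutboundEvent:
        """Store the event first; delivery happens later and can be retried."""
        if kind not in EVENT_KINDS:
            raise ValueError(f"unknown event kind: {kind!r}")
        event = OutboundEvent(len(self._events) + 1, job_id, kind, payload)
        self._events.append(event)
        return event

    def flush(self, send: Callable[[OutboundEvent], None]) -> int:
        """Try to deliver pending events in order; stop at the first failure."""
        delivered = 0
        for event in self._events:
            if event.delivered:
                continue
            event.attempts += 1
            try:
                send(event)
            except Exception:
                break  # keep ordering: later events wait for the next flush
            event.delivered = True
            delivered += 1
        return delivered

    def pending(self) -> List[OutboundEvent]:
        return [e for e in self._events if not e.delivered]
```

Stopping at the first failure is a design choice that preserves event order at the cost of head-of-line blocking; the stored attempt counts double as an audit trail.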

04

Dedicated queues

Avatar jobs separated from marketing, callback, and maintenance work.

05

Provider abstraction

Provider request details isolated from orchestration code.

06

Snapshot payloads

Jobs run from captured product state, reducing coupling to product DB.

07

Failure localization

Failed jobs retain the failed stage and structured error payload.

10 Privacy & security
Sanitized for portfolio

Private internal APIs and tokenized service calls.

Secrets, provider credentials, storage locations, and account-specific configuration come from environment configuration and are not represented here. User media and generated outputs are referenced through controlled asset URLs in job snapshots and event payloads. The service does not need direct access to product user credentials or the product database to execute the pipeline.

11 Outcome

A cleaner architecture for long-running avatar media generation.

AI execution moved behind a dedicated service boundary. Product logic stayed in the core backend, while the Gen AI service gained ownership over orchestration, provider integrations, retries, and pipeline observability. The result is resumable, observable, provider-portable, and suited to human-in-the-loop creative review.

Resumable: Targeted resume from any failed or approved stage.
Observable: Per-stage status, attempts, and structured errors.
Portable: Capability adapters keep providers swappable.
Reviewable: Approval gates without backend coupling.