2026 · Gen AI · Pipeline architecture · Tessact AI

A multi-stage avatar pipeline, behind a dedicated service.

Inputs in, reusable avatar identity and finished videos out. The hard part was not calling any single model. It was coordinating a long-running, multi-provider media pipeline in which each stage has different latency, cost, failure behavior, and review semantics.

Role: Engineer · pipeline + service boundary
Team: Gen AI service
Stack: FastAPI, Python, Postgres, RabbitMQ
Surface: Internal job APIs, callbacks
Status: Shipping
Sanitized: Yes


01 Overview

A small set of inputs becomes a reusable avatar identity and generated video output.

The workflow validates input media, builds a consistent character identity, generates speech, plans visuals, renders scene assets, aligns audio and video, and produces a stitched final clip. I designed it around explicit contracts, durable stage state, resumable execution, and provider adapters, so the product could support creative iteration without coupling the core backend to AI execution.

Workflow capabilities (9)

Avatar creation
Voice cloning
Script-driven generation
Storyboard planning
Background generation
Pose & b-roll branches
Lip sync
Stitching
Final enhancement

02 Product problem

Avatar video combines several expensive, failure-prone steps.

The steps are validating media, building character identity, synthesizing speech, planning visuals, rendering scene assets, aligning audio and video, and stitching the final output. Originally, much of this lived inside the product backend.

Before

AI execution lived inside the product backend.

· Pipeline iteration coupled to product release.
· Provider swaps touched product models.
· Retries were one-off background tasks.
· Failure recovery meant rerunning the whole pipeline.

After

A dedicated Gen AI service owns execution.

· Product backend submits validated snapshots.
· Service runs the media pipeline.
· No imports of product models, no direct DB writes.
· Failure localized to a single stage.

03 My role

01·

Versionable job contracts

For avatar creation, output generation, retries, approvals, and voice cloning.

02·

Pipeline as persisted state

Stages modeled as state transitions instead of one-off background tasks.

03·

Provider adapters

Image, video, audio, storyboard, background, b-roll, lip sync, and enhancement behind capability interfaces.

04·

Lifecycle events

Structured accepted, started, progress, completed, and failed events back to the product backend.

05·

Resumable execution

Targeted resume from a named failed or user-approved stage, without rerunning upstream work.

06·

Contract & pipeline tests

Around routing, idempotency, persistence, callback delivery, and output shape.

04 System design

A standalone service, durable state, and queue-backed workers.

Requests enter through internal job APIs, normalize into payload snapshots, persist with idempotency guarantees, and dispatch to background workers. The product backend stays the source of truth for users, workspaces, and permissions. The Gen AI service operates on immutable snapshots.
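The intake path described above can be sketched as follows. This is a minimal illustration, not the service's actual code: it assumes an in-memory dict in place of Postgres, and `JobStore`, `Job`, and `submit` are hypothetical names. It shows a repeated idempotency key resolving to the same job, and the payload being deep-copied so the stored snapshot cannot change after submission.

```python
import copy
import uuid
from dataclasses import dataclass
from types import MappingProxyType
from typing import Any, Mapping


@dataclass(frozen=True)
class Job:
    # Hypothetical record shape; field names are illustrative only.
    job_id: str
    idempotency_key: str
    job_type: str
    snapshot: Mapping[str, Any]  # read-only view of the captured payload


class JobStore:
    """In-memory stand-in for a table keyed by a unique idempotency key."""

    def __init__(self) -> None:
        self._by_key: dict[str, Job] = {}

    def submit(self, idempotency_key: str, job_type: str, payload: dict) -> tuple[Job, bool]:
        """Return (job, created). A repeated key returns the original job."""
        existing = self._by_key.get(idempotency_key)
        if existing is not None:
            return existing, False
        # Deep-copy so later mutation of the caller's dict cannot leak in.
        snapshot = MappingProxyType(copy.deepcopy(payload))
        job = Job(uuid.uuid4().hex, idempotency_key, job_type, snapshot)
        self._by_key[idempotency_key] = job
        return job, True


store = JobStore()
payload = {"avatar_id": "av_817", "mode": "multi_scene"}
first, created_first = store.submit("key-1", "avatar_output", payload)

payload["mode"] = "single_scene"  # caller mutates its dict after submitting
second, created_second = store.submit("key-1", "avatar_output", payload)
```

The second call returns the first job unchanged. In a real database the same effect is usually enforced with a unique constraint on the key column rather than a lookup in application memory.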

Source of truth: Product backend
Owns orchestration: Gen AI service
Boundary: Internal job contracts
Service boundary: internal only

Service A · API

Product backend

Users & workspaces
Permissions
Public API
Job records
Response shapes

Service B · This service

Gen AI service

Job APIs (idempotent)
Persistence + snapshots
Worker queues
Pipeline modules
Provider adapters

Job contract: Validated snapshot, idempotency key, correlation metadata.

Lifecycle events: accepted · started · progress · completed · failed.

Inside Service B: workers · adapters

Avatar jobs

Marketing

Callbacks

Maintenance

Retry

05 Avatar creation flow

The identity is an asset graph, not a single file.

Later outputs reference a stable neutral image, a character sheet, and an optional voice profile. Each stage records its inputs and outputs, making retries deterministic for operations while still allowing creative regeneration where appropriate.

01 Validate: User-provided input images checked.
02 Neutral image: Consistent avatar baseline generated.
03 Character sheet: Visual consistency reference built.
04 Persist: Assets written through storage adapter.
05 Emit: Progress and completion to product backend.

Targeted retry: A failed neutral image stage can rerun without repeating validation. A character sheet can regenerate independently when the upstream avatar identity is already available.
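A targeted retry of this kind can be sketched as a runner that skips every stage before the resume point and relies on the outputs already persisted for them. The stage order comes from the flow above; `run_pipeline`, the handler signature, and the dict standing in for persisted outputs are assumptions made for illustration.

```python
from typing import Callable, Optional

# Stage order taken from the flow above; handler signatures are hypothetical.
STAGES = ["validate", "neutral_image", "character_sheet", "persist", "emit"]

Handler = Callable[[dict], str]


def run_pipeline(handlers: dict[str, Handler], outputs: dict[str, str],
                 resume_from: Optional[str] = None) -> list[str]:
    """Run stages in order, skipping everything before `resume_from`.

    `outputs` plays the role of persisted stage results: skipped stages must
    already have an entry, and each executed stage overwrites its own entry.
    Returns the names of the stages that actually ran.
    """
    start = STAGES.index(resume_from) if resume_from else 0
    missing = [s for s in STAGES[:start] if s not in outputs]
    if missing:
        raise ValueError(f"cannot resume, missing upstream outputs: {missing}")
    executed = []
    for name in STAGES[start:]:
        outputs[name] = handlers[name](outputs)
        executed.append(name)
    return executed


calls: list[str] = []


def make_handler(name: str) -> Handler:
    def handler(outputs: dict) -> str:
        calls.append(name)  # record the call so skipped stages are visible
        return f"{name}-output"
    return handler


handlers = {name: make_handler(name) for name in STAGES}

# Pretend validation succeeded earlier and neutral_image failed.
persisted = {"validate": "validate-output"}
executed = run_pipeline(handlers, persisted, resume_from="neutral_image")
```

Here `validate` is never called again, and resuming from a stage whose upstream outputs are missing fails fast instead of running on incomplete state.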

06 Avatar output flow

A branching DAG with shared stage primitives.

Single-scene, multi-scene, b-roll, pose-driven, and approval-driven paths share common stages but route differently based on payload flags and existing approved assets. Each stage exposes useful state to the frontend, such as generating audio, building storyboard, or stitching video, and gives support a precise stage to retry from.

Five routes share these primitives:

single-scene
multi-scene
b-roll
pose-based
approval-gated

Output pipeline · legend: critical, join, stage

Payload (snapshot in)
Script context
Storyboard plan
Speech + timing
Backgrounds
Scene assets
B-roll / pose
Stitch
Lip sync
Enhance
Output (media + scenes)

Stages: 11
Branches: 5
Resume points: any failed stage

07 Technical highlights

contract · v3

job_type: "avatar_output"
idempotency_key: "af3c…"
avatar_id: "av_817"
script: "…"
mode: "multi_scene"
approved: { bg, story }
retry: null

Contract-first execution

Jobs carry strict payload snapshots: avatar identity, script, output settings, approved assets, storyboard context, retry context.
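One way to express such a strict snapshot is a frozen dataclass that validates on construction. The field names below follow the contract card; the validation rules, the `parse_job` helper, and the `single_scene` mode value are assumptions for illustration, not the service's real schema.

```python
from dataclasses import dataclass
from typing import Optional

CONTRACT_VERSION = 3  # the contract card is labelled v3
# Assumed set of modes; only "multi_scene" appears on the card.
ALLOWED_MODES = {"single_scene", "multi_scene"}


@dataclass(frozen=True)
class AvatarOutputJob:
    idempotency_key: str
    avatar_id: str
    script: str
    mode: str
    approved: frozenset[str] = frozenset()
    retry: Optional[dict] = None
    job_type: str = "avatar_output"
    version: int = CONTRACT_VERSION

    def __post_init__(self) -> None:
        # Reject malformed payloads at the boundary, before any work is queued.
        if not self.idempotency_key:
            raise ValueError("idempotency_key is required")
        if self.mode not in ALLOWED_MODES:
            raise ValueError(f"unknown mode: {self.mode!r}")


def parse_job(raw: dict) -> AvatarOutputJob:
    """Build a validated, immutable job from an untrusted dict."""
    return AvatarOutputJob(
        idempotency_key=raw.get("idempotency_key", ""),
        avatar_id=raw["avatar_id"],
        script=raw["script"],
        mode=raw["mode"],
        approved=frozenset(raw.get("approved", ())),
        retry=raw.get("retry"),
    )


job = parse_job({
    "idempotency_key": "example-key",
    "avatar_id": "av_817",
    "script": "Hello there.",
    "mode": "multi_scene",
    "approved": ["bg", "story"],
})
```

Because the dataclass is frozen, attempts to reassign a field after parsing raise an error, which keeps the snapshot stable for the lifetime of the job.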

validate · ok · ×1
neutral_image · ok · ×1
character_sheet · running · ×1
persist · pending · ×0
emit · pending · ×0

Stage-level observability

Each stage persists its input, output, status, attempts, and errors. This gives an audit trail without exposing provider internals.
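A per-stage record along these lines could look like the sketch below. `StageRecord`, `Status`, and `run_stage` are hypothetical names, and the record lives in memory here rather than in a database. The point is that every attempt updates the same record, and failures are stored as a small structured error instead of a raw provider response.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Status(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    OK = "ok"
    FAILED = "failed"


@dataclass
class StageRecord:
    # Illustrative shape; a real table would also carry job id and timestamps.
    name: str
    status: Status = Status.PENDING
    attempts: int = 0
    input: Optional[dict] = None
    output: Optional[dict] = None
    error: Optional[dict] = None


def run_stage(record: StageRecord, fn: Callable[[dict], dict], stage_input: dict) -> StageRecord:
    """Run one stage and write the outcome onto its record."""
    record.status = Status.RUNNING
    record.attempts += 1
    record.input = stage_input
    try:
        record.output = fn(stage_input)
        record.error = None
        record.status = Status.OK
    except Exception as exc:
        # Keep a provider-neutral error: type and message only.
        record.output = None
        record.error = {"type": type(exc).__name__, "message": str(exc)}
        record.status = Status.FAILED
    return record


def flaky(stage_input: dict) -> dict:
    raise TimeoutError("upstream timed out")


def steady(stage_input: dict) -> dict:
    return {"asset": "neutral.png"}


rec = StageRecord("neutral_image")
run_stage(rec, flaky, {"avatar_id": "av_817"})
failed_snapshot = (rec.status, rec.attempts, rec.error)
run_stage(rec, steady, {"avatar_id": "av_817"})
```

After the second call the record shows two attempts and a final `ok` status, which is the kind of state a status card like the one above can be rendered from.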

Branch-aware orchestration

The same output API routes to script generation, multi-scene storyboarding, background regeneration, or approval flows.
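Routing of this kind can be sketched as a pure function from payload to stage list. The stage names, the `b_roll` and `pose` flags, and the ordering of the final three stages are assumptions for illustration; only `mode` and the `bg` and `story` approvals appear in the contract card.

```python
def plan_stages(payload: dict) -> list[str]:
    """Pick the stages to run from payload flags and already-approved assets."""
    approved = set(payload.get("approved", ()))
    stages = ["script_context"]
    # Storyboarding is only needed for multi-scene output without an approved plan.
    if payload.get("mode") == "multi_scene" and "story" not in approved:
        stages.append("storyboard_plan")
    stages.append("speech_timing")
    # An approved background is reused instead of regenerated.
    if "bg" not in approved:
        stages.append("backgrounds")
    stages.append("scene_assets")
    if payload.get("b_roll"):  # hypothetical flag
        stages.append("b_roll")
    if payload.get("pose"):  # hypothetical flag
        stages.append("pose")
    stages += ["lip_sync", "stitch", "enhance"]
    return stages


single = plan_stages({"mode": "single_scene"})
approved_multi = plan_stages(
    {"mode": "multi_scene", "approved": ["bg", "story"], "b_roll": True}
)
```

Keeping the plan as plain data makes it easy to test each route and to show the frontend which stages a job will pass through.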

orchestration

image · video · speech · lip-sync · storyboard · enhance

provider adapter interface

Provider portability

Orchestration calls capability-oriented adapters. Model and provider changes happen behind stable interfaces.
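A capability interface can be sketched with `typing.Protocol`. The `SpeechAdapter` interface, its `synthesize` method, and the two fake vendors are hypothetical; the real adapters and their signatures are not shown in this write-up.

```python
from typing import Protocol


class SpeechAdapter(Protocol):
    """Capability interface: what orchestration needs, not how a vendor does it."""

    def synthesize(self, text: str, voice_id: str) -> bytes: ...


class FakeVendorA:
    # Stand-in for one provider integration.
    def synthesize(self, text: str, voice_id: str) -> bytes:
        return f"A:{voice_id}:{text}".encode()


class FakeVendorB:
    # Stand-in for a replacement provider with the same capability.
    def synthesize(self, text: str, voice_id: str) -> bytes:
        return f"B:{voice_id}:{text}".encode()


def generate_speech(adapter: SpeechAdapter, script: str, voice_id: str) -> bytes:
    """Orchestration depends only on the capability, never on the vendor class."""
    return adapter.synthesize(script, voice_id)


audio_a = generate_speech(FakeVendorA(), "hi", "v1")
audio_b = generate_speech(FakeVendorB(), "hi", "v1")
```

Swapping `FakeVendorA` for `FakeVendorB` requires no change to `generate_speech`, which is the property the adapter layer is meant to preserve.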

[Timeline graphic: scenes laid out along a time axis starting at 0:00]

Media timeline handling

Storyboard scenes carry timing, transcript, pose, and visual metadata so downstream assembly can reason about continuity.
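As an illustration of what reasoning about continuity can mean, the sketch below checks that scenes have positive duration and that consecutive scenes neither overlap nor leave gaps. The `Scene` fields and the `continuity_errors` helper are assumptions, not the service's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scene:
    # Hypothetical scene shape; times are in seconds.
    index: int
    start: float
    end: float
    transcript: str
    pose: str = "neutral"


def continuity_errors(scenes: list[Scene], tolerance: float = 1e-6) -> list[str]:
    """Return human-readable timing problems; an empty list means the timeline is contiguous."""
    errors = []
    ordered = sorted(scenes, key=lambda s: s.index)
    for scene in ordered:
        if scene.end <= scene.start:
            errors.append(f"scene {scene.index}: non-positive duration")
    # Compare each scene with the one before it.
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur.start - prev.end
        if gap > tolerance:
            errors.append(f"gap of {gap:.2f}s before scene {cur.index}")
        elif gap < -tolerance:
            errors.append(f"scene {cur.index} overlaps scene {prev.index}")
    return errors


good = [Scene(0, 0.0, 2.5, "Hi."), Scene(1, 2.5, 6.0, "Welcome.")]
gapped = [Scene(0, 0.0, 2.5, "Hi."), Scene(1, 3.0, 6.0, "Welcome.")]
overlapping = [Scene(0, 0.0, 2.5, "Hi."), Scene(1, 2.0, 6.0, "Welcome.")]
```

A check like this can run before stitching so that timing problems surface at the storyboard stage rather than in the final clip.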

snapshot · frozen

id: job_47a1c…
avatar: av_817@v2
script: sha:91fe…
approved: bg, story

Snapshot payloads

Jobs run from captured product state. The service does not need access to the product DB to execute generation.

08 Retry & approval

Long media pipelines need human review and selective regeneration.

Each retry carries context describing where to resume and whether the system should rerun the downstream pipeline or only regenerate a single step. This avoids repeating expensive work and preserves approved assets.
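The retry context described above can be sketched as a small value object plus a function that turns it into the list of stages to run. `RetryContext`, its `scope` field, and the stage names in `PIPELINE` are hypothetical and chosen only to illustrate the two behaviors: rerun everything downstream, or regenerate one step.

```python
from dataclasses import dataclass
from typing import Literal, Optional

# Assumed linear stage order for illustration; the real pipeline branches.
PIPELINE = [
    "script_context", "storyboard_plan", "speech_timing", "backgrounds",
    "scene_assets", "lip_sync", "stitch", "enhance",
]


@dataclass(frozen=True)
class RetryContext:
    resume_from: str
    scope: Literal["downstream", "single_step"] = "downstream"


def stages_to_run(retry: Optional[RetryContext]) -> list[str]:
    """Translate a retry context into the stages that should execute."""
    if retry is None:
        return list(PIPELINE)  # fresh job: run everything
    if retry.resume_from not in PIPELINE:
        raise ValueError(f"unknown stage: {retry.resume_from!r}")
    start = PIPELINE.index(retry.resume_from)
    if retry.scope == "single_step":
        return [PIPELINE[start]]  # regenerate only this step
    return PIPELINE[start:]  # rerun this step and everything after it


full_run = stages_to_run(None)
only_backgrounds = stages_to_run(RetryContext("backgrounds", "single_step"))
from_lip_sync = stages_to_run(RetryContext("lip_sync"))
```

With `single_step`, a background can be regenerated for review without touching speech or storyboard output that has already been approved.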

7 job types

01 Full output generation · primary
02 Retry from stage · retry / approval
03 Script approval · retry / approval
04 Background regeneration · retry / approval
05 Background approval · retry / approval
06 Storyboard approval · retry / approval
07 Video asset approval · retry / approval

09 Reliability patterns

Idempotent job creation

Repeated submissions with the same idempotency key map to the same job.

Durable step records

Every major stage has persisted input, output, status, and error state.

Structured outbound events

Callbacks are stored before delivery, which allows delivery retries and auditing.
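Storing events before sending them is often called an outbox. The sketch below assumes an in-memory list in place of a database table, and `Outbox`, `OutboundEvent`, and the `send` callable are hypothetical names. It shows events surviving a failed delivery pass and going out on the next one.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class OutboundEvent:
    event_id: int
    job_id: str
    kind: str  # accepted, started, progress, completed, or failed
    delivered: bool = False
    attempts: int = 0


class Outbox:
    """In-memory stand-in for a persisted table of outbound callbacks."""

    def __init__(self) -> None:
        self.events: list[OutboundEvent] = []

    def record(self, job_id: str, kind: str) -> OutboundEvent:
        # Persist first; delivery happens later and may be retried.
        event = OutboundEvent(len(self.events) + 1, job_id, kind)
        self.events.append(event)
        return event

    def flush(self, send: Callable[[OutboundEvent], bool]) -> int:
        """Try to deliver every pending event; return how many succeeded."""
        delivered = 0
        for event in self.events:
            if event.delivered:
                continue
            event.attempts += 1
            try:
                ok = send(event)
            except Exception:
                ok = False  # treat transport errors as a failed attempt
            if ok:
                event.delivered = True
                delivered += 1
        return delivered


outbox = Outbox()
outbox.record("job_1", "accepted")
outbox.record("job_1", "started")

received: list[str] = []


def offline(event: OutboundEvent) -> bool:
    return False  # product backend unreachable


def online(event: OutboundEvent) -> bool:
    received.append(event.kind)
    return True


first_pass = outbox.flush(offline)
second_pass = outbox.flush(online)
```

The attempt counter on each event doubles as the audit trail: it records how many deliveries were tried before the product backend acknowledged the event.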

Dedicated queues

Avatar jobs separated from marketing, callback, and maintenance work.

Provider abstraction

Provider request details isolated from orchestration code.

Snapshot payloads

Jobs run from captured product state, reducing coupling to product DB.

Failure localization

Failed jobs retain the failed stage and structured error payload.

10 Privacy & security

Sanitized for portfolio

Private internal APIs and tokenized service calls.

Secrets, provider credentials, storage locations, and account-specific configuration come from environment configuration and are not represented here. User media and generated outputs are referenced through controlled asset URLs in job snapshots and event payloads. The service does not need direct access to product user credentials or the product database to execute the pipeline.

11 Outcome

A cleaner architecture for long-running avatar media generation.

AI execution moved behind a dedicated service boundary. Product logic stayed in the core backend, while the Gen AI service gained ownership over orchestration, provider integrations, retries, and pipeline observability. The result is resumable, observable, provider-portable, and suited to human-in-the-loop creative review.

Resumable: Targeted resume from any failed or approved stage.

Observable: Per-stage status, attempts, and structured errors.

Portable: Capability adapters keep providers swappable.

Reviewable: Approval gates without backend coupling.


Case study · 01 of 04 · ai-avatars