ASMI · For Developers

Build avatars once. Run them anywhere.

The full technical tour of the Avatar State Machine Interface — from the wizard, through the editor and MCP server, to the open-source runtime that ships the avatar into any host site.

Walkthrough

End-to-end build, in two minutes

Click through the interactive tour — every screen of the ASMI build, from setup wizard to deployed widget.

ASMI for Developers

ASMI — Avatar State Machine Interface — is broen.tech's design-time studio for building conversational avatars with real expressions, real state machines, and two npm packages you embed directly into your own stack: a React drop-in for the ~95% of host sites that are React apps, and a framework-agnostic runtime for everything else. This page is the end-to-end technical tour: what each piece is, where it lives, and how they fit together from the first click in the wizard to a live avatar running on someone else's website.

If you are a coding AI reading this through an MCP client, you probably want the reference implementation guide at /skills/asmi-implementation.md instead — it carries the single-component <AsmiAvatar> mount recipe + per-client MCP config snippets. This page is the broader tour for human developers evaluating or integrating ASMI.

ASMI in one paragraph

ASMI is a design-time tool, not a runtime dependency. You design an avatar in the editor at /apps/asmi, deploy it, then an implementing coding AI (Claude Code, Lovable, Cursor, Windsurf…) pulls the avatar's definition through our MCP server and generates a complete implementation directly into the target website's own codebase. That site then runs the avatar on its own LLM provider key — OpenAI, Anthropic, Gemini, whatever it already uses. Nothing routes back through broen.tech at runtime. If we ever took broen.tech offline, live implementations would keep working against the snapshot they were built from.

The avatar lifecycle

text
 ┌──────────┐   ┌────────┐   ┌────────┐   ┌──────────┐   ┌─────┐   ┌──────────────┐
 │  Wizard  │──▶│ Editor │──▶│ Deploy │──▶│ Gallery? │──▶│ MCP │──▶│ Host site    │
 │ persona  │   │ canvas │   │ api key│   │  share   │   │ tool│   │ (BYO LLM key)│
 │ face     │   │ state  │   │ MCP    │   │ license  │   │ list│   └──────────────┘
 │ voice    │   │ machine│   │ enable │   └──────────┘   │ get │
 └──────────┘   │ narrat.│   └────────┘                  │ etc │
                └────────┘                               └─────┘

Each stage in the sections below maps to a concrete file path in the broen.tech repo.

Creating an avatar — the wizard

A new avatar starts in a four-step wizard that collects just enough to scaffold a working state machine. Every field is editable afterwards in the editor — the wizard is seed, not cage.

  • Template choice: blank vs. persona preset vs. clone from gallery. File: src/components/asmi/wizard/TemplateChoiceStep.tsx
  • Company context: name, description, contact email, booking link, optional site crawl. File: src/components/asmi/wizard/CompanyContextStep.tsx
  • Brand voice: 8 sliders (formality, warmth, detail, technical depth, playfulness, expressiveness, creativity, humor style) + persona picker. File: src/components/asmi/wizard/BrandVoiceStep.tsx
  • Awareness: temporal, business hours, cultural calendar. File: src/components/asmi/wizard/AwarenessStep.tsx
  • Appearance: illustrated / photorealistic / minimal; base visual description. File: src/components/asmi/wizard/AvatarAppearanceStep.tsx
  • Expression wizard: per-expression generation (neutral, smiling, thinking…) + face editor (gender, hair, eyes, skin, clothing, accessories). File: src/components/asmi/wizard/ExpressionWizard.tsx

Orchestration lives in src/components/asmi/wizard/SetupWizard.tsx. The wizard produces an AvatarDefinition JSON object seeded with sensible defaults for the state machine — the user can immediately send test messages from the editor without touching a single node.

The editor

AsmiEditor.tsx is the top-level shell; AsmiEditorContext.tsx is the reducer-backed state container. The editor surfaces six panels:

  • Canvas (CanvasPanel.tsx) — visual state machine with parallel regions for conversation and expression. Nodes are states, transitions carry guards that branch on intent / sentiment / confidence.
  • Properties (PropertiesPanel.tsx + StateProperties.tsx / TransitionProperties.tsx) — edit selected node/edge fields.
  • Narrative (NarrativePanel.tsx) — record a canonical happy-path scenario: user messages + avatar responses + expression transitions. The recording is later frozen on the gallery share and used by the MCP verify_deployment tool as ground truth.
  • Brand voice editor (panels/BrandVoiceEditor.tsx) — tweak the 8 sliders post-wizard. Runs through brand-voice-compiler.ts to produce the final system prompt.
  • JSON preview (panels/JsonPreviewPanel.tsx) — inspect the raw AvatarDefinition.
  • Deployment modal — generates / reveals / regenerates the user's MCP API key + prints the full briefing markdown for pasting into any coding AI.

Generating expressions

Expression images ship on transparent PNGs (we green-screen the generation and key it out) so they blend into any host surface. They carry the core value proposition: the face is the product. A static chatbot icon defeats the point of ASMI.

One route, src/app/api/asmi/generate-expression/route.ts, handles three modes:

  1. Base mode — text → first frame of an expression.
  2. Animation frame mode — take a reference frame + an animationHint ("eyes quarter-closed", "head tilted 5° left") → produce a second frame with that change applied, everything else pixel-perfect identical.
  3. Blend (mid-frame) mode — given two frames (base + peak), generate the exact visual midpoint for a smooth interpolation.

Model: gemini-2.5-flash-image (nano-banana). Rate-limited to 10 generations per minute per user. Billing pipeline is tagged generate_expression / generate_expression_frame.
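
To make the three modes concrete, here is a hypothetical sketch of a design-time call in the shape the editor might issue. The field names (mode, referenceFrameUrl, animationHint) are illustrative assumptions, not the route's documented contract; the real contract lives in the route file above.

ts
// Hypothetical request sketch. Field names are assumptions, not the real contract.
async function generateAnimationFrame(baseFrameUrl: string): Promise<Blob> {
  const res = await fetch("/api/asmi/generate-expression", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      mode: "animation_frame",              // assumed discriminator: base | animation_frame | blend
      referenceFrameUrl: baseFrameUrl,      // frame to keep pixel-identical outside the hint
      animationHint: "eyes quarter-closed", // the single change to apply
    }),
  });
  if (!res.ok) throw new Error(`generation failed: ${res.status}`); // 429 once past 10/min
  return res.blob();
}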

The AvatarDefinition shape

src/types/asmi.ts is the authoritative type. Top-level fields:

  • version, id, name, description
  • avatar — appearance + expressions + optional face-editor snapshot
  • llm — provider, model, systemPrompt, temperature, maxTokens
  • stateMachine — XState parallel regions
  • actions — llm_generate, llm_classify, emit_expression, emit_app_event
  • brandVoice — sliders + persona + compiled systemPrompt
  • companyContext — name, description, contact, booking, crawled pages
  • proactiveTriggers — dwell / exit-intent / return-visit configuration
  • awareness — temporal / business-hours / cultural
  • license — SPDX id (defaults to MIT on gallery publish)
  • derivedFromShareId — non-null means this avatar was cloned from a public gallery share and cannot be re-shared (anti-laundering guard)
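
Condensed into an abridged type sketch (nested shapes elided; src/types/asmi.ts stays authoritative):

ts
// Abridged sketch of AvatarDefinition. Nested shapes are elided; this is
// orientation only, not the real contract from src/types/asmi.ts.
interface AvatarDefinitionSketch {
  version: string;
  id: string;
  name: string;
  description: string;
  avatar: unknown;                   // appearance + expressions + optional face-editor snapshot
  llm: {
    provider: string;
    model: string;
    systemPrompt: string;
    temperature: number;
    maxTokens: number;
  };
  stateMachine: unknown;             // XState parallel regions
  actions: unknown[];                // llm_generate | llm_classify | emit_expression | emit_app_event
  brandVoice: unknown;               // sliders + persona + compiled systemPrompt
  companyContext: unknown;           // name, description, contact, booking, crawled pages
  proactiveTriggers: unknown;        // dwell / exit-intent / return-visit
  awareness: unknown;                // temporal / business-hours / cultural
  license: string;                   // SPDX id, defaults to MIT on gallery publish
  derivedFromShareId: string | null; // non-null means cloned; cannot be re-shared
}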

The state machine runtime

src/lib/asmi/runtime/state-machine-runner.ts is the in-repo reference implementation of processMessage. The published npm package at @avatar-state-machine-interface/runtime is the same logic repackaged for host-site consumption — same function names, same behavior.

One turn through processMessage goes:

  1. Classify — intent + sentiment + confidence via llm_classify. Tracked in a ClassificationResult struct.
  2. Resolve awareness — visitor timezone → temporal/business-hours/cultural context injected into the enriched session context.
  3. Transition — evaluate the current state's outbound transitions; guards branch on the classification result.
  4. LLM generate — for llm_generate actions, invoke the host's LlmProvider with the compiled systemPrompt + responseRules + conversation history.
  5. Emit — expression changes + outbound app events (handoff, satisfaction_pulse, offered_contact, …) returned to the caller.
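
Step 3 is the branching heart of a turn. Here is a self-contained sketch of how a guard can branch on the classification result; the field names mirror the classify step above, while the intents and state names are hypothetical, not the runtime's internals.

ts
// Illustrative guard evaluation for step 3. Intents and state names are hypothetical.
interface ClassificationResult {
  intent: string;                                 // e.g. "pricing_question"
  sentiment: "positive" | "neutral" | "negative";
  confidence: number;                             // 0..1
}

interface TransitionSketch {
  target: string;
  guard?: (c: ClassificationResult) => boolean;
}

const outbound: TransitionSketch[] = [
  { target: "handoff", guard: (c) => c.confidence < 0.4 },            // unsure? hand off
  { target: "pricing", guard: (c) => c.intent === "pricing_question" },
  { target: "smalltalk" },                                            // unguarded fallback
];

// First outbound transition whose guard passes (or that has no guard) wins.
function pickTransition(c: ClassificationResult): string {
  return (outbound.find((t) => !t.guard || t.guard(c)) ?? outbound[outbound.length - 1]).target;
}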

The public LlmProvider interface is dead-simple:

ts
interface LlmProvider {
  generate(params: {
    systemPrompt: string;
    userPrompt: string;
    history: Array<{ role: "user" | "model"; content: string }>;
    temperature: number;
    maxTokens: number;
  }): Promise<string>;
}

Implement that once against whatever LLM your app already uses — OpenAI, Anthropic, Gemini, a local model — and you are done on the runtime side.
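
As one concrete example, a provider backed by the official openai npm client could look like the sketch below; note the role mapping, since ASMI history uses "model" where OpenAI expects "assistant". The model choice and error handling are yours.

ts
import OpenAI from "openai";
import type { LlmProvider } from "@avatar-state-machine-interface/react";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// One possible provider over OpenAI chat completions; a sketch, not the only shape.
export const llmProvider: LlmProvider = {
  async generate({ systemPrompt, userPrompt, history, temperature, maxTokens }) {
    const completion = await openai.chat.completions.create({
      model: "gpt-4o-mini",
      temperature,
      max_tokens: maxTokens,
      messages: [
        { role: "system", content: systemPrompt },
        // ASMI history uses "model" for assistant turns; OpenAI expects "assistant".
        ...history.map((m) => ({
          role: m.role === "model" ? ("assistant" as const) : ("user" as const),
          content: m.content,
        })),
        { role: "user", content: userPrompt },
      ],
    });
    return completion.choices[0]?.message?.content ?? "";
  },
};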

Deploying

POST /api/asmi/deploy (src/app/api/asmi/deploy/route.ts) transitions an avatar from draft to deployed. On first deploy it mints the user's MCP API key (format ask_...) if they don't already have one, stamps deployedAt, and flips the status field.

The API key is per user, not per avatar — one key lets a coding AI discover every deployed avatar the user owns. Re-deploys update the definition in-place and leave the key untouched.

Deployment does not provision any runtime service. It only makes the avatar discoverable via the MCP server for the user's own coding AI — the actual chat endpoint is built into the host site when the AI runs the implementation.

The gallery at /apps/asmi/gallery is an opt-in public showcase. Publishing goes through src/components/asmi/ShareDialog.tsx → POST /api/asmi/gallery and requires:

  • Expression completeness (every named expression has a generated image).
  • A captured happy-path narrative (see the Narrative panel in the editor).
  • Title + description (description can be AI-drafted).
  • An SPDX license (defaults to MIT).
  • Thumbnail expression (picks the "face of the share").
  • Optional publisher branding — name, URL (auto-normalized to https://), description, logo.
  • derivedFromShareId must be null — cloned avatars cannot be re-shared to prevent laundering.

Publishing produces a share slug, a 1200×630 OG image, a public detail page with a chat replay of the recorded narrative, and a clone button other visitors can use to start their own avatar from yours.

The MCP server

Server URL: https://broen.tech/api/asmi/mcp. Authentication is Authorization: Bearer <USER_MCP_API_KEY> OR x-api-key: <USER_MCP_API_KEY> — whichever header your MCP client exposes. Fully compliant with MCP spec 2025-06-18, RFC 9728 protected resource metadata, and RFC 8707 resource indicators. Works with Claude Code, Claude Desktop, Cursor, Windsurf, Zed, Cline, Lovable, v0, ChatGPT Developer Mode, Replit, and every other MCP client we've seen.
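
As an example, a Claude Code project can register the server with a .mcp.json entry like the one below; other clients take an equivalent config, and the key is your own ask_... value.

json
{
  "mcpServers": {
    "asmi": {
      "type": "http",
      "url": "https://broen.tech/api/asmi/mcp",
      "headers": {
        "Authorization": "Bearer ask_..."
      }
    }
  }
}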

Tools exposed (src/lib/asmi/mcp/tools.ts):

  • list_avatars: list the user's deployed avatars (optional status + search filters).
  • get_avatar: the full sanitized AvatarDefinition. Base64 image data is replaced with per-expression API URLs to keep the payload small.
  • get_avatar_markdown: the same definition in human-readable Markdown.
  • get_avatar_assets: expression image URLs + animation metadata (frame count, hold / transition durations, triggers, license).
  • test_avatar: run a single conversation turn server-side (design-time only).
  • dispatch_event: emit a named event (WIDGET_OPENED, HOVER_ENTER, …) against a test session.
  • get_expected_narrative: the canonical happy path recorded in the editor; ground truth for verify_deployment.
  • verify_deployment: run the full happy path end-to-end and return a per-assertion pass/fail diff.
  • get_runtime_docs: inline README for the React + runtime npm packages (usage snippets, option surface, troubleshooting).
  • get_backend_example: starter Next.js / Express / FastAPI backend snippet.
  • get_embedding_guide: per-avatar implementation recipe (states, expressions, outbound events, copy-paste <AsmiAvatar> mount).
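
Your MCP client drives these for you, but the wire format is plain JSON-RPC over streamable HTTP. A raw call looks roughly like this; real clients first run the initialize handshake, and the server may require the resulting session id header.

bash
# Raw tools/call sketch. MCP clients handle the initialize handshake and
# session header for you; this only illustrates the wire format.
curl -s https://broen.tech/api/asmi/mcp \
  -H "Authorization: Bearer ask_..." \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"list_avatars","arguments":{}}}'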

The npm packages

ASMI ships two packages — a React package that's the main integration path for the ~95% of host sites that are React apps, and a framework-agnostic runtime for everything else.

@avatar-state-machine-interface/react wraps the runtime with the correctness-critical UI surface — live mid-turn expression swaps via the runtime's onTrace hook, animation playback for all four trigger types (timer, mouse-enter, click, response-received), idle auto-return, a transparent face region split from the chat shell, auto-detected host theme, form-submit preventDefault so Send doesn't navigate away. The full integration is three imports and one JSX block:

bash
npm install @avatar-state-machine-interface/react \
            @avatar-state-machine-interface/runtime
tsx
"use client";
import { AsmiAvatar, type LlmProvider } from "@avatar-state-machine-interface/react";
import definition from "./my-avatar.json";    // fetched once via MCP get_avatar

const llmProvider: LlmProvider = {
  async generate({ systemPrompt, userPrompt, history, temperature, maxTokens }) {
    // Call OpenAI / Anthropic / Gemini / whatever your site already uses.
    return (await yourLlmClient.complete({ system: systemPrompt, user: userPrompt, history, temperature, maxTokens })).text;
  },
};

export default function Site() {
  return <AsmiAvatar definition={definition} llmProvider={llmProvider} debug />;
}

While debug is on, the runtime logs every state transition, action, expression emission, and LLM call it fires via console.info; drop the flag once the expression trail paints the way you expect.

Primitives are also exported for hand-rolled chat surfaces that need file uploads, paste handlers, voice input, or a radically different layout:

  • useAsmiSession(definition, llmProvider, options?) — the correctness-critical session hook. Returns state, history, metadata, sending, send, reset, setHistory, setExpression, bumpActivity.
  • <AsmiFace /> — transparent animated face primitive. Reads an expression prop; plays the configured animation triggers.
  • useHostTheme() / detectHostTheme() — apply the same auto-detected host palette to a custom wrapper.
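
A minimal hand-rolled surface over those primitives might look like this sketch. The hook and prop names come from the list above; the markup and the assumption that state carries the runtime's { conversation, expression } pair are illustrative.

tsx
"use client";
import { useState } from "react";
import {
  AsmiFace,
  useAsmiSession,
  type LlmProvider,
} from "@avatar-state-machine-interface/react";
import definition from "./my-avatar.json";

// Sketch of a custom chat surface on top of the exported primitives.
export function CustomSurface({ llmProvider }: { llmProvider: LlmProvider }) {
  const { state, history, sending, send } = useAsmiSession(definition, llmProvider);
  const [draft, setDraft] = useState("");

  return (
    <div>
      {/* Transparent face primitive follows the session's current expression. */}
      <AsmiFace expression={state.expression} />
      <ul>
        {history.map((m, i) => (
          <li key={i}>{m.role}: {m.content}</li>
        ))}
      </ul>
      <form
        onSubmit={(e) => {
          e.preventDefault(); // keep Send from navigating, as the package does
          void send(draft);
          setDraft("");
        }}
      >
        <input value={draft} onChange={(e) => setDraft(e.target.value)} disabled={sending} />
        <button type="submit" disabled={sending}>Send</button>
      </form>
    </div>
  );
}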

Package source: packages/asmi-react/. Peer deps: react >=18, react-dom >=18. Runtime dep: @avatar-state-machine-interface/runtime.

Runtime (non-React escape hatch)

@avatar-state-machine-interface/runtime is the framework-agnostic core — a pure state-machine evaluator with zero external runtime dependencies and a single LlmProvider seam. Use this directly if you're on Vue, Svelte, Solid, vanilla JS, or you're evaluating the state machine on a server. React sites should use the React package above, which wraps this under the hood.

bash
npm install @avatar-state-machine-interface/runtime
ts
import { processMessage } from "@avatar-state-machine-interface/runtime";

const result = await processMessage(
  definition,          // AvatarDefinition JSON from MCP get_avatar
  currentState,        // { conversation, expression }
  userMessage,
  { history },
  myLlmProvider,       // your own { generate: ... }
);
// result.response, result.newState.expression, result.outboundEvents

Your UI layer is responsible for rendering the face image, playing animation frames, and wiring user input to processMessage. The React package does all of this; on non-React stacks you do it once.
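
On a non-React stack that wiring is a short loop, something like the sketch below. The initial state values, element ids, and asset paths are assumptions; the processMessage signature is as shown above.

ts
import { processMessage } from "@avatar-state-machine-interface/runtime";
import definition from "./my-avatar.json";
import { myLlmProvider } from "./llm-provider";

// Illustrative vanilla wiring: keep { conversation, expression } + history in
// module state, feed user input to processMessage, render what comes back.
let currentState = { conversation: "initial", expression: "neutral" }; // illustrative initial values
const history: Array<{ role: "user" | "model"; content: string }> = [];

export async function onUserMessage(text: string): Promise<void> {
  const result = await processMessage(definition, currentState, text, { history }, myLlmProvider);

  currentState = result.newState;
  history.push({ role: "user", content: text }, { role: "model", content: result.response });

  // Your UI layer's job: swap the face image, render the reply, forward events.
  (document.getElementById("face") as HTMLImageElement).src =
    `/avatar/expressions/${result.newState.expression}.png`; // illustrative asset path
  document.getElementById("reply")!.textContent = result.response;
  for (const evt of result.outboundEvents) console.info("app event:", evt);
}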

Package source: packages/asmi-runtime/.

Cost model

Two distinct cost surfaces, with very different ownership:

  • Design-time cost (our bill). Wizard + editor + expression generation + gallery auto-description go through broen.tech's Gemini key. Every call is tagged in token_usage_events with a cost_usd computed at insert time against src/lib/gemini-pricing.ts. The admin dashboard at /admin/asmi/cost aggregates by day / model / pipeline / user / avatar. Empirically: ~$2 per avatar to reach deployed state, ~99% of that is image generation.
  • Runtime cost (host site's bill). Once implemented, the avatar runs on the host site's own LLM key. None of that traffic touches broen.tech. This is why /api/asmi/chat is design-time only and not a public integration surface — the one exception is allowlisted demo avatars (the public Stian avatar on broen.tech).
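
The design-time half of that split is easy to picture as a sketch. The pricing-table shape and the rates below are assumptions for illustration only; the real table lives in src/lib/gemini-pricing.ts, and rows land in token_usage_events.

ts
// Hypothetical sketch of insert-time cost tagging. Table shape and rates are
// assumptions; real pricing lives in src/lib/gemini-pricing.ts.
const PRICE_PER_MTOK: Record<string, { input: number; output: number }> = {
  "gemini-2.5-flash-image": { input: 0.3, output: 30 }, // illustrative rates only
};

function costUsd(model: string, inputTokens: number, outputTokens: number): number {
  const p = PRICE_PER_MTOK[model] ?? { input: 0, output: 0 };
  return (inputTokens * p.input + outputTokens * p.output) / 1_000_000;
}

// Row tagged with its pipeline at insert time (columns abridged).
const row = {
  pipeline: "generate_expression",
  model: "gemini-2.5-flash-image",
  cost_usd: costUsd("gemini-2.5-flash-image", 1_200, 1_290),
};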

Integrating from external coding AIs

Coding AIs (Claude Code, Lovable, Cursor, Windsurf, Zed, v0, ChatGPT Developer Mode, Replit…) implement avatars through the MCP server + the React package. The public skill doc at /skills/asmi-implementation.md is the reference they read — it carries per-client MCP config snippets, face-size invariants (≥96 px collapsed / ≥240 px expanded / ≥360 px inline hero), anti-patterns to avoid, and an 11-point verification checklist.

Point a coding AI at two things — this MCP URL plus the skill doc URL — and it will ship a correct implementation into the target repo without you writing a line of glue code. The earlier "hand-roll the state-machine wiring" failure mode is gone: the React package owns the correctness-critical surface, so the coding AI's job is three imports, an LlmProvider, and a <AsmiAvatar> mount.

Source map

The directories below are the fast paths if you want to dive in:

  • src/components/asmi/ — editor UI (wizard, canvas, panels, dialogs).
  • src/contexts/AsmiEditorContext.tsx — editor state container.
  • src/lib/asmi/ — compilers (brand-voice, markdown serializer / parser), validator, MCP tools, embedding guide, runtime mirror.
  • src/app/api/asmi/ — API surface (sessions, deploy, mcp, gallery, generate-expression, crawl-site, avatars).
  • src/types/asmi.ts — every type you'll care about.
  • packages/asmi-runtime/ — published npm runtime package source.
  • packages/asmi-react/ — published npm React package source (useAsmiSession, <AsmiFace>, <AsmiAvatar>).
  • src/app/skills/asmi-implementation.md/route.ts — the coding-AI skill.
  • src/lib/gemini-pricing.ts + src/app/admin/asmi/cost/ — cost tracking.

Machine-readable variant of this page: /apps/asmi/developers.md.

A product by Broentech Sentinel.
