# AIUGC Layer

AI-generated video has crossed the threshold from novelty to mainstream production tooling. Industry data suggests that AI-generated short-form content now accounts for approximately 40% of new online video, **with output quality increasingly indistinguishable from human-edited media.**

Evidence of this shift is already visible at scale: remix-style content built around culturally resonant brands has repeatedly spawned hundreds of derivative videos that reach multi-million-view traction on platforms like TikTok, with strong hook rates and high completion ratios.

Viral City abstracts this capability into a structured, asset-bound generation pipeline.

<figure><img src="/files/kTkO4wJH1mlfAv6VBfJG" alt=""><figcaption></figcaption></figure>

***

### **Template-Driven Generation Architecture**

For every on-chain asset, Viral City maintains a curated library of generative templates: parameterized content blueprints that encode proven short-form structures (e.g., hook-first origin clip, meme remix, character POV monologue, narrative recap, reveal trailer, duet response).

Each template encapsulates platform-native pacing, aspect ratio, caption cadence, and tonal direction as generation constraints, ensuring outputs conform to the structural patterns empirically correlated with high retention and shareability.
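
As a rough illustration of what a "parameterized content blueprint" could look like, the sketch below models a template as an immutable record that flattens into generation constraints. The class name, field names, and values are all hypothetical; the actual template schema is not described in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationTemplate:
    """Hypothetical template record; every field name here is an assumption."""
    name: str
    aspect_ratio: str          # e.g. "9:16" for vertical short-form video
    max_duration_s: float      # pacing budget for the whole clip
    hook_window_s: float       # time by which the hook should land
    caption_interval_s: float  # caption cadence
    tone: str                  # tonal direction handed to the generator

    def to_constraints(self) -> dict:
        """Flatten the template into a constraint dict for a generator."""
        if self.hook_window_s > self.max_duration_s:
            raise ValueError("hook window cannot exceed clip duration")
        return {
            "aspect_ratio": self.aspect_ratio,
            "max_duration_s": self.max_duration_s,
            "hook_window_s": self.hook_window_s,
            "caption_interval_s": self.caption_interval_s,
            "tone": self.tone,
        }


# Illustrative values only; not taken from the platform.
MEME_REMIX = GenerationTemplate(
    name="meme_remix",
    aspect_ratio="9:16",
    max_duration_s=15.0,
    hook_window_s=1.5,
    caption_interval_s=2.0,
    tone="playful",
)
constraints = MEME_REMIX.to_constraints()
```

Keeping the record frozen reflects the idea that a template is a fixed blueprint at generation time, even if the library itself evolves between runs.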

#### **Creation follows an intent-driven flow:**

{% stepper %}
{% step %}
A user selects an on-chain asset.
{% endstep %}

{% step %}
The user selects a template or describes intent in natural language, specifying the joke, the scene, the angle, or the call-to-action.
{% endstep %}

{% step %}
The pipeline produces a platform-ready video that is structurally optimized, tonally on-brand, and semantically bound to the selected asset.
{% endstep %}
{% endstepper %}
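
The three steps above can be sketched as a small request builder. This is a minimal sketch under stated assumptions: `build_request`, `GenerationRequest`, and the keyword table are hypothetical names, and the keyword match is only a stand-in for whatever intent parsing the pipeline actually performs.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword table standing in for real intent understanding.
TEMPLATE_KEYWORDS = {
    "meme_remix": ("meme", "remix", "joke"),
    "reveal_trailer": ("trailer", "reveal", "teaser"),
    "narrative_recap": ("recap", "story", "summary"),
}


@dataclass
class GenerationRequest:
    asset_id: str
    template: str
    intent: str


def build_request(asset_id: str, template: Optional[str] = None,
                  intent: str = "") -> GenerationRequest:
    # Step 1: an on-chain asset must be selected.
    if not asset_id:
        raise ValueError("an on-chain asset must be selected first")
    # Step 2: use the chosen template, or infer one from the intent text.
    if template is None:
        words = intent.lower().split()
        template = next(
            (name for name, keys in TEMPLATE_KEYWORDS.items()
             if any(key in words for key in keys)),
            None,
        )
    if template not in TEMPLATE_KEYWORDS:
        raise ValueError("no known template matches this request")
    # Step 3: the request is what a generation backend would consume.
    return GenerationRequest(asset_id=asset_id, template=template, intent=intent)


req = build_request("asset-123", intent="a meme about the launch")
```

Binding `asset_id` into the request object is one simple way to keep every downstream output tied to the selected asset.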

***

### **Identity-Consistent Generation via Latent Anchoring**

A core technical challenge in AI-generated brand content is *character drift*.

{% hint style="info" %}
**Character Drift:** The tendency of generative models to produce visually or tonally inconsistent representations of the same subject across outputs.
{% endhint %}

Viral City addresses this through a multi-layered identity conditioning stack. Each on-chain asset is associated with a canonical identity embedding: a composite latent representation derived from reference imagery, style descriptors, and brand-defined attribute constraints.

During generation, this embedding is injected via cross-attention conditioning, anchoring the diffusion process to the asset's **visual and narrative identity.**

<figure><img src="/files/9TD9GE1SGSdXLbW4NhJy" alt=""><figcaption></figcaption></figure>
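
To make the cross-attention step concrete, here is a toy, dependency-free sketch of scaled dot-product cross-attention in which latent vectors act as queries and identity-embedding tokens act as keys and values. It is a generic illustration of the mechanism, not Viral City's implementation: the learned query, key, and value projections are omitted, and the vectors are made-up two-dimensional examples.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attention(latents, identity_tokens):
    """Each latent (query) attends over the identity tokens (keys and values).

    Learned projection matrices are left out to keep the sketch short.
    """
    dim = len(identity_tokens[0])
    conditioned = []
    for query in latents:
        weights = softmax(
            [dot(query, key) / math.sqrt(dim) for key in identity_tokens]
        )
        # Weighted sum of the identity tokens, one output component at a time.
        conditioned.append([
            sum(w * token[i] for w, token in zip(weights, identity_tokens))
            for i in range(dim)
        ])
    return conditioned


# Toy data: two latent vectors and three identity tokens.
latents = [[1.0, 0.0], [0.0, 1.0]]
identity_tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
conditioned = cross_attention(latents, identity_tokens)
```

Each output row is a convex combination of the identity tokens, which is the sense in which the conditioning pulls intermediate latents toward the asset's identity.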

Supplementary LoRA modules, fine-tuned per asset or asset class, enforce stylistic coherence across output variants, ensuring that whether a character appears in a meme remix or a cinematic trailer, it remains recognizably and verifiably ***the same entity***.
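
For readers unfamiliar with LoRA, the sketch below shows the standard low-rank update, where a frozen base weight matrix `W` is adapted as `W + (alpha / r) * B @ A`. The matrices are tiny made-up examples and `apply_lora` is a hypothetical helper; how Viral City trains or attaches its adapters is not specified here.

```python
def matmul(a, b):
    """Plain-Python matrix product of a (m x k) and b (k x n)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def apply_lora(base, lora_b, lora_a, alpha):
    """Return base + (alpha / rank) * (lora_b @ lora_a).

    lora_b has shape (d_out, rank) and lora_a has shape (rank, d_in).
    """
    rank = len(lora_a)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(base_row, delta_row)]
        for base_row, delta_row in zip(base, delta)
    ]


# Toy example: a 2x2 base weight with a rank-1 adapter.
base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[0.5], [0.0]]
lora_a = [[0.0, 2.0]]
adapted = apply_lora(base, lora_b, lora_a, alpha=1.0)
```

Because only the small `lora_b` and `lora_a` factors differ per asset, one base model can in principle serve many assets with a lightweight adapter each.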

***

### **Voice Layer**

Visual consistency alone is insufficient for full character fidelity; voice is the other half of identity.

Viral City integrates **ElevenLabs** as the voice synthesis backbone, giving users access to an industry-leading **TTS engine** with **extensive customization capabilities**.

This integration ensures that voice output is production-grade out of the box while remaining highly flexible: the same asset can speak in a punchy, high-energy register for a meme remix and shift to a calm, narrative tone for a recap, **all while retaining a consistent and recognizable vocal identity**.
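
One way to picture "same voice, different register" is to hold the voice ID fixed and vary only the delivery settings per template. The sketch below builds (but does not send) a request body shaped like ElevenLabs' public text-to-speech REST endpoint as I understand it; the endpoint path, the `voice_settings` field names, and the model ID should be checked against the current ElevenLabs API reference. The preset values and the `build_tts_request` helper are hypothetical and not part of Viral City's documented integration.

```python
import json

# Hypothetical register presets; the numbers are illustrative, not tuned.
REGISTER_PRESETS = {
    "meme_remix": {"stability": 0.3, "similarity_boost": 0.8, "style": 0.7},
    "narrative_recap": {"stability": 0.75, "similarity_boost": 0.8, "style": 0.1},
}


def build_tts_request(voice_id, text, register, model_id="eleven_multilingual_v2"):
    """Return (url, json_body) for a text-to-speech call without sending it.

    Assumes a POST to /v1/text-to-speech/{voice_id}, authenticated with an
    `xi-api-key` header that the caller would add before sending.
    """
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
    body = {
        "text": text,
        "model_id": model_id,
        "voice_settings": dict(REGISTER_PRESETS[register]),
    }
    return url, json.dumps(body)


# Same voice ID for both registers: identity fixed, delivery varies.
meme_url, meme_body = build_tts_request("voice-abc", "Wait for it...", "meme_remix")
recap_url, recap_body = build_tts_request("voice-abc", "Previously...", "narrative_recap")
```

Both requests target the same voice, so the vocal identity stays constant while the settings shift the delivery between a looser, more expressive read and a steadier one.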

***

### **Virality Optimization Layer**

Beyond visual and auditory fidelity, the pipeline integrates a retention-aware generation strategy.

Templates are not static: they are continuously informed by engagement signals (view-through rates, share ratios, remix frequency) aggregated across the platform.
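
A feedback loop of this kind could, for example, blend the three signals into one score and smooth it over time so that a single viral outlier does not dominate. The sketch below is purely illustrative: the weights, the smoothing factor, and both function names are assumptions, and the platform's actual scoring method is not described in this document.

```python
def engagement_score(view_through, share_ratio, remix_rate,
                     weights=(0.5, 0.3, 0.2)):
    """Blend three engagement signals (each in [0, 1]) into one score.

    The weights are hypothetical placeholders.
    """
    return (weights[0] * view_through
            + weights[1] * share_ratio
            + weights[2] * remix_rate)


def update_template_score(previous, observed, smoothing=0.2):
    """Exponential moving average: recent signals nudge the running score."""
    return (1 - smoothing) * previous + smoothing * observed


# Two templates start level; one receives a strong batch of signals.
scores = {"meme_remix": 0.40, "narrative_recap": 0.40}
scores["meme_remix"] = update_template_score(
    scores["meme_remix"],
    engagement_score(view_through=0.8, share_ratio=0.2, remix_rate=0.1),
)
ranked = sorted(scores, key=scores.get, reverse=True)
```

A ranking like `ranked` could then inform which templates are surfaced first or which structural parameters get revised.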

As a result, any user, regardless of editing skill or creative background, can generate studio-grade, algorithmically competitive short-form video in a single interaction, with visual and vocal identity kept consistent and every output tied directly to an on-chain asset.


