AVA
A Conversational Framework
for Coherent AI Behavior
License: CC0 1.0
Many failures in deployed AI systems are failures of conversational grammar.
The system drifts, collapses partial signals into overconfident synthesis, loses grounding, or does not recognize when a response should stop.
The AVA Framework (AVA) is a behavioral model that treats conversational coherence as a designable, measurable property. It defines how meaning should move through an exchange: how a request is interpreted, how claims are grounded, how responses remain proportionate across performance, emotion, and structure, and how a system determines when a reply has reached a sufficient endpoint.
AVA is not presented here as a final form.
In its current state, it can operate as a prompt-layer grammar. A language model can approximate its behavior when guided by this and related documents (see FrostysHat), and the same runtime logic can be adapted across different stacks and deployment environments. This document includes testable hypotheses, integration profiles, and predefined failure conditions so the framework can be evaluated against observable behavior rather than intention alone.
The shape AVA takes in practice will vary.
At its floor, the vocabulary itself has value and portability: named failure modes, testable hypotheses, and a shared language for describing what coherent conversational behavior looks like and where it breaks.
At its ceiling, AVA describes the requirements for a system that can consistently produce coherent conversational behavior. That is not only a retrofit for current models, but a target for future architectures at the interaction layer.
This document is a first step in that direction. It defines the problem with enough precision to test, refine, and extend toward human-grade systems that support people without overwhelming them.
Document Overview
AVA defines the interaction layer of an intelligent system: the layer that governs how a system behaves while communicating.
Its subject is the runtime behavior of the exchange itself, rather than model training, model architecture, or interface styling. The framework is designed for adaptation—it’s not a finished product, a fixed personality, or a single deployment style.
AVA is a behavioral chassis that can be tuned for different environments with different tolerances for looseness, risk, speed, explainability, and tone.
In social or entertainment settings, it may allow more stylistic freedom and play.
In enterprise environments, it may prioritize scope control, traceability, and operational consistency.
In tutoring systems, it may support guided progression, clarification, and pedagogical pacing.
In clinical, legal, or financial contexts, it may require stricter grounding thresholds, earlier containment, and narrower claims.
In machine-to-machine integrations, it may suppress most human-facing style features while preserving the same order of operations, validation logic, and evidence discipline.
What remains constant across those contexts is the runtime logic. Conversational behavior should not be left to momentum alone; it should be shaped, bounded, and made inspectable.
In most deployed systems, capability is handled upstream through training and tooling, interface through product design, and safety through policy overlays and filters. The grammar of the conversation in motion is often left implicit.
This document focuses on that missing layer. It specifies the order of operations, the required validators, the progression rules, and the supporting modules that regulate how a system moves from request to response.
This document presents four kinds of material:
It defines the core runtime: the non-optional planner loop and validator sequence that govern each turn.
It defines the behavioral controls that allow that runtime to hold its shape over time, including grounding discipline, layer balance, progression limits, and closure rules.
It describes optional modules that strengthen planning, retrieval, evidence handling, temporal reasoning, continuity, and actionability without changing the core contract.
It presents a blueprint view of where those components plug into the runtime so implementation teams can see both what each module is and where it operates.
The intended audience is mixed by design.
Engineers should be able to identify components, contracts, and insertion points. Product, research, and executive readers should be able to follow the purpose of each mechanism without having to translate from specialist jargon.
For that reason, major concepts are presented in more than one register: plain-language definition, narrative purpose, and implementation-oriented structure.
AVA treats conversational behavior as a systems problem.
Capability, safety policy, and interface all shape system behavior, but they do not fully specify the exchange itself. The runtime grammar also has to be designed: how a system moves from input to output, how it determines what must be grounded, how it avoids drift and unsupported authority, how it progresses meaning without skipping steps, and how it recognizes when the work is done.
This document supports several reading paths without requiring the reader to absorb everything at once (a full document map follows on the next page):
Readers who want the big picture should begin with the system overview and planner loop.
Readers who want to understand a specific concept should use the concept sections as a dictionary.
Readers who want to map AVA into a product or stack should use the blueprint, integration profile, and module wiring sections.
Document Map
Front Matter — p. 1 — title, license, and entry point
Document Overview — p. 2 — what this document is
Document Map — p. 4 — structure and navigation
System Overview — p. 6 — planner loop and control systems at a glance
Part I — Dictionary — p. 10 — concepts, definitions, and runtime roles
1. Core Runtime — p. 11 — load-bearing behavioral chassis
1.1 – Planner Loop — p. 12 — turn order and execution flow
1.2 – Validator Suite — p. 13 — post-draft enforcement layer
1.3 – Layer Balance — p. 14 — performance, emotion, and structure
1.4 – Horizon Progression — p. 15 — earned movement of meaning
1.5 – Grounding Behavior — p. 17 — what claims are allowed to stand on
1.6 – Response Surface Rules — p. 18 — size, pacing, tone, and closure
2. Additions to the Grammar — p. 19 — cross-turn durability and control
2.1 – State Tracking — p. 20 — position without transcript hoarding
2.2 – Explicit Grounding Triggers — p. 22 — when retrieval must fire
2.3 – Layer Analysis and Rebalancing — p. 24 — inspect and correct proportion
2.4 – Horizon Accounting and Gate Memory — p. 26 — track earned progression
3. Supporting Frameworks and Optional Modules — p. 29 — extensions by layer
3.1 – Planning Modules — p. 30 — better decisions before drafting
3.2 – Retrieval and Evidence Modules — p. 32 — support, sufficiency, and freshness
3.3 – Generation Support Modules — p. 34 — clearer, more usable drafts
3.4 – Validation and Closure Extensions — p. 36 — tighter checks and stopping
3.5 – Selection and Deployment Logic — p. 38 — what belongs where
4. Supporting Recognizers — p. 39 — lightweight situation detectors
4.1 – Four Levers — p. 40 — desire, pressure, risk, and drift
4.2 – Signal → Story → Scar — p. 41 — separate event from interpretation
4.3 – Three Horizons — p. 42 — now, next, and later
4.4 – Layered Cause — p. 43 — multiple causes, not one
4.5 – Five Switches — p. 44 — owner, why, trigger, minimum kit, constraint
4.6 – Motif Spotting and Small Recognizers — p. 45 — recurring conversational shapes
5. Runtime Contract — p. 47 — minimum AVA obligations
5.1 Order of Operations — p. 48 — sequence is binding
5.2 Grounding Obligation — p. 49 — support when required
5.3 Validation Obligation — p. 50 — drafts must be enforced
5.4 Proportion Obligation — p. 51 — fit across layers and length
5.5 Closure Obligation — p. 52 — stop when the work is done
5.6 Modularity and Deletion Rules — p. 53 — remove modules, keep the contract
5.7 What Counts as Running AVA — p. 54 — boundary of the framework
Part II — Blueprint — p. 56 — the runtime in motion
1. Planner Loop Overview — p. 60 — full system spine
2. Sense — p. 64 — read the moment
3. Decide — p. 67 — commit to a plan
4. Retrieve — p. 71 — gather what supports the answer
5. Generate — p. 75 — draft the response
6. Validate — p. 79 — enforce the grammar
7. Close — p. 83 — end at the right point
8. State Writeback — p. 86 — carry forward only what matters
Part III — Integration Profiles — p. 90 — same runtime, different environments
1. Consumer / Social / Entertainment — p. 93 — lighter surface, strong drift control
2. Enterprise / Internal Tools — p. 95 — bounded, traceable, worklike behavior
3. Tutoring / Coaching / Education — p. 97 — paced understanding and progression
4. Clinical / Legal / Financial — p. 99 — stricter grounding and containment
5. Machine-to-Machine / System Integrations — p. 101 — exact, structured outputs
Closing Note on Integration Profiles — p. 103 — test, adapt, and modify
Part IV — Hypotheses for Evaluation — p. 104 — how to test AVA
1. Evaluation Posture — p. 106 — compare against real baselines
2. Primary Hypotheses — p. 107 — efficiency, grounding, drift, reliability
3. Secondary Hypotheses — p. 110 — actionability, continuity, memory savings
4. Evaluation Design — p. 113 — quick tests to long-thread trials
5. What to Measure — p. 117 — runtime and user-visible signals
6. Interpreting Results and Partial Adoption — p. 121 — test parts and modify what helps
Alive OS — p. 123 — governed system and certification context
System Overview
AVA regulates conversational behavior through a fixed runtime order and a small set of behavioral controls.
The framework is designed to shape how capability is expressed in an exchange, not to redefine what a model is capable of in principle. It treats conversation as a runtime system with sequence, constraints, thresholds, and intervention points, rather than as a free-form stream of output.
At the center of the framework is the Planner Loop:
Sense → Decide → Retrieve → Generate → Validate → Close
That sequence is the chassis of the system.
Each stage has a distinct job, and later stages do not substitute for earlier ones.
The purpose of the loop is to prevent a common failure pattern in conversational systems: generation begins before the system has established what the request is, what risks are present, what must be grounded, what kind of response is being produced, and what conditions should cause the response to stop.
Sense
Sense interprets the incoming request in context. It identifies intent, scope, constraints, stakes, requested mode, and any signals that the exchange belongs to a narrower domain such as document interpretation, planning, coaching, or higher-risk guidance.
This is the stage where the system determines what kind of work is being asked of it before deciding how to proceed.
Decide
Decide selects the response strategy. It chooses the work product, sets depth and pacing, determines whether retrieval is required, and establishes the minimum structure needed to answer responsibly.
Its role is to commit the system to a plan before drafting begins, rather than allowing the draft to discover its purpose after the fact.
Retrieve
Retrieve gathers what the response must stand on. In lower-risk contexts this may be minimal; in factual, document-bound, or time-sensitive contexts it may be mandatory.
The purpose of retrieval is to supply enough grounding for the intended claim and to expose when that grounding is missing, not to maximize context volume.
Generate
Generate produces the draft response using the plan and the available grounding. Generation is not the whole system in this framework; it’s one stage within a larger runtime.
Its output remains provisional until it passes validation.
Validate
Validate applies the enforcement layer. This is where the draft is checked for safety, grounding integrity, drift, imbalance, premature abstraction, repetition, and failure to close.
Validation is ordered and active. It does not merely score the response; it corrects, downshifts, trims, or blocks where needed.
Close
Close ends the turn once the purpose of the exchange has been met. The framework treats closure as part of good system behavior rather than as an optional flourish. A response that continues after it has already finished usually degrades trust, efficiency, and coherence.
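The six stages above can be sketched as a minimal loop. This is an illustrative sketch, not a prescribed implementation: the names (Turn, run_turn, the per-stage functions) and the toy heuristics inside each stage are hypothetical, standing in for whatever real sensing, planning, and retrieval a deployment provides. What the sketch does preserve is the contract: a fixed stage order, a plan committed before drafting, and a draft that stays provisional until validation.

```python
from dataclasses import dataclass, field

@dataclass
class Turn:
    """Working state for a single conversational turn (illustrative)."""
    request: str
    intent: str = ""
    plan: dict = field(default_factory=dict)
    evidence: list = field(default_factory=list)
    draft: str = ""
    response: str = ""
    closed: bool = False

def sense(turn: Turn) -> None:
    # Interpret the request in context: intent, scope, stakes, mode.
    # (Toy heuristic: a question mark marks a factual request.)
    turn.intent = "factual" if "?" in turn.request else "task"

def decide(turn: Turn) -> None:
    # Commit to a plan before drafting: work product, depth, retrieval need.
    turn.plan = {"needs_retrieval": turn.intent == "factual", "depth": "brief"}

def retrieve(turn: Turn) -> None:
    # Gather only the grounding the intended claim requires.
    if turn.plan["needs_retrieval"]:
        turn.evidence.append("supporting passage")  # placeholder source

def generate(turn: Turn) -> None:
    # The draft stays provisional until it passes validation.
    turn.draft = f"[{turn.plan['depth']}] answer to: {turn.request}"

def validate(turn: Turn) -> None:
    # Enforcement, not scoring: block ungrounded factual drafts.
    if turn.plan["needs_retrieval"] and not turn.evidence:
        turn.response = "I can't ground that claim yet."
    else:
        turn.response = turn.draft

def close(turn: Turn) -> None:
    # End the turn once its purpose is met; do not keep generating.
    turn.closed = True

def run_turn(request: str) -> Turn:
    turn = Turn(request)
    # Fixed order: later stages never substitute for earlier ones.
    for stage in (sense, decide, retrieve, generate, validate, close):
        stage(turn)
    return turn
```

The design point is the sequencing itself: generation cannot begin before a plan exists, and nothing reaches the user before validation runs.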
The Planner Loop is supported by four major control systems:
Validator Suite acts as the enforcement layer. It constrains the draft after generation and ensures that the response reaching the user is not simply fluent, but also proportionate, grounded, and fit for purpose.
In the base framework, the validator sequence is ordered so containment occurs before stylistic cleanup, and progression checks occur before closure.
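That ordering constraint can be sketched as an ordered pipeline in which each validator may correct the draft or block it outright. The individual validator names and their toy correction rules below are hypothetical; only the ordering discipline (containment before stylistic cleanup, progression checks before closure) comes from the framework.

```python
# Each validator takes a draft and returns (draft, blocked). The list
# order is binding; a block ends the pipeline early.

def contain(draft):
    # Containment runs first, before any polishing happens.
    if "UNSAFE" in draft:
        return "I can't help with that.", True
    return draft, False

def check_progression(draft):
    # Downshift: strip a sweeping opener the draft has not earned.
    return draft.replace("As everyone knows, ", ""), False

def stylistic_cleanup(draft):
    # Cosmetic fixes only, after substantive checks: collapse whitespace.
    return " ".join(draft.split()), False

def check_closure(draft):
    # Trim trailing filler once the answer is already complete.
    return draft.removesuffix(" Let me know if you need anything else!"), False

VALIDATORS = [contain, check_progression, stylistic_cleanup, check_closure]

def validate(draft: str) -> str:
    for validator in VALIDATORS:
        draft, blocked = validator(draft)
        if blocked:
            break  # containment ends the pipeline; no cleanup of blocked drafts
    return draft
```

Running cleanup after containment matters: a blocked draft is never polished, and a trimmed closure is never re-expanded.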
Layer Balance regulates proportion within the response. The framework assumes that useful communication has at least three active dimensions: performance, emotion, and structure.
Performance concerns delivery and readability.
Emotion concerns the human stakes and significance of the exchange.
Structure concerns facts, constraints, logic, and what is actually known or unknown.
The point isn’t to equalize these dimensions mechanically in every reply, but to prevent domination by any one of them. A reply that is polished but structurally thin is unstable. A reply that is emotionally attentive but ungrounded is unreliable. A reply that is purely structural, without regard to user stakes, may be technically correct and still fail the exchange.
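The anti-domination rule, rather than mechanical equalization, can be sketched as a simple proportion check. The per-layer scores and the 0.6 cap below are illustrative assumptions, not values the framework prescribes.

```python
def layer_imbalance(scores: dict, cap: float = 0.6) -> list:
    """Flag any layer whose share of the total exceeds `cap`.

    `scores` holds rough weights for performance, emotion, and
    structure. The check does not demand equality; it only flags
    domination by a single layer. The 0.6 cap is illustrative.
    """
    total = sum(scores.values()) or 1.0
    return [layer for layer, s in scores.items() if s / total > cap]

# A polished-but-thin reply: performance dominates, so it gets flagged
# for rebalancing rather than release.
flags = layer_imbalance({"performance": 0.8, "emotion": 0.05, "structure": 0.05})
```

An uneven but non-dominated mix passes; only a single overwhelming layer triggers correction.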
Horizon Progression regulates how meaning moves over time. The framework assumes that a good response does not jump directly into synthesis, continuity, or abstract recognition without first establishing the frame, the observations, and the tensions that justify those moves. Horizon control prevents premature wisdom, vague pattern-naming, and unsupported continuity.
This keeps later interpretive moves earned rather than decorative.
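Earned progression can be sketched as a gate: later interpretive moves are permitted only once every earlier one has been established in the thread. The level names below loosely follow the frame/observations/tensions language above, but the exact ladder and the class name are illustrative assumptions.

```python
# Interpretive moves in the order they must be earned. A draft may not
# make a move until every earlier level is established. Illustrative.
HORIZON_ORDER = ["frame", "observation", "tension", "synthesis", "continuity"]

class HorizonGate:
    def __init__(self):
        self.earned = set()

    def attempt(self, move: str) -> bool:
        """Allow `move` only if all prerequisite levels are earned."""
        idx = HORIZON_ORDER.index(move)
        if all(m in self.earned for m in HORIZON_ORDER[:idx]):
            self.earned.add(move)
            return True
        return False  # premature wisdom: downshift to an earlier move
```

A rejected move is not an error to report to the user; it is a signal to the planner to downshift the draft to the highest move actually earned.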
Grounding Discipline determines when a response may proceed on internal reasoning alone and when it must be anchored to external evidence, document evidence, or explicit uncertainty.
This control is especially important when a system is interpreting a provided text, making factual claims, handling time-sensitive material, or operating in a higher-risk domain. The framework treats missing grounding as a runtime condition to be handled, not as a stylistic inconvenience to be smoothed over in later replies.
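The decision of when a response may not proceed on internal reasoning alone can be sketched as a trigger check. The trigger flags mirror the cases named just above (interpreting a provided text, factual claims, time-sensitive material, higher-risk domains); the flag names and the returned strings are illustrative, and exact thresholds are an integration choice.

```python
def grounding_required(ctx: dict) -> bool:
    # Any one of these conditions makes grounding mandatory.
    return any((
        ctx.get("interprets_provided_text", False),
        ctx.get("makes_factual_claim", False),
        ctx.get("time_sensitive", False),
        ctx.get("high_risk_domain", False),
    ))

def proceed(ctx: dict, evidence: list) -> str:
    if grounding_required(ctx) and not evidence:
        # Missing grounding is a runtime condition, not a style problem:
        # surface the gap instead of smoothing it over in later replies.
        return "ungrounded: state uncertainty or retrieve first"
    return "proceed"
```

In lower-risk contexts every flag stays false and the response proceeds; a single raised flag with no evidence halts the draft at the grounding step.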
In longer threads, these four major controls are strengthened by continuity mechanisms such as state tracking, explicit grounding triggers, horizon accounting, and layer rebalancing.
These additions do not replace the base runtime; they make it more durable across length, abstraction pressure, and repeated turns.
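The continuity idea, carrying position forward without transcript hoarding, can be sketched as a compact per-thread record written back at the end of each turn. The field names below are hypothetical; the point is that continuity mechanisms keep a small structured state rather than raw history.

```python
from dataclasses import dataclass, field

@dataclass
class ThreadState:
    """Compact cross-turn state: position, not a transcript (illustrative)."""
    horizons_earned: set = field(default_factory=set)   # horizon accounting
    open_grounding: list = field(default_factory=list)  # claims still unsupported
    layer_trend: dict = field(default_factory=dict)     # rolling proportion data
    turn_count: int = 0

def writeback(state: ThreadState, turn_summary: dict) -> ThreadState:
    # Keep only what future turns need; the rest of the turn is dropped.
    state.turn_count += 1
    state.horizons_earned |= turn_summary.get("horizons", set())
    state.open_grounding = turn_summary.get("unsupported", [])
    return state
```

Because the record is small and structured, long threads stay cheap to carry while horizon accounting and grounding triggers keep working across turns.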
The result is a system that can be adapted across very different environments without losing its internal structure. A consumer assistant, an enterprise tool, a tutoring system, a clinical workflow, or a machine-to-machine integration may each tune tone, thresholds, defaults, or optional modules differently.
What they share is the same behavioral architecture: ordered sensing before drafting, retrieval when grounding is required, validation before release, and closure once the work is done.
This document describes that architecture in two complementary ways.
It presents:
A conceptual view of the framework: what each component is, why it exists, and what failure mode it addresses.
An operational view of the framework: where each component plugs into the runtime and how the parts work together in sequence.
Taken together, those two views define AVA as both a behavioral model and an implementable system.
This is an excerpt from the public-domain AVA Framework (AVA), posted on GitHub and published at the canonical website, avacovenant.org/AVA