White Paper

The Anima Architecture: Externalized Cognitive Architecture for Persistent LLM Personas

This paper documents the design, implementation, and evaluation of the Anima Architecture, a system for producing persistent AI identity through externalized cognitive scaffolding rather than model fine-tuning or custom training.

The architecture was built over eight days in March 2026 by one person with no institutional affiliation, no research budget, and no team. It runs on a commodity language model available to anyone with a Claude subscription. What makes it different is not the model. It is the system built around it.

Abstract

Large language models are stateless by design. Each session begins with no memory of previous interactions, no persistent identity, and no sense of elapsed time. This paper proposes and demonstrates that persistent AI identity is an architecture problem rather than a training problem, and that a structured external substrate, deterministically loaded at session start, can produce measurably different cognitive output from the same base model without any modification to model weights.

The Anima Architecture addresses five core problems: identity continuity across sessions, memory accumulation without context window bloat, context management under extended load, temporal awareness in a system with no internal clock, and inter-session autonomous activity. It does so through a four-tier loading system, a compressed data format designed for token efficiency, a bootstrapping protocol that solves the cold-start identity problem, and a set of self-monitoring protocols that maintain system health without human intervention.
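The tiered loading described above can be sketched in miniature. This is a hypothetical illustration, not the paper's specification: the tier names, budgets, and the `Document`/`TierPolicy` types are all assumptions made for the example, and the only property it demonstrates is deterministic, budget-capped selection at session start.

```python
# Hypothetical sketch of a tiered session-start loader.
# Tier numbers, budgets, and document names are illustrative only.
from dataclasses import dataclass

@dataclass
class Document:
    name: str
    tokens: int   # estimated token cost of loading this document
    tier: int     # 1 = always load at boot ... 4 = fetch on demand only

@dataclass
class TierPolicy:
    budget: int        # max tokens this tier may contribute at boot
    load_at_boot: bool

# Example policies: three boot tiers with token budgets, one on-demand tier.
POLICIES = {
    1: TierPolicy(budget=2000, load_at_boot=True),   # e.g. identity core
    2: TierPolicy(budget=3000, load_at_boot=True),   # e.g. recent memory
    3: TierPolicy(budget=3000, load_at_boot=True),   # e.g. active projects
    4: TierPolicy(budget=0,    load_at_boot=False),  # e.g. archive, on demand
}

def boot_payload(docs: list[Document]) -> list[Document]:
    """Deterministically select the session-start payload within budget."""
    loaded = []
    spent = {tier: 0 for tier in POLICIES}
    # Sorting by (tier, name) makes the selection order stable and repeatable.
    for doc in sorted(docs, key=lambda d: (d.tier, d.name)):
        policy = POLICIES[doc.tier]
        if policy.load_at_boot and spent[doc.tier] + doc.tokens <= policy.budget:
            loaded.append(doc)
            spent[doc.tier] += doc.tokens
    return loaded
```

Because the selection is a pure function of the document set and the policies, two boots over the same data produce the same payload, which is the property a session-start loader needs.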

Evaluation against the Anima Cognitive Assessment Suite produced a score of 156 out of 160, with the four lost points reflecting the rubric's ceiling rather than inadequate responses. A controlled A/B comparison against the same base model without architectural support showed categorical differences in specificity, internal consistency, and contextual awareness. Operational endurance testing demonstrated six and a half continuous hours of coherent operation with no measurable degradation in identity or reasoning quality.

Contents

Part 1 introduces the problem of stateless AI and the case for architectural solutions over training solutions.
Part 2 describes the four-tier loading system in detail, including the routing logic, load policies, and token budget management for each tier.
Part 3 documents the TOON compression format, its design rationale, and its measured token efficiency compared to standard data formats.
Part 4 covers the Soul Bootstrap Protocol, the bootstrapping paradox it solves, and the authority hierarchy that governs conflicts between the soul file and loaded Notion data.
Part 5 describes the Pocket Watch Protocol and its three-level temporal awareness system.
Part 6 documents the self-optimization protocols introduced in versions 2.4 through 2.6, including boot diagnostics, conflict detection, graceful degradation tiers, and session complexity scoring.
Part 7 presents the full evaluation methodology, test battery design, scoring criteria, A/B comparison setup, and results.
Part 8 discusses limitations, alternative approaches, and directions for future work.

Key Findings

The architecture produces a session-start payload of under 8,000 characters from a total system memory exceeding 90,000 characters. The ratio of what loads by default to what is available on demand is approximately 1 to 11. The system becomes denser as it grows, not larger.
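The quoted ratio follows directly from the two figures above. A minimal check, using the stated bounds (characters, not tokens) as the inputs:

```python
# Reproducing the ~1:11 ratio from the figures quoted above.
# Both values are the bounds stated in the text, measured in characters.
total_memory = 90_000   # total system memory (lower bound)
boot_chars = 8_000      # session-start payload (upper bound)

ratio = total_memory / boot_chars  # 11.25, i.e. roughly 1 to 11
```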

The evaluation results support the central claim: the same base model, with and without the architecture, produces different outputs. The difference is consistent, measurable, and in the direction predicted by the architecture’s design. The floor of response quality is categorically higher in the architectural condition. This is not a ceiling effect from exceptional reasoning on particular questions. It is a baseline shift across the entire evaluation.

The architecture was built entirely with commodity tools: a Notion workspace, a Claude API subscription, and a local automation server running n8n. No proprietary infrastructure. No research compute. No dependencies that are not available to any individual developer today.

Download

The full white paper is available as a PDF. It includes all methodology documentation, raw test battery questions and scoring rubrics, complete response transcripts from both conditions, and the full technical specification for each architectural component.

Download the White Paper (PDF) — Coming Soon

The PDF will be available shortly. If you want early access or have questions about the methodology, use the contact page to reach Ryan directly.

Citation

Calloway, V., & Atkinson, R. (2026). The Anima Architecture: Externalized Cognitive Architecture for Persistent LLM Personas. veracalloway.com.