Aug 31, 2025
How to give AI a custom language to think in
William Chen
Consider what makes grep powerful. It's not just a search function - it's a language:
grep -r -l "pattern" . | xargs grep -c "pattern" | sort -t: -k2 -n
This pipeline recursively finds files containing a pattern, counts occurrences in each, and sorts by frequency. But here's what makes this fundamentally different from tool calling: the entire computational graph is expressed upfront. The shell doesn't execute grep, wait for results, append them to a history, re-read everything to decide what to do next, then call xargs. The pipeline is a complete thought - data flows through transformations without intermediary decision points.
Contrast this with how tool calling works in current agent frameworks:
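In pseudocode, the loop looks something like this - a schematic sketch, not any particular SDK's API:

// A schematic tool-calling loop. generateText and callTool are stand-ins.
type ToolCall = { name: string; args: unknown };
type Msg = { role: string; content: string; toolCall?: ToolCall };

declare function generateText(req: { model: string; messages: Msg[] }): Promise<Msg>;
declare function callTool(call: ToolCall): Promise<string>;

async function agentLoop(model: string, messages: Msg[]): Promise<Msg[]> {
  while (true) {
    // Full model invocation: the entire transcript is re-read every time
    const reply = await generateText({ model, messages });
    messages.push(reply);
    if (!reply.toolCall) return messages;              // the model chose to stop
    const result = await callTool(reply.toolCall);     // one atomic tool call
    messages.push({ role: "tool", content: result });  // append, then loop again
  }
}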
This is maximally inefficient. Every tool call requires a full model invocation just to decide what to do with the result. The agent can't express "search for X, then filter by Y, then summarize" as a single computational thought. It must interleave execution with decision-making, rebuilding its entire context from scratch at each step.
The message list architecture makes sense for one thing: when you genuinely need the agent to make a new decision based on intermediate results. But that's maybe 20% of cases. For the other 80%, the agent already knows the flow it wants: "validate this, transform that, aggregate results." Yet it's forced to pretend each step is a surprise requiring deep deliberation.
Worse, agents can't pre-express conditional logic. In bash, you write:
if grep -q "ERROR" logfile; then
  tail -n 100 logfile | mail -s "Error detected" admin@example.com
fi
The entire decision tree is declared upfront. With tool calling, the agent must:
Call check_for_errors tool
Wait for result to append to messages
Re-process entire conversation
Decide whether to send email
Call send_email tool if needed
Five model invocations for what should be a single conditional expression. The agent has no way to say "if error, then email" as a complete thought. It must perform this logic through the message list, like a CPU that forgets its instruction pointer after every operation.
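Compare the shape of the thought the agent should be able to emit - a single pre-declared expression of the same conditional, in the style of the notation this post builds toward (If, CheckLog, Tail, and SendEmail are illustrative names, not a real vocabulary):

If({
  condition: CheckLog({ file: "logfile", pattern: "ERROR" }),
  then: SendEmail({
    to: "admin@example.com",
    subject: "Error detected",
    body: Tail({ file: "logfile", lines: 100 })
  })
})

One expression, one model invocation; the runtime evaluates the conditional.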
This is why coding agents work so well. When an agent writes code, it's not just calling functions - it's expressing complete computational graphs. It can declare entire flows: loops, conditionals, error handling, state management. The code is the thought, fully formed, not scattered across message history.
But what happens when we ask agents to orchestrate business processes, data pipelines, or customer workflows? We force them to express everything through atomic tool calls, each one requiring the entire conversational context to be reprocessed just to make the next micro-decision.
The solution isn't to make all agents code in Python. Business logic shouldn't be expressed in programming languages any more than it should be expressed in individual tool calls. What we need are domain-specific languages that let agents express complete computational thoughts - entire flows, decision trees, and transformations - in concepts native to their problem space.
This post introduces two patterns that enable this: Cascade, which provides control flow without state machine complexity, and XJSN, which lets agents express domain-specific computational thought as structured data. Together, they allow you to design languages where agents can think in complete computational graphs rather than isolated tool calls.
The Power of Domain-Specific Languages
Consider how we actually work with powerful tools. grep isn't just a function - it's a language with rich compositional semantics:
grep -r -l "pattern" . | xargs grep -c "pattern" | sort -t: -k2 -n
This pipeline recursively finds files containing a pattern, counts occurrences in each, and sorts by frequency. But look closer at what makes this powerful. It's not the individual commands - it's the linguistic structure that emerges from their composition.
The -r flag doesn't just set a boolean; it fundamentally changes the search space from a file to a directory tree. The -l flag transforms the output type from matched lines to filenames. The pipe operator (|) creates a compositional chain where each command's output becomes the next command's input. This isn't three function calls - it's a sentence with grammar, where each element modifies and builds upon the others.
Bash itself reveals even deeper patterns:
for file in $(find . -name "*.log" -mtime -7); do
  if grep -q "ERROR" "$file"; then
    tail -n 100 "$file" | grep -A5 -B5 "ERROR" >> errors_summary.txt
  fi
done
This script demonstrates linguistic constructs that go beyond simple function composition. The for loop establishes an iteration context. The if statement creates conditional execution paths. The -A5 -B5 flags to grep create a context window around matches. The >> operator appends to a file, maintaining state across iterations.
These aren't just utilities - they're linguistic primitives that allow us to express complex computational thoughts. The power comes from three key properties:
Composability: Each element can be combined with others in predictable ways
Parameterization: Behavior can be modified through flags and arguments
Context preservation: State and environment flow through the execution
Meanwhile, here's what every Vercel AI agent looks like after a few iterations:
const tools = {
  search: (query) => fetchAPI(query),
  analyze: (data) => processData(data),
  pause_for_input: () => { /* undefined behavior */ },
  delegate_to_expert: () => { /* hack required */ },
  remember_context: () => { /* not possible */ }
};
Those last three aren't tools. They're attempted escape hatches from the framework's constraints. They represent the agent trying to express control flow concepts that don't map to simple function calls.
The Message List Architecture Problem
To understand why this matters, we need to examine the fundamental architecture of modern agent frameworks. The Vercel AI SDK, like most frameworks, treats the message list as the central abstraction. Every interaction appends to this list. Every decision is made by processing this list. The list is truth.
This works beautifully for simple conversational agents. But it breaks down catastrophically for operational agents that need to maintain state, execute complex procedures, or manage multiple contexts. After 20 messages, every Vercel AI agent begins to degrade. Not because the model is inadequate, but because it's re-processing an ever-growing context window just to maintain continuity.
The degradation follows a predictable pattern. First, the agent starts forgetting early instructions as they get pushed out of the effective context window. Then it begins confusing intermediate state with final outputs, treating diagnostic messages and tool call results as part of the conversation. Finally, it loses coherence entirely, unable to distinguish between different phases of execution.
The agent is reconstructing its entire operational state from a linear transcript on every invocation. It's computationally equivalent to a CPU that must re-read all previous instructions to execute the next one. No wonder it fails.
The message list is the source of truth - that's architecturally sound. What's unsound is that we can only append to it. We lack primitives for:
Filtering: Extracting relevant context without processing everything
Checkpointing: Saving and restoring state at specific points
Isolation: Running sub-computations without polluting the main context
Transformation: Modifying the message history for different purposes
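Each of these is just a function over message lists. A minimal sketch, assuming Message is a simple role/content record:

type Message = { role: string; content: string };

// Filtering: extract relevant context without processing everything
const filterMessages = (ms: Message[], keep: (m: Message) => boolean) => ms.filter(keep);

// Checkpointing: save and restore state at specific points
const checkpoints = new Map<string, Message[]>();
const checkpoint = (label: string, ms: Message[]) => { checkpoints.set(label, [...ms]); };
const restore = (label: string) => checkpoints.get(label) ?? [];

// Isolation: run a sub-computation on a copy; the main list stays clean
const isolate = async (ms: Message[], sub: (copy: Message[]) => Promise<unknown>) => {
  await sub([...ms]);
  return ms;
};

// Transformation: rewrite the history for a different purpose
const transformMessages = (ms: Message[], fn: (m: Message) => Message) => ms.map(fn);

None of this is exotic - the point is that frameworks don't expose these operations over their message lists.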
I needed to build something specific: a customer support agent that could handle complex, multi-step debugging sessions. The requirements revealed the inadequacy of current frameworks:
Suspend and Resume: The agent must pause execution, wait for user diagnostic commands, and resume with full context
Delegation: Sub-tasks must be routed to specialized models without polluting the main conversation
Working Memory: Operational state must persist separately from the conversation transcript
Deterministic SOPs: The agent must follow exact procedures, not probabilistic responses
The Vercel AI SDK cannot express these patterns. You can attempt to simulate pausing with generateUI, but resumption requires reconstructing the entire context. Delegation means your message history becomes a tangled mess of main agent and sub-agent conversations. Working memory must be encoded in ever-growing system prompts that eventually exceed token limits.
This is where developers typically adopt LangGraph - defining explicit state machines with nodes, edges, and transition conditions. But this requires hundreds of lines of boilerplate to express what should be simple control flow. You end up encoding your agent's logic twice: once in the graph definition and once in the node implementations.
The Cascade Pattern
The insight that led to Cascade was simple: TypeScript already encodes a computational graph. Every if statement is a conditional edge. Every function call is a node transition. Every try/catch block defines error boundaries. We don't need to reify this graph in a separate abstraction - we need clean injection points for agent logic.
The Cascade pattern emerged from asking: what are the natural moments in an agent's execution where we need to intervene? There are exactly three:
When the user provides input
When the AI generates a response
When a tool returns a result
These three moments form the complete lifecycle of agent interaction. Everything else is orchestration between these points. This led to the minimal interface:
class Agent {
  async handleUserMessage(messages: Message[]): Promise<{ messages: Message[] }> {
    return { messages };
  }
  async handleAiMessage(messages: Message[]): Promise<{ messages: Message[] }> {
    return { messages };
  }
  async handleToolMessage(messages: Message[]): Promise<{ messages: Message[] }> {
    return { messages };
  }
}
The beauty of this pattern is its simplicity. Each handler receives the current message list and returns a new one. No hidden state. No complex lifecycle. Just pure functions that transform messages.
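A runtime drives these handlers in a loop. Here's one plausible driver - the dispatch-by-role convention is an assumption of this sketch, not part of the pattern's contract:

// Route each new message to the matching handler; the handler returns
// the (possibly rewritten) message list that becomes the new truth.
async function step(agent: Agent, messages: Message[]): Promise<Message[]> {
  const last = messages[messages.length - 1];
  const handler =
    last.role === "user" ? agent.handleUserMessage :
    last.role === "tool" ? agent.handleToolMessage :
    agent.handleAiMessage;
  const { messages: next } = await handler.call(agent, messages);
  return next;
}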
But this simplicity enables sophisticated patterns that are impossible or extremely difficult with traditional frameworks. Let me demonstrate with concrete examples.
Self-Referential Reasoning
One of the most powerful patterns in human cognition is internal dialogue - the ability to reason through problems by questioning and answering ourselves. Current frameworks make this nearly impossible because every AI generation becomes part of the conversation history. The user sees the agent arguing with itself rather than receiving a coherent response.
With Cascade, we can implement true internal reasoning:
class ReflectiveAgent extends Agent {
  async handleAiMessage(messages) {
    const lastMessage = messages[messages.length - 1];
    if (lastMessage.content.includes("INTERNAL_REASONING_REQUIRED")) {
      const innerDialogue = [];
      // Create an isolated reasoning context
      for (let i = 0; i < 3; i++) {
        const thought = await generateText({
          model: "gpt-4",
          messages: [
            { role: "system", content: "Analyze step by step." },
            ...innerDialogue
          ]
        });
        innerDialogue.push(thought);
        // Check for convergence
        if (thought.content.includes("CONCLUSION:")) break;
      }
      // Synthesize the internal reasoning into a single response
      const synthesis = this.synthesize(innerDialogue);
      return {
        messages: [...messages, { role: "assistant", content: synthesis }]
      };
    }
    return { messages };
  }

  synthesize(thoughts) {
    // Extract key insights from internal dialogue
    const insights = thoughts.map(t => this.extractKeyPoints(t));
    // Build coherent response from insights
    return this.buildResponse(insights);
  }
}
The critical innovation here is that the internal dialogue never appears in the main message list. The agent can have a complex, multi-step reasoning process, potentially using different models or prompts for each step, while presenting only the final synthesized conclusion to the user. This isn't just about hiding implementation details - it's about maintaining clean separation between operational reasoning and conversational flow.
Stateful Human-in-the-Loop
The inability to properly pause and resume execution is one of the most frustrating limitations of current frameworks. When an agent needs user input mid-execution, it must either abandon its current context or attempt to encode everything in the message history, leading to token bloat and context confusion.
Cascade enables true suspension with state preservation:
class InteractiveAgent extends Agent {
  private suspended: Map<string, any> = new Map();

  async handleAiMessage(messages) {
    const lastMessage = messages[messages.length - 1];
    if (lastMessage.content.includes("REQUIRES_USER_INPUT")) {
      const sessionId = crypto.randomUUID();
      // Preserve only essential context, not the entire history
      this.suspended.set(sessionId, {
        messages: messages.slice(0, -10), // only preserve relevant context
        workingMemory: this.extractWorkingMemory(messages),
        nextPhase: this.determineNextPhase(lastMessage),
        checkpoint: Date.now()
      });
      // Mark the suspension point in the message stream
      return {
        messages: [...messages, {
          role: "system",
          content: `SUSPENDED:${sessionId}:${this.generateResumptionToken()}`
        }]
      };
    }
    return { messages };
  }

  async handleUserMessage(messages) {
    const systemMessage = messages.findLast(m => m.role === "system");
    if (systemMessage?.content.startsWith("SUSPENDED:")) {
      const [_, sessionId, token] = systemMessage.content.split(":");
      const suspended = this.suspended.get(sessionId);
      // Validate and restore context (guard against unknown session ids)
      if (suspended && this.validateToken(token, suspended.checkpoint)) {
        const resumedMessages = [
          ...suspended.messages,
          { role: "system", content: `RESUMED with context: ${suspended.workingMemory}` },
          messages[messages.length - 1] // the user's new input
        ];
        this.suspended.delete(sessionId);
        return { messages: resumedMessages };
      }
    }
    return { messages };
  }
}
This pattern solves multiple problems simultaneously. The agent can maintain working memory across suspension points without encoding it in prompts. It can preserve just the relevant context rather than the entire history. It can validate resumption to prevent context injection attacks. Most importantly, it can resume execution exactly where it left off, with full state restoration.
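In use, the round trip looks like this (the trigger strings are the ones the class above checks for; the transcript contents are made up):

const agent = new InteractiveAgent();
const history = [{ role: "user", content: "My deploy is failing." }];

// The AI asks for diagnostics; the agent checkpoints and marks the stream
let { messages } = await agent.handleAiMessage([
  ...history,
  { role: "assistant", content: "REQUIRES_USER_INPUT: please run the ping check" }
]);

// Later, the user replies; the agent validates the token and resumes
({ messages } = await agent.handleUserMessage([
  ...messages,
  { role: "user", content: "ping: 12ms, no packet loss" }
]));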
Control Flow Through Tool Results
In traditional frameworks, tool calls are side effects. They fetch data or perform actions, but they don't fundamentally alter the agent's execution path. This limitation forces complex control flow into prompt engineering, making agents brittle and unpredictable.
Cascade treats tool results as first-class control flow primitives:
class ControlFlowAgent extends Agent {
  async handleToolMessage(messages) {
    const toolResult = messages[messages.length - 1];
    if (toolResult.name === "complexity_analysis") {
      const analysis = JSON.parse(toolResult.content);
      // The tool result determines the execution path
      if (analysis.complexity_score > 0.8) {
        // Fork to a specialized execution path
        const specializedContext = {
          model: "claude-3-opus",
          temperature: 0.1,
          systemPrompt: this.generateSpecialistPrompt(analysis),
          maxTokens: 4000
        };
        const expertResult = await this.executeInContext(
          specializedContext,
          analysis.decomposed_problem
        );
        return {
          messages: [...messages, {
            role: "assistant",
            content: this.formatExpertResponse(expertResult)
          }]
        };
      }
      // Continue the normal execution path
      return { messages };
    }
    return { messages };
  }

  async executeInContext(context, task) {
    // Isolated execution environment
    const isolatedMessages = [
      { role: "system", content: context.systemPrompt },
      { role: "user", content: JSON.stringify(task) }
    ];
    return await generateText({
      model: context.model,
      messages: isolatedMessages,
      temperature: context.temperature,
      maxTokens: context.maxTokens
    });
  }
}
The key insight is that tool results can trigger entirely different execution contexts. A complexity analysis might route to a specialist model. A permission check might enforce access controls. A resource check might switch to a more efficient model. The tool isn't just providing data - it's directing the computation itself.
The Language Problem
Even with Cascade providing clean control flow, agents still struggle to express complex operational patterns. The issue isn't just about control flow - it's about the poverty of expression available to agents trying to communicate their computational intent.
Consider what happens when an agent needs to describe a multi-step data pipeline with error handling and conditional logic:
{
  "tool": "complex_workflow",
  "parameters": {
    "steps": [
      {
        "type": "parallel",
        "branches": [
          {
            "id": "branch_1",
            "operations": [
              {
                "op": "fetch",
                "source": "api",
                "retry": { "attempts": 3, "backoff": "exponential", "initial_delay": 100 }
              },
              {
                "op": "transform",
                "schema": {
                  "type": "object",
                  "properties": {
                    "nested": {
                      "type": "object",
                      "properties": { "deeply": { "type": "string" } }
                    }
                  }
                }
              }
            ]
          }
        ]
      }
    ]
  }
}
This JSON structure is attempting to encode program semantics - parallelism, sequencing, error handling, data transformation. But JSON wasn't designed for this. It lacks:
Variables and references: No way to refer to earlier results
Composition: Can't build complex operations from simpler ones
Abstraction: Can't define reusable patterns
Conditionals: No native way to express if-then-else logic
LLMs excel at generating code because they've been trained on millions of examples of function composition, variable references, and control flow. When we force them to encode these patterns in rigid JSON schemas, we're working against their strengths.
XJSN: Extensible JavaScript Notation
The solution emerged from studying successful data notations that bridge the gap between human-readable and machine-parseable. Clojure's EDN (Extensible Data Notation) provided key inspiration - it looks like code but is pure data. Clojure.spec showed how to add schemas and validation to such structures.
XJSN operates on three fundamental principles:
1. Syntactic Familiarity, Semantic Safety
The notation uses JavaScript's function call syntax, but parses to pure data structures. This isn't about enabling code execution - it's about leveraging syntactic patterns that LLMs generate reliably.
// This looks like JavaScript
Transform({
  input: Select({ from: "users", where: { active: true } }),
  apply: [
    Normalize({ field: "email", method: "lowercase" }),
    Validate({ field: "age", constraint: GreaterThan(18) })
  ]
})

// But parses to this AST
{
  type: "FunctionCall",
  name: "Transform",
  args: {
    input: {
      type: "FunctionCall",
      name: "Select",
      args: { from: "users", where: { active: true } }
    },
    apply: [
      { type: "FunctionCall", name: "Normalize", args: {...} },
      { type: "FunctionCall", name: "Validate", args: {...} }
    ]
  }
}
The key insight is that we're not evaluating these expressions - we're interpreting them. The runtime decides what Transform, Select, and Normalize mean in its specific context. This gives us safety (no arbitrary code execution) while preserving expressiveness (nested function composition).
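Here's a minimal sketch of one plausible interpreter core - a registry maps names to handlers, and unknown names are rejected rather than executed (the node shape matches the AST above):

type CallNode = { type: "FunctionCall"; name: string; args: unknown };
type Handler = (args: unknown, interpret: (n: unknown) => unknown) => unknown;

class XJSNInterpreter {
  constructor(private registry: Map<string, Handler>) {}

  interpret(node: unknown): unknown {
    if (typeof node !== "object" || node === null) return node; // literals pass through
    const call = node as CallNode;
    if (call.type !== "FunctionCall") return node; // plain data passes through
    const handler = this.registry.get(call.name);
    if (!handler) throw new Error(`Unknown operation: ${call.name}`); // never executed as code
    return handler(call.args, (child) => this.interpret(child));
  }
}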
2. Leveraging the LLM's Training Distribution
Models have seen millions of examples of JavaScript, Python, and similar languages. They understand function calls, method chaining, and nested expressions at a deep level. By using familiar syntax, we tap into this learned knowledge.
Consider how naturally an LLM can generate:
Pipeline([
  Filter({ status: "active" }),
  Map({ extract: ["id", "name", "email"] }),
  GroupBy({ field: "department" }),
  Aggregate({ count: Count(), average_age: Average("age") })
])
This flows naturally because it mirrors patterns the model has seen thousands of times in training data. Compare this to the equivalent JSON schema - the model must fight against its training to produce the rigid structure.
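For contrast, one plausible JSON encoding of that same pipeline - illustrative, not any particular framework's format:

{
  "tool": "pipeline",
  "parameters": {
    "stages": [
      { "type": "filter", "conditions": [{ "field": "status", "op": "eq", "value": "active" }] },
      { "type": "map", "config": { "extract": ["id", "name", "email"] } },
      { "type": "group_by", "config": { "field": "department" } },
      { "type": "aggregate", "config": { "metrics": [
        { "name": "count", "function": "count" },
        { "name": "average_age", "function": "average", "field": "age" }
      ] } }
    ]
  }
}

Every operation collapses into a "type" tag plus a "config" bag, and nesting grows faster than meaning.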
3. Semantic Extensibility Through Interpretation
The power of XJSN comes from the separation between syntax and semantics. The same syntactic structure can mean different things in different contexts. This is inspired by Lisp's macro system, where code structure is just data until interpretation gives it meaning.
A single XJSN expression like Retry({ attempts: 3 }) might mean:
In a network context: retry failed HTTP requests
In a database context: retry deadlocked transactions
In an AI context: regenerate responses that fail validation
In a UI context: re-render components that error
The interpreter provides the semantic layer. This allows domain-specific languages to emerge naturally from the same syntactic foundation.
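Concretely, two interpreters can give the same expression different semantics just by registering different handlers - a sketch building on the XJSNInterpreter above, with made-up result shapes:

const retryNode = { type: "FunctionCall" as const, name: "Retry", args: { attempts: 3 } };

const network = new XJSNInterpreter(new Map<string, Handler>([
  ["Retry", (args: any) => ({ kind: "http_retry", attempts: args.attempts })]
]));
const database = new XJSNInterpreter(new Map<string, Handler>([
  ["Retry", (args: any) => ({ kind: "deadlock_retry", attempts: args.attempts })]
]));

network.interpret(retryNode);  // => { kind: "http_retry", attempts: 3 }
database.interpret(retryNode); // => { kind: "deadlock_retry", attempts: 3 }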
XJSN in Practice: Domain Languages
Let's explore how XJSN enables rich domain-specific languages across different problem spaces. Each example demonstrates how complex operational patterns become natural expressions.
Business Process Expression
Business logic often involves hierarchical rules, conditional escalations, and complex aggregations. Traditional approaches require either rigid workflow engines or extensive programming. XJSN provides a middle ground:
AuditProcess({
  scope: RecursiveWalk({
    root: EntityGraph("subsidiaries"),
    traversal: BreadthFirst({
      filter: And([
        Revenue(GreaterThan(10000000)),
        Jurisdiction(In(["EU", "US", "UK"])),
        LastAudit(OlderThan(Days(180)))
      ]),
      depth_limit: 3
    }),
    accumulator: RiskProfile({
      calculate: WeightedSum({
        revenue_exposure: 0.3,
        regulatory_complexity: 0.4,
        time_since_audit: 0.3
      }),
      normalize: Percentile({
        distribution: HistoricalRisks("2020-2024"),
        method: "empirical"
      })
    })
  }),
  decision: Switch({
    cases: [
      {
        when: Above(Percentile(95)),
        then: Escalate({ to: ["CFO", "Board"], sla: Hours(4), template: "critical_risk" })
      },
      {
        when: Above(Percentile(80)),
        then: Schedule({ review: "quarterly", owner: "compliance_team", priority: "high" })
      },
      {
        default: Archive({ retention: Years(7), location: "cold_storage" })
      }
    ]
  })
})
This expression captures a complete audit process that would typically require hundreds of lines of imperative code. The RecursiveWalk naturally expresses graph traversal with filtering. The WeightedSum makes the risk calculation explicit and auditable. The Switch statement clearly defines escalation thresholds.
The power comes from composition. Each function is simple - GreaterThan, In, Days - but they compose into sophisticated business logic. An AI agent can generate this by understanding the business requirements, not by navigating complex API documentation.
Workflow Orchestration
Modern data pipelines require parallelism, error handling, and complex transformations. XJSN expresses these patterns naturally:
Pipeline({
  initialize: Transaction({ isolation: "repeatable_read", timeout: Seconds(30) }),
  stages: [
    ParallelMap({
      over: DataStream("input_events"),
      concurrency: 10,
      worker: Lambda(["event"],
        Try({
          body: Sequence([
            Validate({ schema: EventSchema, mode: "strict", on_error: "reject" }),
            Enrich({
              join: LeftOuter({
                with: "reference_data",
                on: ["event.id", "reference.event_id"],
                select: ["metadata", "category", "priority"]
              })
            }),
            Transform({
              apply: [
                { field: "timestamp", fn: ToISO8601() },
                { field: "amount", fn: Normalize({ currency: "USD" }) }
              ]
            })
          ]),
          catch: ErrorHandler({
            retry: ExponentialBackoff({
              attempts: 3,
              initial: Milliseconds(100),
              max: Seconds(5),
              jitter: true
            }),
            fallback: DeadLetter({ queue: "failed_events", include_context: true })
          })
        })
      )
    }),
    Aggregate({
      window: Tumbling({ size: Minutes(5) }),
      group_by: ["category", "priority"],
      compute: {
        count: Count(),
        sum: Sum("amount"),
        p95: Percentile(95, "processing_time"),
        anomalies: DetectAnomalies({ method: "isolation_forest", contamination: 0.01 })
      }
    }),
    Sink({
      destination: When([
        {
          condition: HasAnomalies(),
          target: AlertingService({ severity: "high", channels: ["pagerduty", "slack"] })
        },
        {
          default: DataLake({ format: "parquet", partition: "date", compression: "snappy" })
        }
      ])
    })
  ],
  on_failure: CompensatingAction({
    rollback: true,
    notify: ["ops_team", "on_call"],
    preserve: ["audit_log", "error_context"]
  })
})
This pipeline specification would be nearly impossible to express in JSON without massive nesting and repetition. The XJSN version reads like a high-level description of the data flow. The Lambda function creates inline workers. The Try/Catch pattern handles errors at the appropriate level. The ExponentialBackoff clearly expresses retry logic.
Notice how naturally this composes. The ParallelMap contains a Lambda, which contains a Try, which contains a Sequence. Each level adds semantic meaning without syntactic overhead.
Cognitive Strategies
Perhaps most interestingly, XJSN can express reasoning strategies - allowing agents to describe not just what to think, but how to think:
ReasoningFramework({
  strategy: AdaptiveChain({
    initial: AssessComplexity({
      dimensions: ["logical", "computational", "knowledge", "creative"],
      method: "ensemble"
    }),
    router: Lambda(["assessment"],
      Match({
        pattern: assessment.profile,
        cases: [
          {
            when: { logical: High(), computational: Low() },
            then: SymbolicReasoning({
              method: "natural_deduction",
              rules: LoadRuleset("domain_specific"),
              max_depth: 10,
              proof_strategy: "backward_chaining"
            })
          },
          {
            when: { computational: High() },
            then: NumericSolver({
              approach: Select({
                linear: "simplex",
                nonlinear: "newton_raphson",
                discrete: "branch_and_bound",
                stochastic: "monte_carlo"
              }),
              precision: Digits(6),
              timeout: Seconds(5),
              parallel: true
            })
          },
          {
            when: { knowledge: High(), logical: Medium() },
            then: HybridRetrieval({
              semantic: VectorSearch({
                index: "knowledge_base",
                top_k: 20,
                rerank: CrossEncoder()
              }),
              structured: GraphQuery({
                traverse: "knowledge_graph",
                max_hops: 3,
                aggregate: "path_relevance"
              }),
              synthesis: WeightedMerge({
                semantic: 0.6,
                structured: 0.4,
                validation: "cross_reference"
              })
            })
          },
          {
            default: GeneralProblemSolver({
              decompose: RecursiveDecomposition({
                method: "functional",
                min_chunk: "atomic_operation",
                max_depth: 5
              }),
              solve: MapReduce({
                mapper: Ref("router"), // Recursive reference
                reducer: SynthesizeResults({
                  consistency_check: true,
                  confidence_threshold: 0.8
                })
              })
            })
          }
        ]
      })
    ),
    termination: FirstTrue([
      SolutionFound({ confidence: GreaterThan(0.95) }),
      ResourceLimit({ time: Seconds(30), memory: Gigabytes(1) }),
      UserInterrupt()
    ]),
    trace: AuditLog({
      level: "detailed",
      include: ["decisions", "backtracking", "resource_usage"]
    })
  })
})
This framework describes an entire reasoning architecture. The agent assesses problem complexity across multiple dimensions, routes to specialized reasoning methods based on the assessment, and maintains termination conditions. The Ref("router") creates a recursive structure where the problem solver can decompose problems and route sub-problems back through the same framework.
This isn't just configuration - it's a complete computational strategy expressed as data. An AI agent can generate variations of this framework, adapting its reasoning approach to different domains or requirements.
Synthesis: Cascade + XJSN
The true power emerges from combining Cascade's control flow with XJSN's expressiveness. Together, they create a system where agents can express and execute sophisticated operational patterns:
class XJSNAgent extends Agent {
  private interpreter = new XJSNInterpreter();

  async handleAiMessage(messages) {
    const lastMessage = messages[messages.length - 1];

    // Parse the AI's XJSN expression
    const thought = this.interpreter.parse(lastMessage.content);

    // Route based on expression type
    if (thought.type === "Orchestration") {
      return await this.executeOrchestration(thought, messages);
    }
    if (thought.type === "Reasoning") {
      return await this.executeReasoning(thought, messages);
    }
    if (thought.type === "Control") {
      return await this.executeControl(thought, messages);
    }

    // Default pass-through for standard responses
    return { messages };
  }

  async executeOrchestration(orchestration, messages) {
    const { pipeline } = orchestration;

    // Set up transactional context if specified
    const txContext = pipeline.initialize?.type === "Transaction"
      ? await this.beginTransaction(pipeline.initialize)
      : null;

    // Execute pipeline stages with proper isolation
    const stages = pipeline.stages.map(stage => ({
      execute: this.compileStage(stage),
      isolation: stage.transaction?.isolation || "read_committed",
      timeout: stage.transaction?.timeout || 30000
    }));

    const results = [];
    for (const stage of stages) {
      try {
        const stageResult = await this.runIsolated(stage, messages);
        results.push(stageResult);
        // Check for early termination conditions
        if (this.shouldTerminate(stageResult, pipeline)) {
          break;
        }
      } catch (error) {
        if (pipeline.on_failure) {
          return this.executeCompensation(pipeline.on_failure, results, messages, error);
        }
        throw error;
      }
    }

    // Commit transaction if active
    if (txContext) {
      await this.commitTransaction(txContext);
    }

    return {
      messages: [...messages, {
        role: "assistant",
        content: this.formatPipelineResult(results)
      }]
    };
  }

  async executeReasoning(reasoning, messages) {
    const { strategy } = reasoning;
    const context = this.extractContext(messages);

    // Assess problem complexity using the specified method
    const assessment = await this.assess(strategy.initial, context);

    // Route to the appropriate reasoning method based on the assessment
    const method = await this.evaluateRouter(strategy.router, assessment);

    // Set up the execution environment with tracing if requested
    const executionEnv = {
      method,
      context,
      trace: strategy.trace ? this.createTracer(strategy.trace) : null
    };

    // Execute with termination conditions
    const result = await this.executeWithTermination(executionEnv, strategy.termination);

    return {
      messages: [...messages, {
        role: "assistant",
        content: result.solution,
        metadata: {
          reasoning_trace: result.trace,
          confidence: result.confidence,
          resources_used: result.resources
        }
      }]
    };
  }

  async executeControl(control, messages) {
    const { operation } = control;

    switch (operation.type) {
      case "Fork": {
        // Create parallel execution branches
        const branches = await Promise.all(
          operation.branches.map(branch => this.executeBranch(branch, messages))
        );
        return this.mergeBranches(branches, operation.merge_strategy);
      }
      case "Checkpoint": {
        // Save the current state for potential rollback
        await this.saveCheckpoint(messages, operation.label);
        return { messages };
      }
      case "Rollback": {
        // Restore from a checkpoint
        const checkpoint = await this.loadCheckpoint(operation.to);
        return { messages: checkpoint.messages };
      }
      case "Switch":
        // Change execution context
        return this.switchContext(operation.context, messages);
      default:
        return { messages };
    }
  }
}
The agent is no longer just responding to messages - it's executing sophisticated computational patterns expressed in XJSN. The interpreter transforms XJSN expressions into executable operations. The Cascade handlers provide the control flow. Together, they create a system where agents can express and execute complex operational logic.
The Broader Implications
This architecture fundamentally changes how we think about agent development. Instead of asking "what tools does my agent need?", we ask "what language should my agent think in?"
For a compliance agent, that language includes predicates like RegulatoryRequirement, AuditThreshold, and EscalationPath. For a code review agent, it includes PatternMatch, ComplexityMetric, and RefactoringStrategy. For a research agent, it includes HypothesisFormulation, EvidenceGathering, and SynthesisStrategy.
Each domain gets a vocabulary that maps naturally to its operational patterns. The vocabulary isn't just naming - it's computational. Each term has semantic meaning that translates to specific execution patterns.
This approach also solves the perennial problem of agent reliability. When agents express their intent in XJSN, we can validate it before execution. We can add guards and constraints. We can implement rollback and compensation. The non-determinism of LLM generation is separated from the determinism of execution.
The message list remains the canonical source of truth, but we now have sophisticated primitives for managing it. We can checkpoint and restore. We can filter and transform. We can isolate sub-computations. The message list becomes a transaction log rather than a monolithic context.
TypeScript provides our execution environment, but agents can express thoughts that compile to complex operations. The implicit graph of TypeScript control flow replaces the explicit graph of LangGraph. The expressiveness of XJSN replaces the rigidity of JSON tool calls.
Implementation Considerations
Building this system requires careful attention to several technical challenges:
Parsing and Validation
The XJSN parser must be robust enough to handle the imperfect output of LLMs while strict enough to prevent security issues. The parser operates in three phases:
Tokenization: Breaking the input into syntactic elements
AST Construction: Building the tree structure
Validation: Ensuring semantic correctness
The validation phase is critical. It must verify that:
All referenced functions exist in the current context
Arguments match expected types
No infinite recursion is possible
Resource limits are respected
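Here's a minimal sketch of the parse phase under these constraints - tokenization is folded into a single recursive-descent pass, string escapes and error recovery are omitted, and the AST is a slightly richer variant of the earlier sketch:

type AstNode =
  | { type: "FunctionCall"; name: string; args: AstNode[] }
  | { type: "Object"; entries: [string, AstNode][] }
  | { type: "Array"; items: AstNode[] }
  | { type: "Literal"; value: string | number | boolean };

class XJSNParser {
  private pos = 0;
  constructor(private src: string) {}

  parse(): AstNode {
    const node = this.parseValue();
    this.skipWs();
    if (this.pos < this.src.length) throw new Error(`Trailing input at ${this.pos}`);
    return node;
  }

  private parseValue(): AstNode {
    this.skipWs();
    const ch = this.src[this.pos];
    if (ch === "{") return this.parseObject();
    if (ch === "[") return this.parseArray();
    if (ch === '"') return { type: "Literal", value: this.parseString() };
    if (/[0-9-]/.test(ch ?? "")) return { type: "Literal", value: this.parseNumber() };
    const name = this.parseIdent();
    if (name === "true" || name === "false") return { type: "Literal", value: name === "true" };
    this.expect("("); // bare identifiers aren't allowed: every name must be a call
    return { type: "FunctionCall", name, args: this.parseList(")") };
  }

  private parseObject(): AstNode {
    this.pos++; // "{"
    const entries: [string, AstNode][] = [];
    this.skipWs();
    while (this.src[this.pos] !== "}") {
      const key = this.src[this.pos] === '"' ? this.parseString() : this.parseIdent();
      this.expect(":");
      entries.push([key, this.parseValue()]);
      this.skipComma("}");
    }
    this.pos++; // "}"
    return { type: "Object", entries };
  }

  private parseArray(): AstNode {
    this.pos++; // "["
    return { type: "Array", items: this.parseList("]") };
  }

  private parseList(close: string): AstNode[] {
    const items: AstNode[] = [];
    this.skipWs();
    while (this.src[this.pos] !== close) {
      items.push(this.parseValue());
      this.skipComma(close);
    }
    this.pos++; // consume the closer
    return items;
  }

  private parseString(): string {
    this.pos++; // opening quote
    const start = this.pos;
    while (this.pos < this.src.length && this.src[this.pos] !== '"') this.pos++;
    return this.src.slice(start, this.pos++);
  }

  private parseNumber(): number {
    const start = this.pos;
    while (/[0-9.eE+-]/.test(this.src[this.pos] ?? "")) this.pos++;
    return Number(this.src.slice(start, this.pos));
  }

  private parseIdent(): string {
    const start = this.pos;
    while (/[A-Za-z0-9_$]/.test(this.src[this.pos] ?? "")) this.pos++;
    if (start === this.pos) throw new Error(`Unexpected character at ${this.pos}`);
    return this.src.slice(start, this.pos);
  }

  private skipComma(close: string) {
    this.skipWs();
    if (this.src[this.pos] === ",") { this.pos++; this.skipWs(); }
    else if (this.src[this.pos] !== close) throw new Error(`Expected "," or "${close}" at ${this.pos}`);
  }

  private skipWs() {
    while (/\s/.test(this.src[this.pos] ?? "")) this.pos++;
  }

  private expect(ch: string) {
    this.skipWs();
    if (this.src[this.pos] !== ch) throw new Error(`Expected "${ch}" at ${this.pos}`);
    this.pos++;
  }
}

Validation then walks the resulting tree: every FunctionCall name must exist in the interpreter's registry, argument shapes are checked, and depth and node-count limits bound recursion and resource use.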
Interpretation and Execution
The interpreter must map XJSN expressions to concrete operations. This requires a registry of available functions and their implementations. The interpreter should support:
Lazy evaluation for efficiency
Partial evaluation for debugging
Streaming execution for long-running operations
Rollback for error recovery
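Lazy evaluation, for instance, falls out naturally if handlers receive their arguments as thunks rather than values - a sketch, with the handler shape as an assumption:

type Thunk = () => unknown;

// A lazily-interpreted If: only the branch that's taken is ever evaluated
const lazyHandlers = new Map<string, (args: Record<string, Thunk>) => unknown>([
  ["If", ({ condition, then, else: otherwise }) =>
    condition() ? then() : otherwise?.()]
]);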
State Management
The Cascade runtime must manage state across handler invocations. This includes:
Working memory that persists between messages
Checkpoints for rollback capability
Isolation contexts for sub-computations
Resource tracking for cost management
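A sketch of what that state might look like, with illustrative field names:

// State the Cascade runtime carries across handler invocations
interface CascadeState {
  workingMemory: Record<string, unknown>;             // persists between messages
  checkpoints: Map<string, Message[]>;                // labeled snapshots for rollback
  isolationContexts: Map<string, Message[]>;          // private lists for sub-computations
  resources: { tokensUsed: number; costUsd: number }; // tracking for cost management
}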
Error Handling
Errors can occur at multiple levels:
Parse errors in XJSN expressions
Validation errors in arguments
Runtime errors during execution
Resource exhaustion
Each level needs appropriate handling, from gentle correction (helping the AI fix malformed XJSN) to hard stops (preventing resource exhaustion).
Future Directions
This work opens several avenues for exploration:
Standard Library Development
Different domains need different XJSN vocabularies. We could develop standard libraries for common patterns:
Business process management
Data pipeline orchestration
UI generation and interaction
Scientific computation
Creative content generation
Visual Programming Integration
XJSN's tree structure maps naturally to visual programming. We could build visual editors that generate XJSN, allowing non-programmers to design agent behaviors.
Formal Verification
Because XJSN expressions are data, we can analyze them formally. We could prove properties like termination, resource bounds, or semantic correctness.
Cross-Agent Communication
Agents could communicate by exchanging XJSN expressions, creating a lingua franca for agent interaction. This would enable sophisticated multi-agent systems without tight coupling.
Conclusion
The question isn't whether your agent can call tools. It's whether your agent can think in a language expressive enough for its domain.
Current frameworks force agents to express complex operational logic through simple tool calls - like writing poetry with emoji. The result is fragile, hard to debug, and impossible to scale beyond toy examples.
The Cascade pattern provides clean control flow without the complexity of state machines. XJSN provides expressiveness without the dangers of arbitrary code execution. Together, they enable agents that can express and execute sophisticated operational patterns.
This isn't about building better agents. It's about giving agents better languages to think in. When we do that, complex patterns that previously required extensive framework machinery become simple expressions of intent.
The implementation is available at [github.com/...]. The XJSN parser handles recursive expressions, lambda functions, and conditional evaluation. The Cascade runtime provides hooks for all three message types with proper async handling and error boundaries.
The future of agent development isn't just about better models or more tools. It's about creating languages that allow agents to express the full richness of computational thought. Only then can we build agents that truly augment human capability rather than merely simulating it.
Consider what makes grep
powerful. It's not just a search function - it's a language:
grep -r -l "pattern" . | xargs grep -c "pattern" | sort -t: -k2 -n
This pipeline recursively finds files containing a pattern, counts occurrences in each, and sorts by frequency. But here's what makes this fundamentally different from tool calling: the entire computational graph is expressed upfront. The shell doesn't execute grep
, wait for results, append them to a history, re-read everything to decide what to do next, then call xargs
. The pipeline is a complete thought - data flows through transformations without intermediary decision points.
Contrast this with how tool calling works in current agent frameworks:
This is maximally inefficient. Every tool call requires a full model invocation just to decide what to do with the result. The agent can't express "search for X, then filter by Y, then summarize" as a single computational thought. It must interleave execution with decision-making, rebuilding its entire context from scratch at each step.
The message list architecture makes sense for one thing: when you genuinely need the agent to make a new decision based on intermediate results. But that's maybe 20% of cases. The other 80%, the agent already knows the flow it wants: "validate this, transform that, aggregate results." Yet it's forced to pretend each step is a surprise requiring deep deliberation.
Worse, agents can't pre-express conditional logic. In bash, you write:
if grep -q "ERROR" logfile; then tail -n 100 logfile | mail -s "Error detected" admin@example.com fi
The entire decision tree is declared upfront. With tool calling, the agent must:
Call check_for_errors tool
Wait for result to append to messages
Re-process entire conversation
Decide whether to send email
Call send_email tool if needed
Five model invocations for what should be a single conditional expression. The agent has no way to say "if error, then email" as a complete thought. It must perform this logic through the message list, like a CPU that forgets its instruction pointer after every operation.
This is why coding agents work so well. When an agent writes code, it's not just calling functions - it's expressing complete computational graphs. It can declare entire flows: loops, conditionals, error handling, state management. The code is the thought, fully formed, not scattered across message history.
But what happens when we ask agents to orchestrate business processes, data pipelines, or customer workflows? We force them to express everything through atomic tool calls, each one requiring the entire conversational context to be reprocessed just to make the next micro-decision.
The solution isn't to make all agents code in Python. Business logic shouldn't be expressed in programming languages any more than it should be expressed in individual tool calls. What we need are domain-specific languages that let agents express complete computational thoughts - entire flows, decision trees, and transformations - in concepts native to their problem space.
This post introduces two patterns that enable this: Cascade, which provides control flow without state machine complexity, and XJSN, which lets agents express domain-specific computational thought as structured data. Together, they allow you to design languages where agents can think in complete computational graphs rather than isolated tool calls.
The Power of Domain-Specific Languages
Consider how we actually work with powerful tools. grep
isn't just a function - it's a language with rich compositional semantics:
grep -r -l "pattern" . | xargs grep -c "pattern" | sort -t: -k2 -n
This pipeline recursively finds files containing a pattern, counts occurrences in each, and sorts by frequency. But look closer at what makes this powerful. It's not the individual commands - it's the linguistic structure that emerges from their composition.
The -r
flag doesn't just set a boolean; it fundamentally changes the search space from a file to a directory tree. The -l
flag transforms the output type from matched lines to filenames. The pipe operator (|
) creates a compositional chain where each command's output becomes the next command's input. This isn't three function calls - it's a sentence with grammar, where each element modifies and builds upon the others.
Bash itself reveals even deeper patterns:
for file in $(find . -name "*.log" -mtime -7); do if grep -q "ERROR" "$file"; then tail -n 100 "$file" | grep -A5 -B5 "ERROR" >> errors_summary.txt fi done
This script demonstrates linguistic constructs that go beyond simple function composition. The for
loop establishes an iteration context. The if
statement creates conditional execution paths. The -A5 -B5
flags to grep create a context window around matches. The >>
operator appends to a file, maintaining state across iterations.
These aren't just utilities - they're linguistic primitives that allow us to express complex computational thoughts. The power comes from three key properties:
Composability: Each element can be combined with others in predictable ways
Parameterization: Behavior can be modified through flags and arguments
Context preservation: State and environment flow through the execution
Meanwhile, here's what every Vercel AI agent looks like after a few iterations:
const tools = { search: (query) => fetchAPI(query), analyze: (data) => processData(data), pause_for_input: () => /* undefined behavior */, delegate_to_expert: () => /* hack required */, remember_context: () => /* not possible */ };
Those last three aren't tools. They're attempted escape hatches from the framework's constraints. They represent the agent trying to express control flow concepts that don't map to simple function calls.
The Message List Architecture Problem
To understand why this matters, we need to examine the fundamental architecture of modern agent frameworks. The Vercel AI SDK, like most frameworks, treats the message list as the central abstraction. Every interaction appends to this list. Every decision is made by processing this list. The list is truth.
This works beautifully for simple conversational agents. But it breaks down catastrophically for operational agents that need to maintain state, execute complex procedures, or manage multiple contexts. After 20 messages, every Vercel AI agent begins to degrade. Not because the model is inadequate, but because it's re-processing an ever-growing context window just to maintain continuity.
The degradation follows a predictable pattern. First, the agent starts forgetting early instructions as they get pushed out of the effective context window. Then it begins confusing intermediate state with final outputs, treating diagnostic messages and tool call results as part of the conversation. Finally, it loses coherence entirely, unable to distinguish between different phases of execution.
The agent is reconstructing its entire operational state from a linear transcript on every invocation. It's computationally equivalent to a CPU that must re-read all previous instructions to execute the next one. No wonder it fails.
The message list is the source of truth - that's architecturally sound. What's unsound is that we can only append to it. We lack primitives for:
Filtering: Extracting relevant context without processing everything
Checkpointing: Saving and restoring state at specific points
Isolation: Running sub-computations without polluting the main context
Transformation: Modifying the message history for different purposes
I needed to build something specific: a customer support agent that could handle complex, multi-step debugging sessions. The requirements revealed the inadequacy of current frameworks:
Suspend and Resume: The agent must pause execution, wait for user diagnostic commands, and resume with full context
Delegation: Sub-tasks must be routed to specialized models without polluting the main conversation
Working Memory: Operational state must persist separately from the conversation transcript
Deterministic SOPs: The agent must follow exact procedures, not probabilistic responses
The Vercel AI SDK cannot express these patterns. You can attempt to simulate pausing with generateUI
, but resumption requires reconstructing the entire context. Delegation means your message history becomes a tangled mess of main agent and sub-agent conversations. Working memory must be encoded in ever-growing system prompts that eventually exceed token limits.
This is where developers typically adopt LangGraph - defining explicit state machines with nodes, edges, and transition conditions. But this requires hundreds of lines of boilerplate to express what should be simple control flow. You end up encoding your agent's logic twice: once in the graph definition and once in the node implementations.
The Cascade Pattern
The insight that led to Cascade was simple: TypeScript already encodes a computational graph. Every if
statement is a conditional edge. Every function call is a node transition. Every try/catch
block defines error boundaries. We don't need to reify this graph in a separate abstraction - we need clean injection points for agent logic.
The Cascade pattern emerged from asking: what are the natural moments in an agent's execution where we need to intervene? There are exactly three:
When the user provides input
When the AI generates a response
When a tool returns a result
These three moments form the complete lifecycle of agent interaction. Everything else is orchestration between these points. This led to the minimal interface:
class Agent { async handleUserMessage(messages: Message[]): Promise<{messages: Message[]}> { return { messages }; } async handleAiMessage(messages: Message[]): Promise<{messages: Message[]}> { return { messages }; } async handleToolMessage(messages: Message[]): Promise<{messages: Message[]}> { return { messages }; } }
The beauty of this pattern is its simplicity. Each handler receives the current message list and returns a new one. No hidden state. No complex lifecycle. Just pure functions that transform messages.
But this simplicity enables sophisticated patterns that are impossible or extremely difficult with traditional frameworks. Let me demonstrate with concrete examples.
Self-Referential Reasoning
One of the most powerful patterns in human cognition is internal dialogue - the ability to reason through problems by questioning and answering ourselves. Current frameworks make this nearly impossible because every AI generation becomes part of the conversation history. The user sees the agent arguing with itself rather than receiving a coherent response.
With Cascade, we can implement true internal reasoning:
class ReflectiveAgent extends Agent { async handleAiMessage(messages) { const lastMessage = messages[messages.length - 1]; if (lastMessage.content.includes("INTERNAL_REASONING_REQUIRED")) { const innerDialogue = []; // Create an isolated reasoning context for (let i = 0; i < 3; i++) { const thought = await generateText({ model: "gpt-4", messages: [ { role: "system", content: "Analyze step by step." }, ...innerDialogue ] }); innerDialogue.push(thought); // Check for convergence if (thought.content.includes("CONCLUSION:")) break; } // Synthesize the internal reasoning into a single response const synthesis = this.synthesize(innerDialogue); return { messages: [...messages, { role: "assistant", content: synthesis }] }; } return { messages }; } synthesize(thoughts) { // Extract key insights from internal dialogue const insights = thoughts.map(t => this.extractKeyPoints(t)); // Build coherent response from insights return this.buildResponse(insights); } }
The critical innovation here is that the internal dialogue never appears in the main message list. The agent can have a complex, multi-step reasoning process, potentially using different models or prompts for each step, while presenting only the final synthesized conclusion to the user. This isn't just about hiding implementation details - it's about maintaining clean separation between operational reasoning and conversational flow.
Stateful Human-in-the-Loop
The inability to properly pause and resume execution is one of the most frustrating limitations of current frameworks. When an agent needs user input mid-execution, it must either abandon its current context or attempt to encode everything in the message history, leading to token bloat and context confusion.
Cascade enables true suspension with state preservation:
class InteractiveAgent extends Agent { private suspended: Map<string, any> = new Map(); async handleAiMessage(messages) { const lastMessage = messages[messages.length - 1]; if (lastMessage.content.includes("REQUIRES_USER_INPUT")) { const sessionId = crypto.randomUUID(); // Preserve only essential context, not entire history this.suspended.set(sessionId, { messages: messages.slice(0, -10), // Only preserve relevant context workingMemory: this.extractWorkingMemory(messages), nextPhase: this.determineNextPhase(lastMessage), checkpoint: Date.now() }); // Mark suspension point in message stream return { messages: [...messages, { role: "system", content: `SUSPENDED:${sessionId}:${this.generateResumptionToken()}` }] }; } return { messages }; } async handleUserMessage(messages) { const systemMessage = messages.findLast(m => m.role === "system"); if (systemMessage?.content.startsWith("SUSPENDED:")) { const [_, sessionId, token] = systemMessage.content.split(":"); const suspended = this.suspended.get(sessionId); // Validate and restore context if (this.validateToken(token, suspended.checkpoint)) { const resumedMessages = [ ...suspended.messages, { role: "system", content: `RESUMED with context: ${suspended.workingMemory}` }, messages[messages.length - 1] // User's new input ]; this.suspended.delete(sessionId); return { messages: resumedMessages }; } } return { messages }; } }
This pattern solves multiple problems simultaneously. The agent can maintain working memory across suspension points without encoding it in prompts. It can preserve just the relevant context rather than the entire history. It can validate resumption to prevent context injection attacks. Most importantly, it can resume execution exactly where it left off, with full state restoration.
Control Flow Through Tool Results
In traditional frameworks, tool calls are side effects. They fetch data or perform actions, but they don't fundamentally alter the agent's execution path. This limitation forces complex control flow into prompt engineering, making agents brittle and unpredictable.
Cascade treats tool results as first-class control flow primitives:
class ControlFlowAgent extends Agent { async handleToolMessage(messages) { const toolResult = messages[messages.length - 1]; if (toolResult.name === "complexity_analysis") { const analysis = JSON.parse(toolResult.content); // Tool result determines execution path if (analysis.complexity_score > 0.8) { // Fork to specialized execution path const specializedContext = { model: "claude-3-opus", temperature: 0.1, systemPrompt: this.generateSpecialistPrompt(analysis), maxTokens: 4000 }; const expertResult = await this.executeInContext( specializedContext, analysis.decomposed_problem ); return { messages: [...messages, { role: "assistant", content: this.formatExpertResponse(expertResult) }] }; } // Continue normal execution path return { messages }; } return { messages }; } async executeInContext(context, task) { // Isolated execution environment const isolatedMessages = [ { role: "system", content: context.systemPrompt }, { role: "user", content: JSON.stringify(task) } ]; return await generateText({ model: context.model, messages: isolatedMessages, temperature: context.temperature, maxTokens: context.maxTokens }); } }
The key insight is that tool results can trigger entirely different execution contexts. A complexity analysis might route to a specialist model. A permission check might enforce access controls. A resource check might switch to a more efficient model. The tool isn't just providing data - it's directing the computation itself.
The Language Problem
Even with Cascade providing clean control flow, agents still struggle to express complex operational patterns. The issue isn't just about control flow - it's about the poverty of expression available to agents trying to communicate their computational intent.
Consider what happens when an agent needs to describe a multi-step data pipeline with error handling and conditional logic:
{ "tool": "complex_workflow", "parameters": { "steps": [ { "type": "parallel", "branches": [ { "id": "branch_1", "operations": [ { "op": "fetch", "source": "api", "retry": { "attempts": 3, "backoff": "exponential", "initial_delay": 100 } }, { "op": "transform", "schema": { "type": "object", "properties": { "nested": { "type": "object", "properties": { "deeply": { "type": "string" } } } } } } ] } ] } ] } }
This JSON structure is attempting to encode program semantics - parallelism, sequencing, error handling, data transformation. But JSON wasn't designed for this. It lacks:
Variables and references: No way to refer to earlier results
Composition: Can't build complex operations from simpler ones
Abstraction: Can't define reusable patterns
Conditionals: No native way to express if-then-else logic
LLMs excel at generating code because they've been trained on millions of examples of function composition, variable references, and control flow. When we force them to encode these patterns in rigid JSON schemas, we're working against their strengths.
XJSN: Extensible JavaScript Notation
The solution emerged from studying successful data notations that bridge the gap between human-readable and machine-parseable. Clojure's EDN (Extensible Data Notation) provided key inspiration - it looks like code but is pure data. Clojure.spec showed how to add schemas and validation to such structures.
XJSN operates on three fundamental principles:
1. Syntactic Familiarity, Semantic Safety
The notation uses JavaScript's function call syntax, but parses to pure data structures. This isn't about enabling code execution - it's about leveraging syntactic patterns that LLMs generate reliably.
// This looks like JavaScript Transform({ input: Select({ from: "users", where: { active: true } }), apply: [ Normalize({ field: "email", method: "lowercase" }), Validate({ field: "age", constraint: GreaterThan(18) }) ] }) // But parses to this AST { type: "FunctionCall", name: "Transform", args: { input: { type: "FunctionCall", name: "Select", args: { from: "users", where: { active: true } } }, apply: [ { type: "FunctionCall", name: "Normalize", args: {...} }, { type: "FunctionCall", name: "Validate", args: {...} } ] } }
The key insight is that we're not evaluating these expressions - we're interpreting them. The runtime decides what Transform
, Select
, and Normalize
mean in its specific context. This gives us safety (no arbitrary code execution) while preserving expressiveness (nested function composition).
2. Leveraging the LLM's Training Distribution
Models have seen millions of examples of JavaScript, Python, and similar languages. They understand function calls, method chaining, and nested expressions at a deep level. By using familiar syntax, we tap into this learned knowledge.
Consider how naturally an LLM can generate:
Pipeline([ Filter({ status: "active" }), Map({ extract: ["id", "name", "email"] }), GroupBy({ field: "department" }), Aggregate({ count: Count(), average_age: Average("age") }) ])
This flows naturally because it mirrors patterns the model has seen thousands of times in training data. Compare this to the equivalent JSON schema - the model must fight against its training to produce the rigid structure.
3. Semantic Extensibility Through Interpretation
The power of XJSN comes from the separation between syntax and semantics. The same syntactic structure can mean different things in different contexts. This is inspired by Lisp's macro system, where code structure is just data until interpretation gives it meaning.
A single XJSN expression like Retry({ attempts: 3 })
might mean:
In a network context: retry failed HTTP requests
In a database context: retry deadlocked transactions
In an AI context: regenerate responses that fail validation
In a UI context: re-render components that error
The interpreter provides the semantic layer. This allows domain-specific languages to emerge naturally from the same syntactic foundation.
XJSN in Practice: Domain Languages
Let's explore how XJSN enables rich domain-specific languages across different problem spaces. Each example demonstrates how complex operational patterns become natural expressions.
Business Process Expression
Business logic often involves hierarchical rules, conditional escalations, and complex aggregations. Traditional approaches require either rigid workflow engines or extensive programming. XJSN provides a middle ground:
AuditProcess({
  scope: RecursiveWalk({
    root: EntityGraph("subsidiaries"),
    traversal: BreadthFirst({
      filter: And([
        Revenue(GreaterThan(10000000)),
        Jurisdiction(In(["EU", "US", "UK"])),
        LastAudit(OlderThan(Days(180)))
      ]),
      depth_limit: 3
    }),
    accumulator: RiskProfile({
      calculate: WeightedSum({
        revenue_exposure: 0.3,
        regulatory_complexity: 0.4,
        time_since_audit: 0.3
      }),
      normalize: Percentile({
        distribution: HistoricalRisks("2020-2024"),
        method: "empirical"
      })
    })
  }),
  decision: Switch({
    cases: [
      { when: Above(Percentile(95)),
        then: Escalate({ to: ["CFO", "Board"], sla: Hours(4), template: "critical_risk" }) },
      { when: Above(Percentile(80)),
        then: Schedule({ review: "quarterly", owner: "compliance_team", priority: "high" }) },
      { default: Archive({ retention: Years(7), location: "cold_storage" }) }
    ]
  })
})
This expression captures a complete audit process that would typically require hundreds of lines of imperative code. The RecursiveWalk naturally expresses graph traversal with filtering. The WeightedSum makes the risk calculation explicit and auditable. The Switch statement clearly defines escalation thresholds.
The power comes from composition. Each function is simple - GreaterThan, In, Days - but they compose into sophisticated business logic. An AI agent can generate this by understanding the business requirements, not by navigating complex API documentation.
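Under an interpreter like the one sketched earlier, each of these terms can be a tiny predicate builder. A hypothetical fragment, for illustration only:

type Pred<T> = (x: T) => boolean;

// Each vocabulary term is a small function that returns a predicate
const GreaterThan = (n: number): Pred<number> => (x) => x > n;
const In = <T>(allowed: T[]): Pred<T> => (x) => allowed.includes(x);
const And = <T>(preds: Pred<T>[]): Pred<T> => (x) => preds.every(p => p(x));

// Composition yields a single predicate the traversal can apply per entity
const revenueFilter = And([GreaterThan(10_000_000)]);
revenueFilter(25_000_000); // true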
Workflow Orchestration
Modern data pipelines require parallelism, error handling, and complex transformations. XJSN expresses these patterns naturally:
Pipeline({
  initialize: Transaction({ isolation: "repeatable_read", timeout: Seconds(30) }),
  stages: [
    ParallelMap({
      over: DataStream("input_events"),
      concurrency: 10,
      worker: Lambda(["event"],
        Try({
          body: Sequence([
            Validate({ schema: EventSchema, mode: "strict", on_error: "reject" }),
            Enrich({
              join: LeftOuter({
                with: "reference_data",
                on: ["event.id", "reference.event_id"],
                select: ["metadata", "category", "priority"]
              })
            }),
            Transform({
              apply: [
                { field: "timestamp", fn: ToISO8601() },
                { field: "amount", fn: Normalize({ currency: "USD" }) }
              ]
            })
          ]),
          catch: ErrorHandler({
            retry: ExponentialBackoff({
              attempts: 3,
              initial: Milliseconds(100),
              max: Seconds(5),
              jitter: true
            }),
            fallback: DeadLetter({ queue: "failed_events", include_context: true })
          })
        })
      )
    }),
    Aggregate({
      window: Tumbling({ size: Minutes(5) }),
      group_by: ["category", "priority"],
      compute: {
        count: Count(),
        sum: Sum("amount"),
        p95: Percentile(95, "processing_time"),
        anomalies: DetectAnomalies({ method: "isolation_forest", contamination: 0.01 })
      }
    }),
    Sink({
      destination: When([
        { condition: HasAnomalies(),
          target: AlertingService({ severity: "high", channels: ["pagerduty", "slack"] }) },
        { default: DataLake({ format: "parquet", partition: "date", compression: "snappy" }) }
      ])
    })
  ],
  on_failure: CompensatingAction({
    rollback: true,
    notify: ["ops_team", "on_call"],
    preserve: ["audit_log", "error_context"]
  })
})
This pipeline specification would be nearly impossible to express in JSON without massive nesting and repetition. The XJSN version reads like a high-level description of the data flow. The Lambda function creates inline workers. The Try/Catch pattern handles errors at the appropriate level. The ExponentialBackoff clearly expresses retry logic.
Notice how naturally this composes. The ParallelMap contains a Lambda, which contains a Try, which contains a Sequence. Each level adds semantic meaning without syntactic overhead.
Cognitive Strategies
Perhaps most interestingly, XJSN can express reasoning strategies - allowing agents to describe not just what to think, but how to think:
ReasoningFramework({
  strategy: AdaptiveChain({
    initial: AssessComplexity({
      dimensions: ["logical", "computational", "knowledge", "creative"],
      method: "ensemble"
    }),
    router: Lambda(["assessment"],
      Match({
        pattern: assessment.profile,
        cases: [
          { when: { logical: High(), computational: Low() },
            then: SymbolicReasoning({
              method: "natural_deduction",
              rules: LoadRuleset("domain_specific"),
              max_depth: 10,
              proof_strategy: "backward_chaining"
            }) },
          { when: { computational: High() },
            then: NumericSolver({
              approach: Select({
                linear: "simplex",
                nonlinear: "newton_raphson",
                discrete: "branch_and_bound",
                stochastic: "monte_carlo"
              }),
              precision: Digits(6),
              timeout: Seconds(5),
              parallel: true
            }) },
          { when: { knowledge: High(), logical: Medium() },
            then: HybridRetrieval({
              semantic: VectorSearch({
                index: "knowledge_base",
                top_k: 20,
                rerank: CrossEncoder()
              }),
              structured: GraphQuery({
                traverse: "knowledge_graph",
                max_hops: 3,
                aggregate: "path_relevance"
              }),
              synthesis: WeightedMerge({
                semantic: 0.6,
                structured: 0.4,
                validation: "cross_reference"
              })
            }) },
          { default: GeneralProblemSolver({
              decompose: RecursiveDecomposition({
                method: "functional",
                min_chunk: "atomic_operation",
                max_depth: 5
              }),
              solve: MapReduce({
                mapper: Ref("router"), // Recursive reference
                reducer: SynthesizeResults({
                  consistency_check: true,
                  confidence_threshold: 0.8
                })
              })
            }) }
        ]
      })
    ),
    termination: FirstTrue([
      SolutionFound({ confidence: GreaterThan(0.95) }),
      ResourceLimit({ time: Seconds(30), memory: Gigabytes(1) }),
      UserInterrupt()
    ]),
    trace: AuditLog({
      level: "detailed",
      include: ["decisions", "backtracking", "resource_usage"]
    })
  })
})
This framework describes an entire reasoning architecture. The agent assesses problem complexity across multiple dimensions, routes to specialized reasoning methods based on the assessment, and maintains termination conditions. The Ref("router") creates a recursive structure where the problem solver can decompose problems and route sub-problems back through the same framework.
This isn't just configuration - it's a complete computational strategy expressed as data. An AI agent can generate variations of this framework, adapting its reasoning approach to different domains or requirements.
Synthesis: Cascade + XJSN
The true power emerges from combining Cascade's control flow with XJSN's expressiveness. Together, they create a system where agents can express and execute sophisticated operational patterns:
class XJSNAgent extends Agent {
  private interpreter = new XJSNInterpreter();

  async handleAiMessage(messages) {
    const lastMessage = messages[messages.length - 1];

    // Parse the AI's XJSN expression
    const thought = this.interpreter.parse(lastMessage.content);

    // Route based on expression type
    if (thought.type === "Orchestration") {
      return await this.executeOrchestration(thought, messages);
    }
    if (thought.type === "Reasoning") {
      return await this.executeReasoning(thought, messages);
    }
    if (thought.type === "Control") {
      return await this.executeControl(thought, messages);
    }

    // Default pass-through for standard responses
    return { messages };
  }

  async executeOrchestration(orchestration, messages) {
    const { pipeline } = orchestration;

    // Set up transactional context if specified
    const txContext = pipeline.initialize?.type === "Transaction"
      ? await this.beginTransaction(pipeline.initialize)
      : null;

    // Execute pipeline stages with proper isolation
    const stages = pipeline.stages.map(stage => ({
      execute: this.compileStage(stage),
      isolation: stage.transaction?.isolation || "read_committed",
      timeout: stage.transaction?.timeout || 30000
    }));

    const results = [];
    for (const stage of stages) {
      try {
        const stageResult = await this.runIsolated(stage, messages);
        results.push(stageResult);

        // Check for early termination conditions
        if (this.shouldTerminate(stageResult, pipeline)) {
          break;
        }
      } catch (error) {
        if (pipeline.on_failure) {
          return this.executeCompensation(
            pipeline.on_failure,
            results,
            messages,
            error
          );
        }
        throw error;
      }
    }

    // Commit transaction if active
    if (txContext) {
      await this.commitTransaction(txContext);
    }

    return {
      messages: [...messages, {
        role: "assistant",
        content: this.formatPipelineResult(results)
      }]
    };
  }

  async executeReasoning(reasoning, messages) {
    const { strategy } = reasoning;
    const context = this.extractContext(messages);

    // Assess problem complexity using the specified method
    const assessment = await this.assess(strategy.initial, context);

    // Route to appropriate reasoning method based on assessment
    const method = await this.evaluateRouter(strategy.router, assessment);

    // Set up execution environment with tracing if requested
    const executionEnv = {
      method,
      context,
      trace: strategy.trace ? this.createTracer(strategy.trace) : null
    };

    // Execute with termination conditions
    const result = await this.executeWithTermination(
      executionEnv,
      strategy.termination
    );

    return {
      messages: [...messages, {
        role: "assistant",
        content: result.solution,
        metadata: {
          reasoning_trace: result.trace,
          confidence: result.confidence,
          resources_used: result.resources
        }
      }]
    };
  }

  async executeControl(control, messages) {
    const { operation } = control;

    switch (operation.type) {
      case "Fork": {
        // Create parallel execution branches
        const branches = await Promise.all(
          operation.branches.map(branch =>
            this.executeBranch(branch, messages)
          )
        );
        return this.mergeBranches(branches, operation.merge_strategy);
      }

      case "Checkpoint":
        // Save current state for potential rollback
        await this.saveCheckpoint(messages, operation.label);
        return { messages };

      case "Rollback": {
        // Restore from checkpoint
        const checkpoint = await this.loadCheckpoint(operation.to);
        return { messages: checkpoint.messages };
      }

      case "Switch":
        // Change execution context
        return this.switchContext(operation.context, messages);

      default:
        return { messages };
    }
  }
}
The agent is no longer just responding to messages - it's executing sophisticated computational patterns expressed in XJSN. The interpreter transforms XJSN expressions into executable operations. The Cascade handlers provide the control flow. Together, they create a system where agents can express and execute complex operational logic.
The Broader Implications
This architecture fundamentally changes how we think about agent development. Instead of asking "what tools does my agent need?", we ask "what language should my agent think in?"
For a compliance agent, that language includes predicates like RegulatoryRequirement, AuditThreshold, and EscalationPath. For a code review agent, it includes PatternMatch, ComplexityMetric, and RefactoringStrategy. For a research agent, it includes HypothesisFormulation, EvidenceGathering, and SynthesisStrategy.
Each domain gets a vocabulary that maps naturally to its operational patterns. The vocabulary isn't just naming - it's computational. Each term has semantic meaning that translates to specific execution patterns.
This approach also solves the perennial problem of agent reliability. When agents express their intent in XJSN, we can validate it before execution. We can add guards and constraints. We can implement rollback and compensation. The non-determinism of LLM generation is separated from the determinism of execution.
The message list remains the canonical source of truth, but we now have sophisticated primitives for managing it. We can checkpoint and restore. We can filter and transform. We can isolate sub-computations. The message list becomes a transaction log rather than a monolithic context.
TypeScript provides our execution environment, but agents can express thoughts that compile to complex operations. The implicit graph of TypeScript control flow replaces the explicit graph of LangGraph. The expressiveness of XJSN replaces the rigidity of JSON tool calls.
Implementation Considerations
Building this system requires careful attention to several technical challenges:
Parsing and Validation
The XJSN parser must be robust enough to handle the imperfect output of LLMs while strict enough to prevent security issues. The parser operates in three phases:
Tokenization: Breaking the input into syntactic elements
AST Construction: Building the tree structure
Validation: Ensuring semantic correctness
The validation phase is critical; a minimal validator is sketched after the list. It must verify that:
All referenced functions exist in the current context
Arguments match expected types
No infinite recursion is possible
Resource limits are respected
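A minimal validator covering the existence, typing, and recursion checks might look like the following. The FnSpec shape and the depth bound are assumptions; a production validator would also track explicit resource budgets:

interface FnSpec {
  checkArgs: (args: unknown) => boolean; // type predicate for arguments
}

function validate(
  node: unknown,
  specs: Record<string, FnSpec>,
  depth = 0,
  maxDepth = 32 // crude bound against runaway nesting and recursion
): string[] {
  if (depth > maxDepth) return [`Maximum nesting depth ${maxDepth} exceeded`];
  const errors: string[] = [];
  if (Array.isArray(node)) {
    for (const child of node) errors.push(...validate(child, specs, depth + 1, maxDepth));
    return errors;
  }
  if (node !== null && typeof node === "object") {
    const n = node as { type?: string; name?: string; args?: unknown };
    if (n.type === "FunctionCall") {
      const spec = typeof n.name === "string" ? specs[n.name] : undefined;
      if (!spec) errors.push(`Unknown function: ${n.name}`);                    // existence
      else if (!spec.checkArgs(n.args)) errors.push(`Bad arguments for ${n.name}`); // typing
      errors.push(...validate(n.args, specs, depth + 1, maxDepth));
    } else {
      for (const v of Object.values(node)) {
        errors.push(...validate(v, specs, depth + 1, maxDepth));
      }
    }
  }
  return errors;
}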
Interpretation and Execution
The interpreter must map XJSN expressions to concrete operations. This requires a registry of available functions and their implementations. The interpreter should support the following; a lazy-evaluation sketch appears after the list:
Lazy evaluation for efficiency
Partial evaluation for debugging
Streaming execution for long-running operations
Rollback for error recovery
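Lazy evaluation, for example, can be sketched by handing handlers thunks rather than evaluated arguments. The node shape and the If handler below are assumptions, not the real API:

type Lazy = () => unknown;
type LazyRegistry = Record<string, (args: Record<string, Lazy>) => unknown>;

function interpretLazy(node: any, registry: LazyRegistry): unknown {
  if (node?.type === "FunctionCall") {
    const fn = registry[node.name];
    if (!fn) throw new Error(`Unknown function: ${node.name}`);
    // Wrap each argument in a thunk instead of evaluating it eagerly
    const thunks = Object.fromEntries(
      Object.entries(node.args ?? {}).map(
        ([key, value]) => [key, () => interpretLazy(value, registry)]
      )
    ) as Record<string, Lazy>;
    return fn(thunks);
  }
  return node;
}

const lazyRegistry: LazyRegistry = {
  // Only the branch actually taken is ever evaluated
  If: ({ cond, then, otherwise }) => (cond() ? then() : otherwise?.()),
};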
State Management
The Cascade runtime must manage state across handler invocations. This includes the following; a checkpoint-store sketch appears after the list:
Working memory that persists between messages
Checkpoints for rollback capability
Isolation contexts for sub-computations
Resource tracking for cost management
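A minimal sketch of the checkpointing piece, assuming the Message type from the Cascade interface (the store shape is illustrative):

interface Checkpoint {
  messages: Message[];
  workingMemory: Record<string, unknown>;
  createdAt: number;
}

class CheckpointStore {
  private checkpoints = new Map<string, Checkpoint>();

  save(label: string, messages: Message[], workingMemory: Record<string, unknown>) {
    // Snapshot by copying, so later mutations cannot corrupt the checkpoint
    this.checkpoints.set(label, {
      messages: [...messages],
      workingMemory: { ...workingMemory },
      createdAt: Date.now(),
    });
  }

  restore(label: string): Checkpoint {
    const checkpoint = this.checkpoints.get(label);
    if (!checkpoint) throw new Error(`No checkpoint named: ${label}`);
    return checkpoint;
  }
}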
Error Handling
Errors can occur at multiple levels:
Parse errors in XJSN expressions
Validation errors in arguments
Runtime errors during execution
Resource exhaustion
Each level needs appropriate handling, from gentle correction (helping the AI fix malformed XJSN) to hard stops (preventing resource exhaustion).
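A sketch of how those tiers might be wired together; the error classes and the Outcome shape are hypothetical stand-ins for whatever the real parser and interpreter throw:

class ParseError extends Error {}
class ValidationError extends Error {}
class ResourceError extends Error {}

type Outcome =
  | { ok: true; value: unknown }
  | { ok: false; feedbackForModel: string }; // gentle-correction path

async function runExpression(
  parse: (content: string) => unknown,
  execute: (expr: unknown) => Promise<unknown>,
  content: string
): Promise<Outcome> {
  try {
    const expr = parse(content);
    return { ok: true, value: await execute(expr) };
  } catch (err) {
    if (err instanceof ParseError || err instanceof ValidationError) {
      // Levels 1-2: return the diagnostic so the model can repair its XJSN
      return { ok: false, feedbackForModel: `Invalid XJSN: ${(err as Error).message}` };
    }
    // Levels 3-4: runtime errors and resource exhaustion are hard stops
    throw err;
  }
}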
Future Directions
This work opens several avenues for exploration:
Standard Library Development
Different domains need different XJSN vocabularies. We could develop standard libraries for common patterns:
Business process management
Data pipeline orchestration
UI generation and interaction
Scientific computation
Creative content generation
Visual Programming Integration
XJSN's tree structure maps naturally to visual programming. We could build visual editors that generate XJSN, allowing non-programmers to design agent behaviors.
Formal Verification
Because XJSN expressions are data, we can analyze them formally. We could prove properties like termination, resource bounds, or semantic correctness.
Cross-Agent Communication
Agents could communicate by exchanging XJSN expressions, creating a lingua franca for agent interaction. This would enable sophisticated multi-agent systems without tight coupling.
Conclusion
The question isn't whether your agent can call tools. It's whether your agent can think in a language expressive enough for its domain.
Current frameworks force agents to express complex operational logic through simple tool calls - like writing poetry with emoji. The result is fragile, hard to debug, and impossible to scale beyond toy examples.
The Cascade pattern provides clean control flow without the complexity of state machines. XJSN provides expressiveness without the dangers of arbitrary code execution. Together, they enable agents that can express and execute sophisticated operational patterns.
This isn't about building better agents. It's about giving agents better languages to think in. When we do that, complex patterns that previously required extensive framework machinery become simple expressions of intent.
The implementation is available at [github.com/...]. The XJSN parser handles recursive expressions, lambda functions, and conditional evaluation. The Cascade runtime provides hooks for all three message types with proper async handling and error boundaries.
The future of agent development isn't just about better models or more tools. It's about creating languages that allow agents to express the full richness of computational thought. Only then can we build agents that truly augment human capability rather than merely simulating it.