Dec 10, 2024

Some thoughts on Intelligence Design

Will Chen

Introduction

When ChatGPT launched, I was skeptical. Having spent months with GPT-3's playground writing cheesy AI poetry, I'd seen impressive but ultimately shallow text generation. But ChatGPT was different - instead of cleverly crafting incomplete text to coax out a helpful response, you could just talk to it. The shift from text completion to conversation changed everything. Suddenly, abruptly and involuntarily, our collective imaginations were propelled 25 years into the future, as if by some alien technology made accessible to us. We could envision apps that truly understood us, that could think and reason alongside us. A brave new world to embrace.

We’ve since normalized the use of LLMs — school kids use them every day for homework, bespoke generated images are plastered on every marketing campaign, and a horde of developers are becoming AI-augmented cyborgs through tools like Cursor and Windsurf.

Today, developers are drowning in new AI frameworks and tools - LangChain, AutoGen, countless others promising to make AI development easier. But watch any developer build an AI app today and you'll see them wrestling with prompt engineering and complex frameworks, fighting implementation details that feel irrelevant to the intelligence they're trying to create. Despite all our progress, we're still missing something fundamental about how to work with this technology.

What it’s like to build AI apps today

Working with LLMs today — whether you’re building a coding assistant, a research tool, or an analysis engine — forces you to structure everything as a ChatGPT-style conversation. This isn't a coincidence - it's because the underlying APIs were designed around chat interactions. Send a message, get a response. Text input, text output.

[diagram: show OpenAI API spec]

This messaging pattern makes perfect sense for an app like ChatGPT. But building apps with these primitives feels like contorting every interaction into that format. Want to analyze a document? Turn it into a message. Need to process data? Make it a conversation. Building a coding assistant? More messages. The tools and frameworks all center around prompt chains and message flows, even when what you're building isn't conversational at all.
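To make that contortion concrete, here is roughly what a non-conversational task looks like once it has been squeezed into the chat shape. This is a minimal sketch using the OpenAI Python SDK; the model name and the way the document is stuffed into a single message are illustrative assumptions, not recommendations.

```python
# A non-conversational task ("analyze this document") forced into the
# message shape the API expects. Model name is illustrative.
from openai import OpenAI

client = OpenAI()
document = open("research_notes.txt").read()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a careful document analyst."},
        # The entire document becomes just another chat "message".
        {"role": "user", "content": f"Summarize the key claims:\n\n{document}"},
    ],
)
print(response.choices[0].message.content)
```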

[image: LangGraph code]

But take something as basic as letting AI “access your documents”. This is a fairly common request - people constantly want AI to help understand their research, find patterns in their writing, connect ideas across their notes. Search online for how to build AI apps and you’ll be met with: **vector databases, embedding pipelines, eval suites, multi-agent systems.** It turns into a full-blown science project of measuring similarity scores, evaluating retrieval quality, instrumenting every piece of the pipeline.

[image: search results for YouTube tutorial]

Is this what we should be focused on when designing a simple system for AI to read and understand documents? Is this the first thing that you should be worried about?

The gap between whiteboard and keyboard

Watch how teams design AI systems. Technical or not, most people naturally start off speaking the same language when describing what they want AI to do: 'It should understand this context.' 'It needs to make this decision.' 'The information should flow here.' There’s an emerging universal vocabulary for discussing AI capabilities and expectations.


https://x.com/virattt/status/1870513142981353703

Early frameworks like LangChain tried to systematize how we work with LLMs. They experimented with different models of computation - systems of prompts, chains of operations, ways to make AI behavior predictable. Looking under the hood reveals the messy complexity of this approach: carefully crafted prompts begging the model to return valid JSON, fallback strategies when simpler models fail, layers of abstraction hiding an intricate dance of prompt engineering and hopeful prayers.

Today, two years after the release of ChatGPT, we have enough real data about what works and what doesn’t. Best practices have hardened into a growing consensus about what actually matters when building AI apps — and a new discipline is forming around intelligence design. Let’s now imagine a fresh conceptual model — a “language” from first principles, starting with how people naturally think about and work with AI.

It’s time to reimagine how to build things with AI from the ground up.

Context

We can start with the basics — what we can't change: the fundamental API of large language models. At the end of the day, until the large AI research labs give us something new, we're going to be working with models that take text input and produce text output. This is our base layer - our assembly language. All existing LLM frameworks today are built on top of context manipulation, providing the right context to make these calls.

What is context in an AI application? Not just prompt prefixes or vector lookups. Context represents the living, breathing information state that drives intelligence. Every piece of information that matters for the next decision, every nuance that shapes understanding, every detail that influences behavior - that's context.

Context, not prompt engineering

You might recall some buzz about "prompt engineering" - the art of crafting the perfect instructions. What we’ve learned is that as models have matured and stabilized, fiddling with exact prompt wording has become less critical. What matters more is prompt parameterization: are we giving the model the right information to work with? Even the best prompt can't help if the model lacks crucial context, yet simple prompts are good enough when the model has the right context to draw from — they can always be tuned to be better. This is our first critical insight.
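As a toy illustration of that shift, the sketch below keeps the instruction deliberately plain and varies only the context that gets passed in; the helper and the example context items are hypothetical.

```python
# The prompt stays simple; answer quality is driven by what goes into
# `context`, not by how cleverly the instruction is worded.
def build_prompt(question: str, context: list[str]) -> str:
    context_block = "\n".join(f"- {item}" for item in context)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}"
    )

prompt = build_prompt(
    "What did we decide about the Q3 launch?",
    context=[
        "2024-06-12 meeting notes: Q3 launch moved to September.",
        "Slack #launch thread: marketing assets due August 15.",
    ],
)
```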

When an AI application fails, it probably isn’t because the model isn't capable of the task — and this will be even less of a concern given the pace of intelligence increase. Rather, it's often due to a fault in provided context: working with outdated information, lacking important details, or missing relevant background. The intelligence is there; we just need to feed it the right material.

[image: show comparison of prompt engineering vs. context inputs/outputs]

This shift is significant. It's about context management - ensuring the model has the right information at the right time to perform its task. When context becomes your foundation, things become a lot clearer. Every operation either builds context, transforms it, filters it, or uses it.

You can chain these context operations together naturally: filter by recency, augment with related information, limit to the most relevant pieces. Each operation is simple and clear, but together they create sophisticated context manipulation that drives intelligent behavior.
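A minimal sketch of what such a chain might look like; the ContextItem shape and the relevance score are assumptions, and the operations are deliberately naive.

```python
# Hypothetical context operations: each takes and returns a list of
# context items, so they compose into a simple pipeline.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class ContextItem:
    text: str
    timestamp: datetime
    relevance: float  # assumed to be scored elsewhere

def filter_recent(items: list[ContextItem], days: int = 30) -> list[ContextItem]:
    cutoff = datetime.now() - timedelta(days=days)
    return [i for i in items if i.timestamp >= cutoff]

def top_k(items: list[ContextItem], k: int = 5) -> list[ContextItem]:
    return sorted(items, key=lambda i: i.relevance, reverse=True)[:k]

def build_context(items: list[ContextItem]) -> str:
    # Filter by recency, keep the most relevant pieces, render as text.
    return "\n".join(i.text for i in top_k(filter_recent(items)))
```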

Context explains Memory

"I wish AI would just remember things." This is probably the most common frustration with current AI assistants. You have a long conversation with ChatGPT or Claude, teaching it about your preferences or project, only to have that understanding vanish when you start a new chat. The dream of AI that learns and adapts feels just out of reach.

There are many techniques that are available today to augment AI with memory. Knowledge bases, vector stores, conversation buffers, query memories - frameworks like LangChain offer numerous "memory" types. But these are implementation details, not design patterns. They don't help us think clearly about what memory actually means for AI systems.

Inspect further, and you'll see that memory is really just another form of context management. When we say we want AI to "remember" something, what we actually want is for relevant information to be available when needed. More than simply storing facts, the AI system needs to integrate the persisted data within the code structures that implement its worldview.

[diagram: show how LangChain obscures ACTUAL memory context design — ConversationBufferMemory]

A coding assistant needs to store project structure and connect it with the model's understanding of code architecture, design patterns, and best practices. A research aide needs to archive documents as well as surface relationships and insights that align with its understanding of academic knowledge and reasoning. A personal assistant needs to keep track of user activity and preferences and correlate them with human behavior and decisions.

This context-first view explains why good one-size-fits-all memory solutions don't yet exist. Different applications need different types of "memory" because they need to treat context very differently. Each requires its own patterns for storing, updating, and connecting information.
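As a sketch of what application-specific "memory" can reduce to, here is a deliberately naive version for a project assistant; the class and its keyword-overlap retrieval are placeholders for whatever storage and retrieval the application actually needs.

```python
# "Memory" as a context question: what should be available right now?
class ProjectMemory:
    def __init__(self):
        self.facts: list[str] = []

    def remember(self, fact: str) -> None:
        self.facts.append(fact)

    def recall(self, task: str, limit: int = 5) -> list[str]:
        # Naive keyword overlap stands in for real retrieval
        # (embeddings, graph lookups, SQL, ...).
        words = set(task.lower().split())
        scored = [(len(words & set(f.lower().split())), f) for f in self.facts]
        return [f for score, f in sorted(scored, reverse=True)[:limit] if score]

memory = ProjectMemory()
memory.remember("User prefers TypeScript over JavaScript for new components.")
relevant = memory.recall("scaffold a new frontend component")
```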

Context explains RAG

If you've been following AI development, you've probably heard of RAG (Retrieval-Augmented Generation). It's become somewhat of a buzzword; it’s not uncommon to see people rushing to set up vector databases and knowledge graphs for AI projects before they know what they’re doing. Underneath that hype, RAG is actually the most intuitive thing in the world — it's just about giving AI relevant context when and where it needs it.

You can think of RAG like building a “context lens” - something that focuses and amplifies relevant bits of information for AI calls. We often overcomplicate this because we're thinking like humans rather than LLMs, assuming we need to carefully format and structure everything. But these models actually tolerate messy, poorly formatted data quite well. They'll often surprise you by amplifying coherence in amazing ways when given the right context.

[diagram: show a RAG pipeline]

This realization frees us from worrying about implementation details too early. How that context eventually gets mapped into messages or API calls? That's an engineering concern. The intelligence design should focus on what context we need and when we need it.
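In code, that "context lens" can be as small as the sketch below; search_notes stands in for whatever retrieval you use, and the model name is just an example.

```python
# RAG stripped to its essence: fetch a few relevant snippets and hand
# them to the model alongside the question.
from openai import OpenAI

client = OpenAI()

def answer_with_context(question: str, search_notes) -> str:
    snippets = search_notes(question, limit=3)  # keyword search, embeddings, a DB query...
    context = "\n\n".join(snippets)
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative
        messages=[{
            "role": "user",
            "content": f"Using these notes:\n{context}\n\nAnswer this: {question}",
        }],
    )
    return response.choices[0].message.content
```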

Context explains UI

This context-oriented perspective helps us understand the role of components in AI applications. Take user interfaces - they're not just about capturing explicit inputs and interactions. Now, they also double as context collection points, constantly refining the AI's understanding in real-time. Every user action, every document change, every system event becomes a potential context source. This is what people miss when they dismiss something as "just another GPT wrapper." The magic isn’t in hiding away prompts, but in orchestrating the nuanced capture of activity and constructing context in the background.

[animation: show AI apps like cursor tab]

Look at Cursor, for example. What makes it feel magical isn't just that it calls GPT-4. It's that it's constantly tracking your movement, your browsing, your typing patterns. When you hit tab to complete code and jump around, it’s leveraging rich context about your project, your recent edits, your coding style. The fluid experience comes from smart context management and good engineering, not clever prompts.

Decisions

Beyond context management, there's another fundamental primitive we get from modern AI APIs: tool calling. When you make an LLM call, you can provide a description of available tools - essentially asking the AI to choose from a set of possible actions. The API enforces this through JSON schemas, giving us guaranteed structure in how AI makes these choices.

[image: show tool calling]

This capability has led to the concept of "AI agents" - systems that can use tools to accomplish tasks. But current frameworks make this surprisingly complex. Look at any LangGraph implementation and you'll see a maze of tool nodes, conditional edges, and state management - all just to let AI make and execute choices.

The complexity comes from focusing on the wrong abstraction. We're building infrastructure around tools when what we really want is decisions. When AI uses a tool, what's actually happening? It's making a choice - deciding that some action is appropriate given its current context. The tool calling mechanism is just an implementation detail - a way to make that decision executable.
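Here is roughly what that looks like at the API level. The schema follows OpenAI's published tool-calling format, but the search tool itself is made up; the point is that the "tool" is really a decision the model is allowed to make.

```python
from openai import OpenAI

client = OpenAI()

# The "tool" is a decision: should I look something up before answering?
tools = [{
    "type": "function",
    "function": {
        "name": "search_documents",
        "description": "Look up more information before answering.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "What to search for."},
            },
            "required": ["query"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative
    messages=[{"role": "user", "content": "What changed in the Q3 plan?"}],
    tools=tools,
)
message = response.choices[0].message
if message.tool_calls:        # the model decided to search
    pass                      # execute the decision, feed the result back
else:
    print(message.content)    # the model decided it could answer directly
```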

Decisions over tool-calling

This realization points to another fundamental principle of intelligence design: decisions are more natural building blocks than tools. Think about how you describe what you want an AI system to do: "It should decide when to search for more information." "It needs to choose between summarizing or asking for clarification." "It should determine if the answer is complete."

The questions that matter in design become clearer:

  • What decisions does the system need to make?

  • When should these decisions happen?

  • What context should inform each choice?

  • What options should be available?

  • What reasoning, if any, should be captured?

This matches how we naturally think about intelligence. We don't think in terms of tool registration and routing logic - we think about decision points and their implications. The technical details of how those decisions get implemented through tool calls become secondary to the core design of the system's decision-making capability.

What choices does your AI system need to make? When does it need to make them? What context informs those choices? These questions lead to clearer designs because they match how we naturally think about intelligence.

In a decision-first design, you map out decision points: "Should we search for more information here? Should we summarize this content? Should we ask for clarification?" Each decision point becomes clear, contextual, purposeful. The technical implementation of how these decisions get executed becomes secondary to the intelligence design.
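One hedged sketch of what a decision point can look like in code: enumerate the options explicitly and let the model pick one. The call_llm helper and the option names are hypothetical.

```python
from enum import Enum

class NextStep(str, Enum):
    SEARCH = "search_for_more_information"
    SUMMARIZE = "summarize_current_content"
    CLARIFY = "ask_for_clarification"

def decide(context: str, call_llm) -> NextStep:
    # The decision point is explicit; how it executes is a separate concern.
    options = ", ".join(step.value for step in NextStep)
    answer = call_llm(
        f"Given this context:\n{context}\n\n"
        f"Choose exactly one next step from: {options}.\n"
        "Reply with the step name only."
    )
    return NextStep(answer.strip())
```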

Understanding Agents

Ask five developers what an "agent" is, and you might get five different answers - each focusing on different implementation details like tool usage, memory systems, or state management. Why? Because we're trying to define agents by their implementation details instead of their decision patterns.

In current frameworks, agents are complex technical constructs: typically a state machine that manages tools, handles memory, routes responses, and coordinates actions. No wonder people are confused! We've buried the essential nature of agency — the ability to make and execute decisions - under layers of technical complexity.

Strip away the hype, and an agent is simply a decision loop: observe, decide, act. The AI receives context, makes a decision, and sometimes executes that decision through tool use. That's it. The implementation details - whether it uses tools, how it manages state, what framework it uses - these become secondary to the core intelligence design.

[image: show a basic decision loop]

Take the classic ReAct pattern: the AI reasons about its situation, chooses an action, observes the result, and repeats. This isn't magical - it's just a structured way to make decisions with feedback. Even "multi-agent systems," despite their complexity, are just multiple decision loops interacting. The sophisticated-looking frameworks and architectures don't actually give you more control over the intelligence design - they just add complexity to this basic pattern.

[react code]
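A minimal sketch of what that loop might look like (the decide and execute helpers are placeholders for an LLM call and for whatever runs the chosen action):

```python
# Observe, decide, act, repeat: a bare-bones ReAct-style decision loop.
def run_agent(goal: str, decide, execute, max_steps: int = 10) -> str:
    context = [f"Goal: {goal}"]
    for _ in range(max_steps):
        decision = decide(context)        # the LLM chooses the next action
        if decision.kind == "finish":     # hypothetical decision shape
            return decision.answer
        observation = execute(decision)   # act on the choice
        context.append(f"Did {decision.kind}; observed: {observation}")
    return "Stopped: step limit reached."
```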

Decision-based Architecture

A simpler architecture emerges when you focus on decisions:

[DIAGRAM: contextualize all points in an agent flow]

Decision Points:

  • What context is needed

  • What choices are available

  • What outcomes are possible

Execution:

  • How decisions become actions

  • How results feed back

  • How to handle failures

State:

  • What context to maintain

  • What history matters

  • How results affect future decisions

This isn't about replacing existing patterns - tools, chains, and routing all serve a purpose at the implementation level. But by thinking in decisions first, we can design AI systems that match how we actually want them to behave.
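The three concerns above can be sketched as a very small skeleton; every name here is an assumption, not a proposed API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class DecisionPoint:           # what context is needed, what choices exist
    needed_context: list[str]
    options: list[str]

@dataclass
class AgentState:              # what to maintain between decisions
    context: list[str] = field(default_factory=list)
    history: list[str] = field(default_factory=list)

def step(point: DecisionPoint, state: AgentState,
         choose: Callable[[DecisionPoint, AgentState], str],
         execute: Callable[[str], Any]) -> None:
    choice = choose(point, state)                   # decision
    result = execute(choice)                        # execution
    state.history.append(f"{choice} -> {result}")   # state update
```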

Experimental: Grammars-of-Thought

Building off decisions as a foundation enables exciting explorations that are hard to conceive of otherwise in the industry’s current frame of chains, agents, and tools.

Decision patterns naturally evolve into a “grammar” for intelligence design. Like any language, these patterns have rules that govern how thoughts combine and flow. This discovery opens up entirely new ways to think about AI systems.

Consider how humans make complex decisions. We don't follow rigid flowcharts - we use flexible patterns of thinking. Sometimes we reason step-by-step (chain-of-thought). Other times we plan then execute. Sometimes we react to new information and adapt. These strategies are cognitive grammar rules, patterns that guide how decisions flow and combine.

[Diagram: Show three parallel decision patterns:

  • Chain-of-thought: A->B->C (linear reasoning)

  • Plan-and-execute: Plan(A,B,C)->Execute

  • React: Observe->Think->Act->Loop]

Current agent frameworks struggle to capture this fluidity. Their rigid graphs of nodes and edges try to map every possible path, missing the fundamental strength of language models: their ability to understand and apply patterns with flexibility and nuance.

What if instead of hardcoding decision trees, we gave our AI systems grammars of thought?

Here's how it might work: A grammar-based approach changes everything. Rather than specifying exact sequences, we define the rules of thought:
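(The notation below is purely illustrative, loosely borrowed from EBNF-style grammar rules; the step names are one possible vocabulary, not a fixed spec.)

```
task      := plan execute* conclude
plan      := gather_context decide_approach
execute   := (reason act observe)+          # a ReAct-style inner loop
act       := search | summarize | ask_user
conclude  := synthesize answer
```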


Such notation reshapes how we design agent behavior. The system generates valid "sentences" in this decision language, with each pattern rule becoming a composable unit of thought. Want sophisticated behavior? Simply combine grammar rules - perhaps an agent that plans broadly, then uses reactive patterns to handle details.

The magic happens when you translate these grammars into executable agent orchestrations. The system optimizes these decision patterns, pruning redundant steps while preserving the grammar's guarantees. No invalid sequences can emerge, yet the AI retains freedom to operate creatively within these constraints — controllable and expressive. It's like having guardrails for intelligence that don't restrict its power.

[Diagram: Show grammar rules compiling into optimized decision flow, with unnecessary steps eliminated and parallel paths merged]

This approach resolves a core tension in agent design: the balance between control and flexibility. Where traditional frameworks force a choice between rigid graphs or unpredictable behavior, grammar-based decision patterns give you both - the AI can be creative within well-defined patterns of thought.

Metaprogramming

This grammar-based approach scales elegantly with complexity. While traditional agent frameworks buckle under expanding decision trees - drowning in states, edges, and failure points - grammars thrive through composition. Like natural languages expressing infinite ideas through finite rules, adding new capabilities simply means introducing new grammar rules. The underlying patterns remain clean and manageable.

An exciting possibility this unlocks is performant AI metaprogramming. When decision patterns at this level become manipulatable as data, AI systems gain the ability to transform and optimize their own behavior patterns. Agents can learn new grammar rules from trial and error, refine their decision patterns through use, and discover novel combinations of existing patterns. The grammar serves simultaneously as code and data, defining a natural path to recursive self-improvement.
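As a hedged sketch of "grammar as data", the productions from the earlier notation could live as plain data that an agent, or a human, extends at runtime; all names here are hypothetical.

```python
# The grammar is just data, so new capabilities arrive as new rules
# rather than as new nodes and edges in a hand-built graph.
grammar: dict[str, list[str]] = {
    "task":    ["plan", "execute*", "conclude"],
    "plan":    ["gather_context", "decide_approach"],
    "execute": ["reason", "act", "observe"],
    "act":     ["search | summarize | ask_user"],
}

def add_rule(grammar: dict[str, list[str]], name: str,
             expansion: list[str]) -> dict[str, list[str]]:
    """Introduce or replace a production without touching the rest of the grammar."""
    return {**grammar, name: expansion}

# e.g. an agent that has learned a verification habit could extend itself:
grammar = add_rule(grammar, "conclude", ["verify", "synthesize", "answer"])
```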

[image: AI modifying its own patterns — A → A’]

With the proper feedback mechanisms, AIs can write and refine their own playbooks. Where the agents of today rely on predetermined state-machine nodes and edges, such a system composes and evolves decision patterns optimized for its tasks. In this framework, we can leverage AI intelligence to discover efficient patterns, combine rules in unexpected ways, and generate novel grammar productions we haven’t thought of.

We're just beginning to explore this territory, but the implications are profound. Just as programming languages evolved from machine code to high-level abstractions, intelligence design could evolve from tool-calling to grammars of thought. The future of AI development might not be about writing decision trees, but about designing cognitive grammars that shape how artificial minds think.

What most AI frameworks get wrong

Here's a counterintuitive truth I've discovered: making intelligence patterns more explicit actually makes AI development more accessible, not less. This sounds wrong until you really think about it. We're learning that with AI, the complexity you need to be focusing on isn't in the implementation - it's in the intelligence design itself. When we obscure important aspects of that design using abstractions intended to simplify implementation, we make everything harder.

Everyone naturally understands how context and decisions work in their own thinking. When you ask someone how they research a topic, they don't naturally start with vector databases and embedding pipelines. They describe gathering relevant information, deciding what matters, making connections, synthesizing understanding. These are the real primitives of intelligence - and they're intuitive to everyone.

People complain about the abstractions in current frameworks like LangChain, but the problem isn't that there are too many of them - it's that they hide the wrong things. They expose implementation details while hiding the patterns that actually matter: context flows, decision points, intelligence composition. It’s time we realized that AI apps aren’t just normal apps with AI features sprinkled here and there — and that we need a new paradigm for designing these applications.

Visual tools for AI

There's no shortage of visual tools for building stuff with AI today. However, upon inspection you'll see they're essentially visual programming tools with AI components - nodes and edges representing code operations. But intelligence design needs its own visual grammar, one that matches how we think about and discuss AI systems.

[image: collage compilation of various AI tools]

Every visual tool encodes a language through its metaphors and patterns. Spreadsheets use cells and formulas to express data relationships. CAD software uses geometric primitives to express physical design. AI, too, will need its own visual language - one that can bridge the gap between whiteboard discussions and working implementations.

Traditional frameworks force us to express AI systems through code - chains, agents, tools. But when you watch people design AI systems, they naturally draw flows of information, decision points, feedback loops. This disconnect between how we think about AI and how we implement it isn't just inconvenient - it's blocking innovation.

The visual tools we have today show us implementation details - arrows connecting nodes that represent code operations, memory buffers, state transitions. But what if our visual language focused on intelligence patterns instead? What if we could see context flowing through our system, watch decision points shape that flow, track how information transforms and combines?

Yes, pretty diagrams make things easier. But it's more about having a visual grammar that matches how we think about intelligence. When you can see context building up, when you can trace decision flows, when you can watch information transform - you start to spot patterns you'd miss in code. You see opportunities for composition that implementation details obscure.

Intelligence Design is Visual

This visual approach to intelligence design points to a broader shift needed in AI development. The frameworks we have today served their purpose - they helped us understand what's possible with LLMs and gave us initial patterns to build from. But continuing down this path means accepting their limitations, their complexities, their fundamental mismatch with how we think about intelligence.

We need to:

  • Make intelligence patterns our primary building blocks

  • Design tools that match natural cognitive models

  • Let implementation details serve the design, not dictate it

The future of AI development isn't about more complex frameworks or clever abstractions. It's about working with the right primitives - context, decisions, flows. These aren't just implementation patterns - they're the building blocks of intelligence design itself.

What we’re building at Idyllic

At Idyllic, we're exploring how to make these ideas concrete. We believe the future of AI development needs new primitives that match how we naturally think about intelligence. While visual tools are part of the solution, the deeper challenge is creating the right abstractions - ones that let us express intelligence flows as naturally as we imagine them.

We're starting with the fundamentals: rethinking how context flows through systems, how decisions shape that flow, and how intelligence emerges from their composition. Our early experiments suggest that when you get these primitives right, building AI applications becomes more intuitive, more powerful, and more accessible to everyone who has ideas about how intelligence should work.

Intelligence design is emerging as its own discipline, distinct from traditional software development. It needs its own tools, patterns, and ways of thinking. This is what we're researching at Idyllic - starting with a visual grammar that matches how we naturally reason about intelligence.

Our early work compiles to existing frameworks, but that's just the beginning. We're creating a platform where intelligence design becomes concrete, where visual patterns become working systems, where ideas flow naturally from conception to implementation.

If you're interested in shaping how we build with AI, visit https://idyllic.so. Whether you're a developer frustrated with current tools, a designer who sees AI's potential, or someone who believes AI development can be more intuitive - we'd love to hear from you.

Introduction

When ChatGPT launched, I was skeptical. Having spent months with GPT-3's playground writing cheesy AI poetry, I'd seen impressive but ultimately shallow text generation. But ChatGPT was different - instead of cleverly crafting incomplete text to coax out a helpful response, you could just talk to it. The shift from text completion to conversation changed everything. Suddenly our collective imaginations were propelled 25 years into the future abruptly and involuntarily, as if by some alien technology that was made accessible to us, we could envision apps that truly understood us, that could think and reason alongside us. A brave new world to embrace.

We’ve since normalized the use of LLMs — school kids using it everyday for homework, bespoke generated images plastered on every marketing campaign, and a horde of developers becoming AI-augmented cyborgs through tools like Cursor and Windsurf.

Today, developers are drowning in new AI frameworks and tools - LangChain, AutoGen, countless others promising to make AI development easier. But watch any developer build an AI app today and you'll see them wrestling with prompt engineering and complex frameworks, fighting implementation details that feel irrelevant to the intelligence they're trying to create. Despite all our progress, we're still missing something fundamental about how to work with this technology.

What it’s like to build AI apps today

Working with LLMs today — whether you’re building coding assistant, a research tool, or an analysis engine, — forces you to structure everything as a ChatGPT-style conversation. This isn't a coincidence - it's because the underlying APIs were designed around chat interactions. Send a message, get a response. Text input, text output.

[diagram: show OpenAI API spec]

This messaging pattern makes perfect sense for an app like ChatGPT. But building apps with these primitives feel like trying to contort interaction into this format. Want to analyze a document? Turn it into a message. Need to process data? Make it a conversation. Building a coding assistant? More messages. The tools and frameworks all center around prompt chains and message flows, even when what you're building isn't conversational at all.

[image: LangGraph code]

But take something as basic as letting AI “access your documents”. This is a fairly common request - people constantly want AI to help understand their research, find patterns in their writing, connect ideas across their notes. Search online for how to build AI apps and you’ll be met with**: vector databases, embedding pipelines, eval suites, multi-agent systems.** It turns into a full-blown science project of measuring similarity scores, evaluating retrieval quality, instrumenting every piece of the pipeline.

[image: search results for YouTube tutorial]

Is this what we should be focused on when designing a simple system for AI to read and understand documents? Is this the first thing that you should be worried about?

The gap between whiteboard and keyboard

Watch how teams design AI systems — be they technical or not, most people naturally start off speaking the same language when describing what they want AI to do. 'It should understand this context.' 'It needs to make this decision.' 'The information should flow here.' There’s an emerging universal vocabulary for discussing AI capabilities and expectations.


https://x.com/virattt/status/1870513142981353703

Early frameworks like LangChain tried to systematize how we work with LLMs. They experimented with different models of computation - systems of prompts, chains of operations, ways to make AI behavior predictable. Looking under the hood reveals the messy complexity of this approach: carefully crafted prompts begging the model to return valid JSON, fallback strategies when simpler models fail, layers of abstraction hiding an intricate dance of prompt engineering and hopeful prayers.

Today, 2 years after the release of ChatGPT, we have enough real data about what works and what doesn’t. Patterns of best practices have established a growing consensus about what actually matters when building AI apps — and a new discipline is forming around intelligence design. Let’s now imagine a fresh conceptual model — a “language” from first principles, starting with how people naturally think about and work with AI.

It’s time to reimagine how to build things with AI from the ground up.

Context

We can start with the basics — what we can't change: the fundamental API of large language models. At the end of the day, until the large AI research labs give us something new, we're going to be working with models that take text input and produce text output. This is our base layer - our assembly language. All existing LLM frameworks today are built on top of context manipulation, providing the right context to make these calls.

What is context in an AI application? Not just prompt prefixes or vector lookups. Context represents the living, breathing information state that drives intelligence. Every piece of information that matters for the next decision, every nuance that shapes understanding, every detail that influences behavior - that's context.

Context, not prompt engineering

You might recall some buzz about "prompt engineering" - the art of crafting the perfect instructions. What’s we’ve learned is that as models have matured and stabilized, fiddling with exact prompt wording has become less critical. What matters more is prompt parameterization: are we giving the model the right information to work with? Even the best prompt can't help if the model lacks crucial context, yet simple prompts are good enough when the model has the right context to draw from — they can always be tuned to be better. This is our first critical insight.

When an AI application fails, it probably isn’t because the model isn't capable of the task — and this will be even less of a concern given the pace of intelligence increase. Rather, it's often due to a fault in provided context: working with outdated information, lacking important details, or missing relevant background. The intelligence is there; we just need to feed it the right material.

[image: show comparison of prompt engineering vs. context inputs/outputs]

This shift is significant. It's about context management - ensuring the model has the right information at the right time to perform its task. When context becomes your foundation, things become a lot clearer. Every operation either builds context, transforms it, filters it, or uses it.

You can chain these context operations together naturally: filter by recency, augment with related information, limit to the most relevant pieces. Each operation is simple and clear, but together they create sophisticated context manipulation that drives intelligent behavior.

Context explains Memory

"I wish AI would just remember things." This is probably the most common frustration with current AI assistants. You have a long conversation with ChatGPT or Claude, teaching it about your preferences or project, only to have that understanding vanish when you start a new chat. The dream of AI that learns and adapts feels just out of reach.

There are many techniques that are available today to augment AI with memory. Knowledge bases, vector stores, conversation buffers, query memories - frameworks like LangChain offer numerous "memory" types. But these are implementation details, not design patterns. They don't help us think clearly about what memory actually means for AI systems.

Inspect further, and you'll see that memory is really just another form of context management. When we say we want AI to "remember" something, what we actually want is for relevant information to be available when needed. More than simply storing facts, the AI system needs to integrate the persisted data within code structures that implements its worldview.

[diagram: show how LangChain obscures ACTUAL memory context design — conversationalbuffermemory]

A coding assistant stores project structure and connects with the model's understanding of code architecture, design patterns, and best practices. A research aide needs to archive documents as well as surface relationships and insights that align with its understanding of academic knowledge and reasoning. A personal assistant needs to keep track of user activity and preferences and correlate them with human behavior and decisions.

This context-first view explains why good one-size-fits-all memory solutions don't yet exist. Different applications need different types of "memory" because they need to treat context very differently. Each requires its own patterns for storing, updating, and connecting information.

Context explains RAG

If you've been following AI development, you've probably heard of RAG (Retrieval-Augmented Generation). It's become somewhat of a buzzword; it’s not uncommon to see people rushing to set up vector databases, knowledge graphs for AI projects before they know what they’re doing. Underneath that hype, RAG is actually the most intuitive thing in the world — it's just about giving AI relevant context when and where it needs it.

You can think of RAG like building a “context lens” - something that focuses and amplifies relevant bits of information for AI calls. We often overcomplicate this because we're thinking like humans rather than LLMs, assuming we need to carefully format and structure everything. But these models actually tolerate messy, poorly formatted data quite well. They'll often surprise you by amplifying coherence in amazing ways when given the right context.

[diagram: show a RAG pipeline]

This realization frees us from worrying about implementation details too early. How that context eventually gets mapped into messages or API calls? That's an engineering concern. The intelligence design should focus on what context we need and when we need it.

Context explains UI

This context-oriented perspective helps us understand the role of components in AI applications. Take user interfaces - they're not just about capturing explicit inputs and interactions. Now, they also double as context collection points, constantly refining the AI's understanding in real-time. Every user action, every document change, every system event becomes a potential context source. This is what people miss when they dismiss something as "just another GPT wrapper." The magic isn’t in hiding away prompts, but in orchestrating the nuanced capture of activity and constructing context in the background.

[animation: show AI apps like cursor tab]

Look at Cursor, for example. What makes it feel magical isn't just that it calls GPT-4. It's that it's constantly tracking your movement, your browsing, your typing patterns. When you hit tab to complete code and jump around, it’s leveraging rich context about your project, your recent edits, your coding style. The fluid experience comes from smart context management from good engineering, not clever prompts.

Decisions

Beyond context management, there's another fundamental primitive we get from modern AI APIs: tool calling. When you make an LLM call, you can provide a description of available tools - essentially asking the AI to choose from a set of possible actions. The API enforces this through JSON schemas, giving us guaranteed structure in how AI makes these choices.

[image: show tool calling]

This capability has led to the concept of "AI agents" - systems that can use tools to accomplish tasks. But current frameworks make this surprisingly complex. Look at any LangGraph implementation and you'll see a maze of tool nodes, conditional edges, and state management - all just to let AI make and execute choices.

The complexity comes from focusing on the wrong abstraction. We're building infrastructure around tools when what we really want is decisions. When AI uses a tool, what's actually happening? It's making a choice - deciding that some action is appropriate given its current context. The tool calling mechanism is just implementation detail - a way to make that decision executable.

Decisions over tool-calling

This realization points to another fundamental principle of intelligence design: decisions are more natural building blocks than tools. Think about how you describe what you want an AI system to do: "It should decide when to search for more information." "It needs to choose between summarizing or asking for clarification." "It should determine if the answer is complete."

The questions that matter in design become clearer:

  • What decisions does the system need to make?

  • When should these decisions happen?

  • What context should inform each choice?

  • What options should be available?

  • What reasoning, if any, should be captured?

This matches how we naturally think about intelligence. We don't think in terms of tool registration and routing logic - we think about decision points and their implications. The technical details of how those decisions get implemented through tool calls becomes secondary to the core design of the system's decision-making capability.

What choices does your AI system need to make? When does it need to make them? What context informs those choices? These questions lead to clearer designs because they match how we naturally think about intelligence.

In a decision-first design, you map out decision points: "Should we search for more information here? Should we summarize this content? Should we ask for clarification?" Each decision point becomes clear, contextual, purposeful. The technical implementation of how these decisions get executed becomes secondary to the intelligence design.

Understanding Agents

Ask five developers what an "agent" is, and you might get five different answers - each focusing on different implementation details like tool usage, memory systems, or state management. Why? Because we're trying to define agents by their implementation details instead of their decision patterns.

In current frameworks, agents are complex technical constructs: typically a state machine that manages tools, handles memory, routes responses, and coordinates actions. No wonder people are confused! We've buried the essential nature of agency — the ability to make and execute decisions - under layers of technical complexity.

Strip away the hype, and an agent is simply a decision loop: observe, decide, act. The AI receives context, makes a decision, and sometimes executes that decision through tool use. That's it. The implementation details - whether it uses tools, how it manages state, what framework it uses - these become secondary to the core intelligence design.

[image: show a basic decision loop]

Take the classic ReAct pattern: the AI reasons about its situation, chooses an action, observes the result, and repeats. This isn't magical - it's just a structured way to make decisions with feedback. Even "multi-agent systems," despite their complexity, are just multiple decision loops interacting. The sophisticated-looking frameworks and architectures don't actually give you more control over the intelligence design - they just add complexity to this basic pattern.

[react code]

Decision-based Architecture

A simpler architecture emerges when you focus on decisions:

[DIAGRAM: contextualize all points in an agent flow]

Decision Points:

  • What context is needed

  • What choices are available

  • What outcomes are possible

Execution:

  • How decisions become actions

  • How results feed back

  • How to handle failures

State:

  • What context to maintain

  • What history matters

  • How results affect future decisions

This isn't about replacing existing patterns - tools, chains, and routing all serve a purpose at the implementation level. But by thinking in decisions first, we can design AI systems that match how we actually want them to behave.

Experimental: Grammars-of-Thought

Building off decisions as a foundation enables exciting explorations that are hard to conceive of otherwise in the industry’s current frame of chains, agents, and tools.

Decision patterns naturally evolve into a “grammar” for intelligence design. Like any language, these patterns have rules that govern how thoughts combine and flow. The discovery opens up entirely new ways to think about AI systems.

Consider how humans make complex decisions. We don't follow rigid flowcharts - we use flexible patterns of thinking. Sometimes we reason step-by-step (chain-of-thought). Other times we plan then execute. Sometimes we react to new information and adapt. These strategies are cognitive grammar rules, patterns that guide how decisions flow and combine.

[Diagram: Show three parallel decision patterns:

  • Chain-of-thought: A->B->C (linear reasoning)

  • Plan-and-execute: Plan(A,B,C)->Execute

  • React: Observe->Think->Act->Loop]

Current agent frameworks struggle to capture this fluidity. Their rigid graphs of nodes and edges try to map every possible path, missing the fundamental strength of language models: their ability to understand and apply patterns with flexibility and nuance.

What if instead of hardcoding decision trees, we gave our AI systems grammars of thought?

Here's how it might work: A grammar-based approach changes everything. Rather than specifying exact sequences, we define the rules of thought:


Such notation reshapes how we design agent behavior. The system generates valid "sentences" in this decision language, with each pattern rule becoming a composable unit of thought. Want sophisticated behavior? Simply combine grammar rules - perhaps an agent that plans broadly, then uses reactive patterns to handle details.

The magic happens when you translate these grammars into executable agent orchestrations. The system optimizes these decision patterns, pruning redundant steps while preserving the grammar's guarantees. No invalid sequences can emerge, yet the AI retains freedom to operate creatively within these constraints — controllable and expressive. It's like having guardrails for intelligence that don't restrict its power.

[Diagram: Show grammar rules compiling into optimized decision flow, with unnecessary steps eliminated and parallel paths merged]

This approach resolves a core tension in agent design: the balance between control and flexibility. Where traditional frameworks force a choice between rigid graphs or unpredictable behavior, grammar-based decision patterns give you both - the AI can be creative within well-defined patterns of thought.

Metaprogramming

This grammar-based approach scales elegantly with complexity. While traditional agent frameworks buckle under expanding decision trees - drowning in states, edges, and failure points - grammars thrive through composition. Like natural languages expressing infinite ideas through finite rules, adding new capabilities simply means introducing new grammar rules. The underlying patterns remain clean and manageable.

An exciting possibility this unlocks is performant AI metaprogramming. When decision patterns at this level become manipulatable as data, AI systems gain the ability to transform and optimize their own behavior patterns. Agents can learn new grammar rules from trial and error, refine their decision patterns through use, and discover novel combinations of existing patterns. The grammar serves simultaneously as code and data, defining a natural path to recursive self-improvement.

[image: AI modifying its own patterns — A → A’]

With the proper feedback mechanisms, AIs can writing and refining their own playbooks. The agents of today rely on predetermined state machine nodes and edges, it composes and evolves decision patterns optimized for its tasks. In this framework, we can leverage AI intelligence in discovering efficient patterns, combining rules in unexpected ways, and generating novel grammar productions we haven’t thought of.

We're just beginning to explore this territory, but the implications are profound. Just as programming languages evolved from machine code to high-level abstractions, intelligence design could evolve from tool-calling to grammars of thought. The future of AI development might not be about writing decision trees, but about designing cognitive grammars that shape how artificial minds think.

What most AI frameworks get wrong

Here's a counterintuitive truth I've discovered: making intelligence patterns more explicit actually makes AI development more accessible, not less. This sounds wrong until you really think about it. We're learning that with AI, the complexity you need to be focusing on isn't in the implementation - it's in the intelligence design itself. When we obscure important aspects of that design using abstractions intended to simplify implementation, we make everything harder.

Everyone naturally understands how context and decisions work in their own thinking. When you ask someone how they research a topic, they don't talk naturally start with vector databases and embedding pipelines. They describe gathering relevant information, deciding what matters, making connections, synthesizing understanding. These are the real primitives of intelligence - and they're intuitive to everyone.

People complain about the abstractions in current frameworks like LangChain, but the problem isn't that they're too many of them - it's that they’re using these abstractions to hide the wrong things. They expose implementation details while hiding the patterns that actually matter: context flows, decision points, intelligence composition. It’s time we realized that AI apps aren’t just normal apps with AI features sprinkled here and there — that we need a new paradigm for how to designing these applications.

Visual tools for AI

There's no shortage of visual tools for building stuff with AI today. However, upon inspection you'll see they're essentially visual programming tools with AI components - nodes and edges representing code operations. But intelligence design needs its own visual grammar, one that matches how we think about and discuss AI systems.

[image: collage compilation of various AI tools]

Every visual tool encodes a language through its metaphors and patterns. Spreadsheets use cells and formulas to express data relationships. CAD software uses geometric primitives to express physical design. AI, too will needs its own visual language - one that can bridge the gap between whiteboard discussions and working implementations.

Traditional frameworks force us to express AI systems through code - chains, agents, tools. But when you watch people design AI systems, they naturally draw flows of information, decision points, feedback loops. This disconnect between how we think about AI and how we implement it isn't just inconvenient - it's blocking innovation.

The visual tools we have today show us implementation details - arrows connecting nodes that represent code operations, memory buffers, state transitions. But what if our visual language focused on intelligence patterns instead? What if we could see context flowing through our system, watch decision points shape that flow, track how information transforms and combines?

Yes, pretty diagrams make things easier. But it's more about having a visual grammar that matches how we think about intelligence. When you can see context building up, when you can trace decision flows, when you can watch information transform - you start to spot patterns you'd miss in code. You see opportunities for composition that implementation details obscure.

Intelligence Design is Visual

This visual approach to intelligence design points to a broader shift needed in AI development. The frameworks we have today served their purpose - they helped us understand what's possible with LLMs and gave us initial patterns to build from. But continuing down this path means accepting their limitations, their complexities, their fundamental mismatch with how we think about intelligence.

We need to:

  • Make intelligence patterns our primary building blocks

  • Design tools that match natural cognitive models

  • Let implementation details serve the design, not dictate it

The future of AI development isn't about more complex frameworks or clever abstractions. It's about working with the right primitives - context, decisions, flows. These aren't just implementation patterns - they're the building blocks of intelligence design itself.

What we’re building at Idyllic

At Idyllic, we're exploring how to make these ideas concrete. We believe the future of AI development needs new primitives that match how we naturally think about intelligence. While visual tools are part of the solution, the deeper challenge is creating the right abstractions - ones that let us express intelligence flows as naturally as we imagine them.

We're starting with the fundamentals: rethinking how context flows through systems, how decisions shape that flow, and how intelligence emerges from their composition. Our early experiments suggest that when you get these primitives right, building AI applications becomes more intuitive, more powerful, and more accessible to everyone who has ideas about how intelligence should work.

Intelligence design is emerging as its own discipline, distinct from traditional software development. It needs its own tools, patterns, and ways of thinking. This is what we're researching at Idyllic - starting with a visual grammar that matches how we naturally reason about intelligence.

Our early work compiles to existing frameworks, but that's just the beginning. We're creating a platform where intelligence design becomes concrete, where visual patterns become working systems, where ideas flow naturally from conception to implementation.

If you're interested in shaping how we build with AI, visit https://idyllic.so. Whether you're a developer frustrated with current tools, a designer who sees AI's potential, or someone who believes AI development can be more intuitive - we'd love to hear from you.

Introduction

When ChatGPT launched, I was skeptical. Having spent months with GPT-3's playground writing cheesy AI poetry, I'd seen impressive but ultimately shallow text generation. But ChatGPT was different - instead of cleverly crafting incomplete text to coax out a helpful response, you could just talk to it. The shift from text completion to conversation changed everything. Suddenly our collective imaginations were propelled 25 years into the future abruptly and involuntarily, as if by some alien technology that was made accessible to us, we could envision apps that truly understood us, that could think and reason alongside us. A brave new world to embrace.

We’ve since normalized the use of LLMs — school kids using it everyday for homework, bespoke generated images plastered on every marketing campaign, and a horde of developers becoming AI-augmented cyborgs through tools like Cursor and Windsurf.

Today, developers are drowning in new AI frameworks and tools - LangChain, AutoGen, countless others promising to make AI development easier. But watch any developer build an AI app today and you'll see them wrestling with prompt engineering and complex frameworks, fighting implementation details that feel irrelevant to the intelligence they're trying to create. Despite all our progress, we're still missing something fundamental about how to work with this technology.

What it’s like to build AI apps today

Working with LLMs today — whether you’re building coding assistant, a research tool, or an analysis engine, — forces you to structure everything as a ChatGPT-style conversation. This isn't a coincidence - it's because the underlying APIs were designed around chat interactions. Send a message, get a response. Text input, text output.

[diagram: show OpenAI API spec]

This messaging pattern makes perfect sense for an app like ChatGPT. But building apps with these primitives feel like trying to contort interaction into this format. Want to analyze a document? Turn it into a message. Need to process data? Make it a conversation. Building a coding assistant? More messages. The tools and frameworks all center around prompt chains and message flows, even when what you're building isn't conversational at all.

[image: LangGraph code]

But take something as basic as letting AI “access your documents”. This is a fairly common request - people constantly want AI to help understand their research, find patterns in their writing, connect ideas across their notes. Search online for how to build AI apps and you’ll be met with**: vector databases, embedding pipelines, eval suites, multi-agent systems.** It turns into a full-blown science project of measuring similarity scores, evaluating retrieval quality, instrumenting every piece of the pipeline.

[image: search results for YouTube tutorial]

Is this what we should be focused on when designing a simple system for AI to read and understand documents? Is this the first thing that you should be worried about?

The gap between whiteboard and keyboard

Watch how teams design AI systems — be they technical or not, most people naturally start off speaking the same language when describing what they want AI to do. 'It should understand this context.' 'It needs to make this decision.' 'The information should flow here.' There’s an emerging universal vocabulary for discussing AI capabilities and expectations.


https://x.com/virattt/status/1870513142981353703

Early frameworks like LangChain tried to systematize how we work with LLMs. They experimented with different models of computation - systems of prompts, chains of operations, ways to make AI behavior predictable. Looking under the hood reveals the messy complexity of this approach: carefully crafted prompts begging the model to return valid JSON, fallback strategies when simpler models fail, layers of abstraction hiding an intricate dance of prompt engineering and hopeful prayers.

Today, 2 years after the release of ChatGPT, we have enough real data about what works and what doesn’t. Patterns of best practices have established a growing consensus about what actually matters when building AI apps — and a new discipline is forming around intelligence design. Let’s now imagine a fresh conceptual model — a “language” from first principles, starting with how people naturally think about and work with AI.

It’s time to reimagine how to build things with AI from the ground up.

Context

We can start with the basics — what we can't change: the fundamental API of large language models. At the end of the day, until the large AI research labs give us something new, we're going to be working with models that take text input and produce text output. This is our base layer - our assembly language. All existing LLM frameworks today are built on top of context manipulation, providing the right context to make these calls.

What is context in an AI application? Not just prompt prefixes or vector lookups. Context represents the living, breathing information state that drives intelligence. Every piece of information that matters for the next decision, every nuance that shapes understanding, every detail that influences behavior - that's context.

Context, not prompt engineering

You might recall some buzz about "prompt engineering" - the art of crafting the perfect instructions. What we’ve learned is that as models have matured and stabilized, fiddling with exact prompt wording has become less critical. What matters more is prompt parameterization: are we giving the model the right information to work with? Even the best prompt can't help if the model lacks crucial context, yet simple prompts are good enough when the model has the right context to draw from — they can always be tuned later. This is our first critical insight.

When an AI application fails, it usually isn’t because the model is incapable of the task — and this will only become less of a concern as models keep improving. Rather, it's often due to a fault in the provided context: working with outdated information, lacking important details, or missing relevant background. The intelligence is there; we just need to feed it the right material.

[image: show comparison of prompt engineering vs. context inputs/outputs]

This shift is significant. It's about context management - ensuring the model has the right information at the right time to perform its task. When context becomes your foundation, things become a lot clearer. Every operation either builds context, transforms it, filters it, or uses it.

You can chain these context operations together naturally: filter by recency, augment with related information, limit to the most relevant pieces. Each operation is simple and clear, but together they create sophisticated context manipulation that drives intelligent behavior.
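
As a sketch of what that chaining could look like - the functions and the `ContextItem` type here are invented for illustration, not taken from any framework:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class ContextItem:
    text: str
    created_at: datetime
    relevance: float  # assume some upstream relevance scoring

def filter_by_recency(items, days=7):
    cutoff = datetime.now() - timedelta(days=days)
    return [i for i in items if i.created_at >= cutoff]

def augment_with_related(items, lookup):
    # `lookup` is any function that returns related items for a given item
    return items + [r for i in items for r in lookup(i)]

def limit_to_most_relevant(items, k=5):
    return sorted(items, key=lambda i: i.relevance, reverse=True)[:k]

def build_context(items, lookup):
    # Each step is simple; the composition is what produces useful context.
    items = filter_by_recency(items)
    items = augment_with_related(items, lookup)
    items = limit_to_most_relevant(items)
    return "\n\n".join(i.text for i in items)
```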

Context explains Memory

"I wish AI would just remember things." This is probably the most common frustration with current AI assistants. You have a long conversation with ChatGPT or Claude, teaching it about your preferences or project, only to have that understanding vanish when you start a new chat. The dream of AI that learns and adapts feels just out of reach.

There are many techniques that are available today to augment AI with memory. Knowledge bases, vector stores, conversation buffers, query memories - frameworks like LangChain offer numerous "memory" types. But these are implementation details, not design patterns. They don't help us think clearly about what memory actually means for AI systems.

Inspect further, and you'll see that memory is really just another form of context management. When we say we want AI to "remember" something, what we actually want is for relevant information to be available when needed. More than simply storing facts, the AI system needs to integrate the persisted data with the code structures that implement its worldview.
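
In code, the essence is small. A hypothetical sketch, with a deliberately naive keyword-overlap recall standing in for whatever retrieval a real application would use:

```python
class Memory:
    """'Remembering' = persisting facts; 'recalling' = surfacing the relevant ones into context."""

    def __init__(self):
        self.facts: list[str] = []

    def remember(self, fact: str) -> None:
        self.facts.append(fact)

    def recall(self, query: str, k: int = 3) -> list[str]:
        words = set(query.lower().split())
        scored = [(len(words & set(f.lower().split())), f) for f in self.facts]
        return [f for score, f in sorted(scored, reverse=True)[:k] if score > 0]

memory = Memory()
memory.remember("The user prefers TypeScript over JavaScript.")
memory.remember("Project Alpha ships on March 3.")

# Before the next model call, fold whatever is relevant back into the context:
context = "\n".join(memory.recall("When does Project Alpha ship?"))
```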

[diagram: show how LangChain's ConversationBufferMemory obscures the actual memory/context design]

A coding assistant needs to store project structure and connect it to the model's understanding of code architecture, design patterns, and best practices. A research aide needs to archive documents and surface the relationships and insights that align with its understanding of academic knowledge and reasoning. A personal assistant needs to keep track of user activity and preferences and correlate them with how the user actually behaves and decides.

This context-first view explains why good one-size-fits-all memory solutions don't yet exist. Different applications need different types of "memory" because they need to treat context very differently. Each requires its own patterns for storing, updating, and connecting information.

Context explains RAG

If you've been following AI development, you've probably heard of RAG (Retrieval-Augmented Generation). It's become somewhat of a buzzword; it’s not uncommon to see people rushing to set up vector databases and knowledge graphs for AI projects before they know what they’re doing. Underneath the hype, RAG is actually the most intuitive thing in the world — it's just about giving AI relevant context when and where it needs it.

You can think of RAG like building a “context lens” - something that focuses and amplifies relevant bits of information for AI calls. We often overcomplicate this because we're thinking like humans rather than LLMs, assuming we need to carefully format and structure everything. But these models actually tolerate messy, poorly formatted data quite well. They'll often surprise you by amplifying coherence in amazing ways when given the right context.

[diagram: show a RAG pipeline]

This realization frees us from worrying about implementation details too early. How that context eventually gets mapped into messages or API calls is an engineering concern. The intelligence design should focus on what context we need and when we need it.
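
Reduced to a sketch, the whole "context lens" is a few lines. The `search` and `llm` arguments are placeholders for whatever index and model call you already have:

```python
def answer_with_context(question: str, search, llm) -> str:
    passages = search(question, k=5)       # retrieval: focus the lens on what's relevant
    context = "\n\n".join(passages)        # augmentation: assemble the context
    prompt = (
        "Use the following passages to answer the question.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
    return llm(prompt)                     # generation: one ordinary LLM call
```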

Context explains UI

This context-oriented perspective helps us understand the role of components in AI applications. Take user interfaces - they're not just about capturing explicit inputs and interactions. Now, they also double as context collection points, constantly refining the AI's understanding in real-time. Every user action, every document change, every system event becomes a potential context source. This is what people miss when they dismiss something as "just another GPT wrapper." The magic isn’t in hiding away prompts, but in orchestrating the nuanced capture of activity and constructing context in the background.

[animation: show AI apps like cursor tab]

Look at Cursor, for example. What makes it feel magical isn't just that it calls GPT-4. It's that it's constantly tracking your cursor movement, your file navigation, your typing patterns. When you hit tab to complete code and jump around, it’s leveraging rich context about your project, your recent edits, your coding style. The fluid experience comes from smart context management and good engineering, not clever prompts.
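
A toy sketch of the idea (the class and event names are invented for illustration): the interface quietly accumulates activity, and every model call gets to read it.

```python
from collections import deque

class ActivityContext:
    def __init__(self, max_events: int = 50):
        self.events = deque(maxlen=max_events)  # keep only recent activity

    def record(self, kind: str, detail: str) -> None:
        self.events.append(f"[{kind}] {detail}")

    def as_text(self) -> str:
        return "\n".join(self.events)

activity = ActivityContext()
activity.record("edit", "renamed parse_config -> load_config in config.py")
activity.record("open", "viewed tests/test_config.py")
activity.record("cursor", "idle at line 42 of config.py")

# The next completion request can include activity.as_text() as background context.
```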

Decisions

Beyond context management, there's another fundamental primitive we get from modern AI APIs: tool calling. When you make an LLM call, you can provide a description of available tools - essentially asking the AI to choose from a set of possible actions. The API enforces this through JSON schemas, giving us guaranteed structure in how AI makes these choices.

[image: show tool calling]
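
For reference, this is roughly what the primitive looks like with the OpenAI Python SDK; `search_documents` is a made-up tool used only to show the shape of the call.

```python
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "search_documents",
        "description": "Search the user's documents for passages relevant to a query.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What did my notes say about retrieval?"}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:  # the model "decided" an action is appropriate
    call = message.tool_calls[0]
    print(call.function.name, call.function.arguments)  # arguments arrive as a JSON string
```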

This capability has led to the concept of "AI agents" - systems that can use tools to accomplish tasks. But current frameworks make this surprisingly complex. Look at any LangGraph implementation and you'll see a maze of tool nodes, conditional edges, and state management - all just to let AI make and execute choices.

The complexity comes from focusing on the wrong abstraction. We're building infrastructure around tools when what we really want is decisions. When AI uses a tool, what's actually happening? It's making a choice - deciding that some action is appropriate given its current context. The tool calling mechanism is just an implementation detail - a way to make that decision executable.

Decisions over tool-calling

This realization points to another fundamental principle of intelligence design: decisions are more natural building blocks than tools. Think about how you describe what you want an AI system to do: "It should decide when to search for more information." "It needs to choose between summarizing or asking for clarification." "It should determine if the answer is complete."

The questions that matter in design become clearer:

  • What decisions does the system need to make?

  • When should these decisions happen?

  • What context should inform each choice?

  • What options should be available?

  • What reasoning, if any, should be captured?

This matches how we naturally think about intelligence. We don't think in terms of tool registration and routing logic - we think about decision points and their implications.

In a decision-first design, you map out those decision points: "Should we search for more information here? Should we summarize this content? Should we ask for clarification?" Each decision point becomes clear, contextual, purposeful. The technical details of how those decisions get executed through tool calls become secondary to the intelligence design.
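
One hypothetical way to write this down (the `Decision` type and the compilation step are illustrative, not an existing library): declare the decisions, and let plumbing turn them into tool schemas.

```python
from dataclasses import dataclass

@dataclass
class Decision:
    question: str            # what is being decided
    options: list[str]       # what choices are available
    context_keys: list[str]  # what context should inform the choice

decisions = [
    Decision("Do we need more information?", ["search", "proceed"], ["query", "results_so_far"]),
    Decision("How should we respond?", ["summarize", "ask_clarification"], ["conversation"]),
]

def to_tool_schema(d: Decision) -> dict:
    # Each decision compiles down to a constrained choice for the model.
    return {
        "type": "function",
        "function": {
            "name": "decide",
            "description": d.question,
            "parameters": {
                "type": "object",
                "properties": {"choice": {"type": "string", "enum": d.options}},
                "required": ["choice"],
            },
        },
    }
```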

Understanding Agents

Ask five developers what an "agent" is, and you might get five different answers - each focusing on different implementation details like tool usage, memory systems, or state management. Why? Because we're trying to define agents by their implementation details instead of their decision patterns.

In current frameworks, agents are complex technical constructs: typically a state machine that manages tools, handles memory, routes responses, and coordinates actions. No wonder people are confused! We've buried the essential nature of agency — the ability to make and execute decisions - under layers of technical complexity.

Strip away the hype, and an agent is simply a decision loop: observe, decide, act. The AI receives context, makes a decision, and sometimes executes that decision through tool use. That's it. The implementation details - whether it uses tools, how it manages state, what framework it uses - these become secondary to the core intelligence design.

[image: show a basic decision loop]
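
In code, the loop really is that small. A minimal sketch, assuming hypothetical `llm_decide` and `execute` helpers for the model call and the tool execution:

```python
def run_agent(goal: str, llm_decide, execute, max_steps: int = 10) -> str:
    context = f"Goal: {goal}"
    for _ in range(max_steps):
        decision = llm_decide(context)                               # observe context, choose an action
        if decision["action"] == "finish":
            return decision["answer"]
        result = execute(decision["action"], decision.get("input"))  # act
        context += f"\n{decision['action']} -> {result}"             # feed the result back
    return "Stopped after max_steps without finishing."
```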

Take the classic ReAct pattern: the AI reasons about its situation, chooses an action, observes the result, and repeats. This isn't magical - it's just a structured way to make decisions with feedback. Even "multi-agent systems," despite their complexity, are just multiple decision loops interacting. The sophisticated-looking frameworks and architectures don't actually give you more control over the intelligence design - they just add complexity to this basic pattern.

[code: ReAct loop example]

Decision-based Architecture

A simpler architecture emerges when you focus on decisions:

[DIAGRAM: contextualize all points in an agent flow]

Decision Points:

  • What context is needed

  • What choices are available

  • What outcomes are possible

Execution:

  • How decisions become actions

  • How results feed back

  • How to handle failures

State:

  • What context to maintain

  • What history matters

  • How results affect future decisions

This isn't about replacing existing patterns - tools, chains, and routing all serve a purpose at the implementation level. But by thinking in decisions first, we can design AI systems that match how we actually want them to behave.

Experimental: Grammars-of-Thought

Building on decisions as a foundation enables exciting explorations that are hard to even conceive of within the industry’s current frame of chains, agents, and tools.

Decision patterns naturally evolve into a “grammar” for intelligence design. Like any language, these patterns have rules that govern how thoughts combine and flow. The discovery opens up entirely new ways to think about AI systems.

Consider how humans make complex decisions. We don't follow rigid flowcharts - we use flexible patterns of thinking. Sometimes we reason step-by-step (chain-of-thought). Other times we plan then execute. Sometimes we react to new information and adapt. These strategies are cognitive grammar rules, patterns that guide how decisions flow and combine.

[Diagram: Show three parallel decision patterns:

  • Chain-of-thought: A->B->C (linear reasoning)

  • Plan-and-execute: Plan(A,B,C)->Execute

  • React: Observe->Think->Act->Loop]

Current agent frameworks struggle to capture this fluidity. Their rigid graphs of nodes and edges try to map every possible path, missing the fundamental strength of language models: their ability to understand and apply patterns with flexibility and nuance.

What if instead of hardcoding decision trees, we gave our AI systems grammars of thought?

Here's how it might work. Rather than specifying exact sequences, a grammar-based approach defines the rules of thought:
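
One illustrative way to write such rules down (a sketch, not an established notation) is as plain data: each pattern names the valid ways it can unfold.

```python
GRAMMAR = {
    "Task":    [["Plan", "Execute"],                   # plan then execute...
                ["React"]],                            # ...or react as you go
    "Plan":    [["think"]],
    "Execute": [["act"], ["act", "Execute"]],          # one or more actions
    "React":   [["observe", "think", "act", "React"],  # keep looping...
                ["observe", "think", "finish"]],       # ...until done
}

def is_valid_expansion(symbol: str, chosen: list[str]) -> bool:
    """A runtime only ever lets the model pick one of the listed expansions."""
    return chosen in GRAMMAR.get(symbol, [[symbol]])
```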


Such notation reshapes how we design agent behavior. The system generates valid "sentences" in this decision language, with each pattern rule becoming a composable unit of thought. Want sophisticated behavior? Simply combine grammar rules - perhaps an agent that plans broadly, then uses reactive patterns to handle details.

The magic happens when you translate these grammars into executable agent orchestrations. The system optimizes these decision patterns, pruning redundant steps while preserving the grammar's guarantees. No invalid sequences can emerge, yet the AI retains freedom to operate creatively within these constraints — controllable and expressive. It's like having guardrails for intelligence that don't restrict its power.

[Diagram: Show grammar rules compiling into optimized decision flow, with unnecessary steps eliminated and parallel paths merged]

This approach resolves a core tension in agent design: the balance between control and flexibility. Where traditional frameworks force a choice between rigid graphs or unpredictable behavior, grammar-based decision patterns give you both - the AI can be creative within well-defined patterns of thought.

Metaprogramming

This grammar-based approach scales elegantly with complexity. While traditional agent frameworks buckle under expanding decision trees - drowning in states, edges, and failure points - grammars thrive through composition. Like natural languages expressing infinite ideas through finite rules, adding new capabilities simply means introducing new grammar rules. The underlying patterns remain clean and manageable.

An exciting possibility this unlocks is practical AI metaprogramming. When decision patterns at this level become manipulable as data, AI systems gain the ability to transform and optimize their own behavior patterns. Agents can learn new grammar rules from trial and error, refine their decision patterns through use, and discover novel combinations of existing patterns. The grammar serves simultaneously as code and data, defining a natural path to recursive self-improvement.
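
Continuing the earlier sketch, and still purely illustrative: because the grammar is plain data, adding a learned rule is just an edit to that data.

```python
grammar = {
    "React": [["observe", "think", "act", "React"],
              ["observe", "think", "finish"]],
}

def add_rule(grammar: dict, symbol: str, production: list[str]) -> dict:
    """Return a copy of the grammar with one new production added."""
    updated = {k: [list(p) for p in v] for k, v in grammar.items()}
    updated.setdefault(symbol, []).append(production)
    return updated

# e.g. a shortcut discovered through trial and error: when context is still
# fresh, it's fine to act again without re-observing.
improved = add_rule(grammar, "React", ["think", "act", "React"])
```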

[image: AI modifying its own patterns — A → A’]

With the proper feedback mechanisms, AIs can write and refine their own playbooks. Where today's agents rely on predetermined state-machine nodes and edges, a grammar-based agent composes and evolves decision patterns optimized for its tasks. In this framework, we can lean on the AI itself to discover efficient patterns, combine rules in unexpected ways, and generate novel grammar productions we haven’t thought of.

We're just beginning to explore this territory, but the implications are profound. Just as programming languages evolved from machine code to high-level abstractions, intelligence design could evolve from tool-calling to grammars of thought. The future of AI development might not be about writing decision trees, but about designing cognitive grammars that shape how artificial minds think.

What most AI frameworks get wrong

Here's a counterintuitive truth I've discovered: making intelligence patterns more explicit actually makes AI development more accessible, not less. This sounds wrong until you really think about it. We're learning that with AI, the complexity you need to focus on isn't in the implementation - it's in the intelligence design itself. When we obscure important aspects of that design with abstractions intended to simplify implementation, we make everything harder.

Everyone naturally understands how context and decisions work in their own thinking. When you ask someone how they research a topic, they don't start with vector databases and embedding pipelines. They describe gathering relevant information, deciding what matters, making connections, synthesizing understanding. These are the real primitives of intelligence - and they're intuitive to everyone.

People complain about the abstractions in current frameworks like LangChain, but the problem isn't that there are too many of them - it's that they hide the wrong things. They expose implementation details while hiding the patterns that actually matter: context flows, decision points, intelligence composition. It’s time we realized that AI apps aren’t just normal apps with AI features sprinkled here and there — we need a new paradigm for how we design these applications.

Visual tools for AI

There's no shortage of visual tools for building stuff with AI today. However, upon inspection you'll see they're essentially visual programming tools with AI components - nodes and edges representing code operations. But intelligence design needs its own visual grammar, one that matches how we think about and discuss AI systems.

[image: collage compilation of various AI tools]

Every visual tool encodes a language through its metaphors and patterns. Spreadsheets use cells and formulas to express data relationships. CAD software uses geometric primitives to express physical design. AI, too, will need its own visual language - one that can bridge the gap between whiteboard discussions and working implementations.

Traditional frameworks force us to express AI systems through code - chains, agents, tools. But when you watch people design AI systems, they naturally draw flows of information, decision points, feedback loops. This disconnect between how we think about AI and how we implement it isn't just inconvenient - it's blocking innovation.

The visual tools we have today show us implementation details - arrows connecting nodes that represent code operations, memory buffers, state transitions. But what if our visual language focused on intelligence patterns instead? What if we could see context flowing through our system, watch decision points shape that flow, track how information transforms and combines?

Yes, pretty diagrams make things easier. But it's more about having a visual grammar that matches how we think about intelligence. When you can see context building up, when you can trace decision flows, when you can watch information transform - you start to spot patterns you'd miss in code. You see opportunities for composition that implementation details obscure.

Intelligence Design is Visual

This visual approach to intelligence design points to a broader shift needed in AI development. The frameworks we have today served their purpose - they helped us understand what's possible with LLMs and gave us initial patterns to build from. But continuing down this path means accepting their limitations, their complexities, their fundamental mismatch with how we think about intelligence.

We need to:

  • Make intelligence patterns our primary building blocks

  • Design tools that match natural cognitive models

  • Let implementation details serve the design, not dictate it

The future of AI development isn't about more complex frameworks or clever abstractions. It's about working with the right primitives - context, decisions, flows. These aren't just implementation patterns - they're the building blocks of intelligence design itself.

What we’re building at Idyllic

At Idyllic, we're exploring how to make these ideas concrete. We believe the future of AI development needs new primitives that match how we naturally think about intelligence. While visual tools are part of the solution, the deeper challenge is creating the right abstractions - ones that let us express intelligence flows as naturally as we imagine them.

We're starting with the fundamentals: rethinking how context flows through systems, how decisions shape that flow, and how intelligence emerges from their composition. Our early experiments suggest that when you get these primitives right, building AI applications becomes more intuitive, more powerful, and more accessible to everyone who has ideas about how intelligence should work.

Intelligence design is emerging as its own discipline, distinct from traditional software development. It needs its own tools, patterns, and ways of thinking. This is what we're researching at Idyllic - starting with a visual grammar that matches how we naturally reason about intelligence.

Our early work compiles to existing frameworks, but that's just the beginning. We're creating a platform where intelligence design becomes concrete, where visual patterns become working systems, where ideas flow naturally from conception to implementation.

If you're interested in shaping how we build with AI, visit https://idyllic.so. Whether you're a developer frustrated with current tools, a designer who sees AI's potential, or someone who believes AI development can be more intuitive - we'd love to hear from you.

Subscribe to updates