Practical Insights on AI Agent Frameworks and Orchestration

By Xiaoyi Zhu

"Environment is more important than experience." — Luo Fuli

I recently listened to a long-form Chinese podcast featuring Luo Fuli, who was leading large-model work at Xiaomi at the time. She talked through her hands-on experience with OpenClaw and what it suggests about where AI engineering may be heading.

I do not want to pretend this post is a definitive framework. It is more like a set of notes I wrote for myself after the episode. The field is moving faster than I can comfortably keep up with, and writing is one way I force myself to slow down, understand what other people are seeing, and update how I work.

The conversation covered a lot, but the parts that stayed with me were practical: orchestration matters, memory matters, cost matters, and the surrounding system may matter as much as the model inside it.

The agent that changed how she worked

Luo’s first encounter with OpenClaw happened late one night during Lunar New Year. She installed it out of curiosity and ended up interacting with it until dawn. It was not flawless, but it felt unusually coherent. Within a few days, her relationship with the tool had shifted:

  • Day 1: The framework paid attention to contextual cues, like quietly appending the current time to each turn. That small detail made the conversation feel grounded.
  • Day 2: She started using it for parts of her daily work: team management questions, evaluating hiring candidates, and thinking through strategic problems. She began referring to it as a “digital clone.”
  • Day 3: It advanced her research by building a User Agent to simulate multi-turn interactions for post-training data generation.

What stood out to her was not only the underlying model. It was how the framework coordinated multiple models. It could route vision, reasoning, and code tasks to whichever model handled each part best, which helped compensate for individual weaknesses. She called the design “very, very clever” because it was built on models that were not necessarily at the frontier. The orchestration layer let a 3B-parameter model perform well beyond what I would have expected from the model alone.
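The routing idea she describes can be sketched minimally. Everything below is illustrative: the model names, task categories, and keyword classifier are my own assumptions, not OpenClaw's actual design, which the podcast did not detail.

```python
# Minimal sketch of capability-based model routing.
# Model names and task labels are hypothetical, for illustration only.

ROUTES = {
    "vision": "small-vision-model",
    "reasoning": "mid-tier-reasoner",
    "code": "code-specialist-3b",
}
DEFAULT_MODEL = "general-3b"

def classify(task: str) -> str:
    """Crude keyword-based task classifier; a real system might use a model."""
    keywords = {
        "vision": ("image", "screenshot", "diagram"),
        "code": ("function", "bug", "refactor"),
        "reasoning": ("plan", "analyze", "compare"),
    }
    lowered = task.lower()
    for category, words in keywords.items():
        if any(w in lowered for w in words):
            return category
    return "general"

def route(task: str) -> str:
    """Pick whichever model handles this kind of task best."""
    return ROUTES.get(classify(task), DEFAULT_MODEL)

print(route("Refactor this function to remove the bug"))  # code-specialist-3b
print(route("Summarize this screenshot"))                 # small-vision-model
```

The point is not the toy classifier; it is that the dispatch table, not any single model, determines what the user experiences.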

That point connects strongly with what I have been learning from building my own agent harnesses. I used to think mostly in terms of model capability. Now I am trying to think more in terms of the environment around the model: what context it sees, what tools it can use, how work is routed, and where the system stops it from drifting.

Memory and context as the quiet differentiator

Luo emphasized that an agent framework’s effectiveness depends heavily on how it manages context. OpenClaw’s memory system goes beyond a linear chat history. It maintains persistent memory across sessions, uses skill folders to store reusable prompts and workflows, and injects environmental signals like the current time into conversations.
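The three mechanisms she mentions, persistent memory, reusable skills, and environmental signals, can be sketched in a few lines. The file layout and field names here are my assumptions for illustration, not OpenClaw's actual format.

```python
# Sketch of cross-session memory plus environmental-signal injection.
# File name and memory schema are assumptions, not OpenClaw's real layout.
import json
from datetime import datetime
from pathlib import Path

MEMORY_FILE = Path("agent_memory.json")

def load_memory() -> dict:
    """Load persistent memory; unlike chat history, this survives restarts."""
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return {"conventions": [], "skills": {}}

def save_memory(memory: dict) -> None:
    MEMORY_FILE.write_text(json.dumps(memory, indent=2))

def build_turn(user_message: str, memory: dict) -> str:
    """Assemble one prompt turn: env signals + stored conventions + message."""
    lines = [
        f"[env] current time: {datetime.now().isoformat(timespec='minutes')}",
        *(f"[convention] {c}" for c in memory["conventions"]),
        user_message,
    ]
    return "\n".join(lines)

memory = load_memory()
memory["conventions"].append("Answers should cite file paths, not line numbers.")
save_memory(memory)
print(build_turn("Review this module.", memory))
```

Even this toy version shows why the detail from Day 1 mattered: the timestamp and conventions arrive with every turn, so the model never has to be reminded of them.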

None of this sounds flashy. But it is often the difference between a tool I experiment with once and a tool I start trusting for real work.

For developers evaluating agent frameworks, this feels like a useful question to ask early. Does the system retain context across sessions? Can I store project conventions and have them applied consistently? Can the agent remember the shape of my workflow, not just the last few messages in a chat? Without that, I suspect many systems hit a ceiling long before the model’s theoretical capability is reached.

Open source as a practical lever

Because OpenClaw is open source, Luo and her team could modify it directly. They sometimes used a top-tier model to redesign the architecture, then swapped in smaller and cheaper models for daily use. That flexibility let them experiment with memory structures, scheduling, and multiple agent configurations in ways that would be hard to test inside a closed product.

The broader lesson is simple but important to me. With a closed agent tool, most of my control comes from prompting. With an open source one, I can adjust the internal logic. That matters when the use case does not fit the standard coding assistant template.

The community effect also matters. Shared improvements can shorten feedback cycles, and in a field this young, faster learning loops are valuable. I do not think every tool needs to be open source to be useful. Claude Code is still a great product for individual work. But when I want to understand and reshape the agent loop itself, open source gives me a different kind of leverage.

The framework, not just the model, drives the outcome

Luo’s line, “Environment is more important than experience,” points to a shift in how skill develops in the current AI paradigm. Her view is that working effectively with agent systems can be learned quickly if someone is in an environment that iterates fast and maintains high standards.

That also changes how I think about model selection. She suggested that a mid-tier model paired with a well-designed agent framework can handle roughly 85% of tasks comparably to a frontier model. The remaining edge cases, like writing and debugging a custom CUDA kernel under real training-efficiency constraints, still need top-tier capability. But for a lot of automation and orchestration work, the framework can close much of the gap.

Her phrase that stuck with me was: the architecture becomes the product. The model is still important, of course. Better models make everything easier. But the product experience often comes from the whole system around the model: routing, memory, retrieval, cost control, verification, fallback behavior, and the way the agent fits into an existing workflow.

A practical corollary: when building with AI, the question is not only “which model should I use?” It is also “how should the system divide work, preserve context, recover from failure, and fit into the product that already exists?” I am still learning how to answer that well.

Cost, speed, and the replacement coefficient

Cost was a recurring theme. Luo offered a simple heuristic: if an API call costs 10 yuan but saves 1,000 yuan in labor, adoption is straightforward. If the cost approaches the value of the labor it replaces, the incentive disappears.

That is why efficiency matters at multiple layers. Model architecture matters. Agent design matters. Routing simple tasks to smaller models matters. All of them serve the same goal: keeping the system cheap enough that using it makes practical sense.

For developers, this means tracking not just accuracy but what she called the “replacement coefficient,” meaning how much human time the system actually saves per dollar spent. That ratio can matter more than benchmark scores when deciding whether a tool belongs in a real workflow.
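Her heuristic reduces to simple arithmetic, which is worth making explicit. The numbers below are illustrative; only the 10-yuan-versus-1,000-yuan example comes from the podcast.

```python
# Sketch of the "replacement coefficient" heuristic from the podcast:
# labor value saved per unit of API spend. Figures are illustrative.

def replacement_coefficient(api_cost: float, hours_saved: float,
                            hourly_rate: float) -> float:
    """Return human labor value saved per currency unit spent on the API."""
    return (hours_saved * hourly_rate) / api_cost

# Luo's example: a 10-yuan call replacing 1,000 yuan of labor.
good = replacement_coefficient(api_cost=10, hours_saved=10, hourly_rate=100)
print(good)  # 100.0 -- adoption is straightforward

# If the call costs nearly what the labor is worth, the incentive vanishes.
marginal = replacement_coefficient(api_cost=900, hours_saved=10, hourly_rate=100)
print(round(marginal, 2))  # 1.11
```

Tracking this ratio per workflow, rather than per benchmark, is what makes it a product metric instead of a research one.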

What I’m adjusting in my own work

The conversation did not hand me a checklist, but it did shift where I want to focus. A few adjustments I am making:

  • Pay more attention to the orchestration layer. Luo’s point that “the architecture becomes the product” feels increasingly true to me. I want to spend more time designing the overall orchestration before jumping into implementation: how a product or feature fits into the existing framework, what parts should be deterministic, where the agent should have freedom, and where the system needs guardrails. As frontier models get more powerful, implementation is becoming less of the hard part. The harder and more durable work is deciding the shape of the system so we can ship faster, maintain it more easily, and avoid building one off agent flows that are clever but hard to operate.
  • Treat memory as a requirement, not a feature. If an agent does not carry project conventions forward between sessions, it is hard to build trust around it. Persistent context feels foundational.
  • Evaluate open source frameworks for customizability. When a use case falls outside standard patterns, being able to adjust how an agent plans or retrieves information is a practical advantage. Polished closed tools are convenient, but they also lock me into someone else’s assumptions.
  • Track cost as a design constraint. I am starting to treat token usage and latency as important product metrics. If an automation speeds up a workflow but costs more in API fees than the time it saves, it is not really an improvement.

A snapshot, not a roadmap

Luo noted that she does not read many papers anymore and that some of her current views will likely evolve. She runs her own experiments and trusts iteration over theory. I found that refreshing. In a field where the baseline keeps moving, a lot of useful knowledge comes from people sharing what is working for them right now, even if the conclusions are temporary.

So I am treating this episode as a snapshot, not a roadmap. It captures one team’s working intuitions in early 2026. It does not answer every question, and I am sure some of the ideas will age quickly. But the practical thread feels worth remembering: agent frameworks are becoming the active layer, memory and context drive reliability, and cost decides whether adoption is real or just a demo.

The AI paradigm has already shifted. Writing this down is one small way I am trying to keep up.

Source: Luo Fuli: OpenClaw, Agent Frameworks — The AI Paradigm Has Already Changed Dramatically!