The Best Agent Harness for Algorithmic Trading
Why my trading bot generator BEATS Claude Code and Cursor
If you’re sick of LinkedIn Techfluencers posting about the latest “innovation” in AI (which is really a markdown file bolted onto an API), you might be annoyed to hear there’s a new buzzword in AI: the agent harness. Why does THIS matter?
For real-world AI agents, harnesses are critical.
There’s a reason why you can use one model in the Chat UI and the same model in Cursor and get vastly different results. Agent harnesses are the glue that transforms large language models into living, intelligent beings capable of operating autonomously.
More concretely, there’s a reason why Claude, ChatGPT, Cursor, and OpenClaw fail to outperform NexusTrade when it comes to developing algorithmic trading strategies. It’s the harness.
Listen. I know a lot about AI agents. So much so that OpenAI literally gave me a unique physical “token” representing the 10 billion tokens I’ve processed through their API.
A “token of appreciation” from OpenAI. It says “Austin Starks — Honored for passing 10 billion tokens”

Let me show you what an award-winning agent harness looks like.
First, ask ChatGPT to backtest a trading strategy
Before I get into the article, I want to demonstrate why this matters at all.
Go to ChatGPT, or Claude, or Gemini, or whatever is the best model at the time you’re reading this, and ask it to do something every trader wants to do…
Backtest a trading strategy.
A lecture from ChatGPT about how it would backtest this strategy if it could (which it can’t), in response to the prompt “backtest an option trading strategy that buys and holds MAG7 calls”
The mainstream models can’t do it. Not Claude Code, Cursor, or Gemini. It’s not because they’re too “dumb”, but because they literally don’t have the ability.
They don’t have the licenses to access the terabytes of data. They don’t maintain the infra to run a backtest for you. And remembering to account for split adjustments, dividends, anomalous data events, and writing highly scalable data access patterns is, quite frankly, too complicated for OpenClaw to build out for you in a reliable, secure way.
You need something custom-built.
The custom-built glue around the agent is called the harness. Simply put, an agent harness is the infrastructure layer that wraps around an AI agent to manage its execution lifecycle. This includes:
- The list of tools the agent has access to
- How the agent manages its context window across long conversations
- How the agent handles errors and retries
- What models the agent should use for specific tasks
- Spawning subagents
- Observability, monitoring, and alerting
In one sentence, it’s the opinionated software stack that gives a large language model its agentic capabilities. Here’s an example of a production-ready agent harness for developing algorithmic trading strategies.
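To make that list concrete, here is a minimal, hypothetical sketch of a harness skeleton in Python. The class, the tool names, and the retry policy are invented for illustration; this is not NexusTrade’s actual code.

```python
from typing import Any, Callable

class Harness:
    """Illustrative harness skeleton: tool registry, retries, and an event log."""

    def __init__(self, max_retries: int = 3):
        self.tools: dict[str, Callable[..., Any]] = {}
        self.max_retries = max_retries
        self.events: list[dict] = []  # observability: every action gets recorded

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self.tools[name] = fn

    def call_tool(self, name: str, **kwargs: Any) -> Any:
        """Dispatch a tool call, retrying on failure and logging each attempt."""
        last_error = None
        for attempt in range(1, self.max_retries + 1):
            try:
                result = self.tools[name](**kwargs)
                self.events.append({"tool": name, "attempt": attempt, "ok": True})
                return result
            except Exception as err:  # real code would narrow the exception types
                last_error = err
                self.events.append({"tool": name, "attempt": attempt, "ok": False})
        raise RuntimeError(f"{name} failed after {self.max_retries} tries") from last_error

harness = Harness()
harness.register("run_backtest", lambda symbol: {"symbol": symbol, "status": "done"})
print(harness.call_tool("run_backtest", symbol="AAPL")["status"])  # done
```

Everything the agent does flows through `call_tool`, which is what makes the error handling, retry limits, and event log a property of the harness rather than the model.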
Orchestrating USEFUL tools for automated trading
An AI agent’s tools are what separates the degenerates shitposting on twitter from the quants performing real research. They’re important. Pay attention.
A tool is simply a way for an agent to interact with an external service. It can be as simple as an API call or a CLI command. Tools are critical for letting the agent observe its environment.
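In OpenAI-style function calling, for instance, a tool is just a function plus a JSON schema that the model can see. The tool name and fields below are hypothetical:

```python
import json

# Hypothetical tool: the function the harness executes, plus the schema the model sees.
def get_option_chain(ticker: str, expiry: str) -> dict:
    """Stubbed external call; a real tool would hit a data service here."""
    return {"ticker": ticker, "expiry": expiry, "calls": [], "puts": []}

TOOL_SPEC = {
    "name": "get_option_chain",
    "description": "Fetch the options chain for a ticker and expiry date.",
    "parameters": {
        "type": "object",
        "properties": {
            "ticker": {"type": "string"},
            "expiry": {"type": "string", "description": "YYYY-MM-DD"},
        },
        "required": ["ticker", "expiry"],
    },
}

# The agent emits a tool call as JSON; the harness parses it and dispatches.
call = json.loads('{"name": "get_option_chain", "arguments": {"ticker": "AAPL", "expiry": "2025-01-17"}}')
result = get_option_chain(**call["arguments"])
print(result["ticker"])  # AAPL
```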
A good tool isn’t a generic add-on. Searching the web and browsing twitter? That’s rookie shit. Anybody can build an agent that does that.
No. A good tool is purpose-built and well thought out. It’s things like querying a DuckDB database for historical options chains because your benchmarks showed it outperforming BigQuery and Postgres 10 to 1. It’s an API call to a backtesting engine that has already loaded hundreds of gigabytes of historical data onto disk, so that when you launch a backtest, the system knows how to iterate through the millions of datapoints thanks to your cleverly designed mmap iteration system.
It’s NOT a random API call.
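As a rough sketch of such a purpose-built query tool (shown with stdlib sqlite3 so the snippet runs anywhere; DuckDB’s Python client follows the same execute/fetch shape, and the table and data here are invented):

```python
import sqlite3

# In-memory database standing in for the real historical options store.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE option_chains (
        ticker TEXT, expiry TEXT, strike REAL, kind TEXT, close REAL
    )
""")
con.executemany(
    "INSERT INTO option_chains VALUES (?, ?, ?, ?, ?)",
    [
        ("AAPL", "2025-01-17", 180, "call", 12.4),
        ("AAPL", "2025-01-17", 200, "call", 4.1),
        ("AAPL", "2025-01-17", 180, "put", 2.9),
    ],
)

# The purpose-built tool: a parameterized, read-only query the agent can call.
def query_call_chain(ticker: str, expiry: str) -> list[tuple]:
    return con.execute(
        "SELECT strike, close FROM option_chains "
        "WHERE ticker = ? AND expiry = ? AND kind = 'call' ORDER BY strike",
        (ticker, expiry),
    ).fetchall()

print(query_call_chain("AAPL", "2025-01-17"))  # [(180.0, 12.4), (200.0, 4.1)]
```

The key design choice is that the agent never writes raw SQL; it calls a narrow, parameterized function, which keeps the query safe and the access pattern fast.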
Beyond that, the harness should have a built-in system for managing memory, iteration limits, and deciding which models should be used. These trade-offs matter; some menial tasks might require a quick model while the true reasoning requires a powerhouse. Which model is needed, and when, is not innate knowledge for the AI; it’s something you have to teach it.
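Teaching the harness that routing can be as simple as a task-to-model table. Every model and task name below is made up for illustration:

```python
# Hypothetical routing table: the harness, not the model, decides which model runs a task.
ROUTES = {
    "summarize_history": "gpt-small",      # menial: cheap and fast
    "extract_parameters": "gpt-small",
    "design_strategy":   "gpt-reasoning",  # true reasoning: the powerhouse
    "debug_backtest":    "gpt-reasoning",
}

def route(task: str, default: str = "gpt-small") -> str:
    """Return the model the harness should use for this task."""
    return ROUTES.get(task, default)

print(route("design_strategy"))    # gpt-reasoning
print(route("summarize_history"))  # gpt-small
```

Defaulting unknown tasks to the cheap model keeps costs bounded; only tasks you have explicitly marked as hard get the expensive model.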
As a concrete example, Aurora is the AI agent that powers NexusTrade.
In addition to the basic tools like Web Search and Deep Research, Aurora has access to specialized tools that connect directly to NexusTrade. This includes:
- Searching for relevant news about ANY and ALL stock tickers… not just what’s happening now, but what HAS happened in the past
- The ability to take a sentence and generate a trading strategy configuration… completely eliminating the need for an error-prone programming language
- Running historical simulations and using those simulations to improve the strategy parameters
- Launching, halting, and updating real and simulated portfolios
- Explaining past orders, analyzing portfolios, and even generating images for my Medium readers to enjoy
Most importantly, adding a new tool is minutes of real work. You can ask OpenClaw to build a trading strategy, but can you get an accurate report of how good the strategy has been in the past?
Not without REAL, useful tools.
Observing the agent’s inner thought process
Nobody cares about “observability” until your agent burns $200/hour in an infinite loop. Then all of a sudden, it’s the sexiest, most important thing ever.
With a good agent harness, observability comes first.
An observability dashboard for this particular agent run, showing how many research tokens were used, how many models were called, and the execution time

When we launch an agent, we can inspect every single thing it does, down to the model it chose and the thought process behind each tool call. This makes finding bugs and performance bottlenecks trivial, and steering the agent toward our goal becomes straightforward. We KNOW why it decided to use the screener instead of the strategy generator. Even if we don’t agree, we can at least understand, right?
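A bare-bones version of that observability layer can be a decorator that records each step’s name, chosen model, and wall-clock time. The step and model names here are placeholders, and a real system would ship these events to a dashboard rather than a list:

```python
import time
from functools import wraps

TRACE: list[dict] = []  # in production this would feed a dashboard, not a list

def traced(step_name: str, model: str):
    """Record what ran, which model was chosen, and how long it took."""
    def wrap(fn):
        @wraps(fn)
        def inner(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            TRACE.append({
                "step": step_name,
                "model": model,
                "seconds": round(time.perf_counter() - start, 4),
            })
            return result
        return inner
    return wrap

@traced("screener_query", model="fast-model")  # hypothetical step and model names
def screen(query: str) -> list[str]:
    return ["AAPL", "MSFT"]  # stubbed result

screen("MAG7 calls with high volume")
print(TRACE[-1]["step"])  # screener_query
```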
The End Result
The end result of combining useful tools, observability code, summarization rules, and a ReAct agentic loop is a powerful AI agent that can perform tasks autonomously.
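A ReAct loop, stripped to its skeleton, alternates Thought/Action/Observation until the model produces a final answer. The sketch below scripts the model call so the snippet is self-contained and runnable; a real harness would call an LLM API instead:

```python
# Compressed ReAct loop: the model alternates Action -> Observation until it answers.

def llm(history: list[str]) -> str:
    """Scripted stand-in for a model call: acts once, then answers."""
    if not any(line.startswith("Observation:") for line in history):
        return "Action: run_backtest[buy-and-hold MAG7 calls]"
    return "Final Answer: strategy backtested"

def run_backtest(arg: str) -> str:
    """Stubbed tool; a real one would hit the backtesting engine."""
    return f"backtest of '{arg}' complete"

def react_loop(task: str, max_steps: int = 5) -> str:
    history = [f"Task: {task}"]
    for _ in range(max_steps):           # iteration limit lives in the harness
        step = llm(history)
        history.append(step)
        if step.startswith("Final Answer:"):
            return step.removeprefix("Final Answer: ").strip()
        if step.startswith("Action: run_backtest["):
            arg = step[len("Action: run_backtest["):-1]
            history.append(f"Observation: {run_backtest(arg)}")
    return "stopped: step limit reached"

print(react_loop("backtest an option strategy"))  # strategy backtested
```

Note that the step limit is enforced by the loop, not the model: this is exactly the kind of guardrail that stops a runaway agent from burning money.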
Using NexusTrade’s Aurora to build and backtest an options trading strategy

Let me walk through an example.
Unlike ChatGPT, which gave us a long-winded answer about what it would do if it could, Aurora takes action. It immediately creates a plan to test several different trading strategies.
This harness gives the user maximum control. In semi-automated mode, the plan has to be approved before anything runs. If rejected, the AI revises it; if accepted, the agent gets right to work and starts executing. In this case, the plan called for creating a strategy, so it immediately created one simple portfolio.
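The approval gate in semi-automated mode reduces to a small loop: execute only on acceptance, revise on rejection. Everything below (the verdicts, the revision step) is scripted for illustration:

```python
# Sketch of a semi-automated approval gate: the plan only executes once accepted.
def run_semi_automated(plan: list[str], get_verdict, revise, max_rounds: int = 3) -> str:
    for _ in range(max_rounds):
        if get_verdict(plan) == "accept":
            return f"executing {len(plan)}-step plan"
        plan = revise(plan)  # the AI modifies the plan on rejection
    return "no approved plan"

verdicts = iter(["reject", "accept"])  # scripted user: reject once, then accept
result = run_semi_automated(
    ["create strategy", "backtest it"],
    get_verdict=lambda p: next(verdicts),
    revise=lambda p: p + ["tighten risk limits"],  # hypothetical revision
)
print(result)  # executing 3-step plan
```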
Aurora created a trading strategy, which can be used for simulations and real-time trading

It then tested this portfolio across multiple time periods. It saw extreme volatility: insane gains and devastating losses.
Two portfolios displayed by the AI: one with a 99.4% drop, the other with a 264% return and a 95% maximum drawdown. They’re insanely risky

Get this. Not only does the agent recognize how risky this strategy is, it decides to launch specialized subagents to fix it. Three different AI agents work in parallel to create a better version of our trading strategy.
Launching subagents to explore different option trading configurations

These subagents are even more purpose-built for options trading. They have explicit instructions and hints to use tools such as the AI Stock Screener, which can analyze options chains and help inform our trading strategy.
One of the subagents generated a chart to help us visualize Apple’s options chain. This can help inform the rules for the trading strategy

In the end, each subagent developed its own unique strategies and evaluated them, and the parent agent can then examine each subagent’s work and pull insights from every run.
After the subagents finish, the agent explicitly reads all of their results and injects key insights into the session

This happens both explicitly and implicitly. Explicitly, we can see in the UI how the parent reads each subagent’s results for a given run. Implicitly, as more and more agents run and accumulate experience, the backend builds a mapping of strategy ideas to performance metrics. Then, when we try to create similar strategies in the future, the backend auto-injects winning ideas, making iteration faster and steering it toward proven solutions.
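That implicit memory can be sketched as a simple mapping from strategy ideas to observed returns, with the best performers surfaced for injection into future prompts. All the idea names and numbers below are invented:

```python
from collections import defaultdict

# idea -> list of observed backtest returns (percent); all data here is made up
experience: dict[str, list[float]] = defaultdict(list)

def record_run(idea: str, pct_return: float) -> None:
    """Store the outcome of one agent run against its strategy idea."""
    experience[idea].append(pct_return)

def winning_ideas(min_avg_return: float = 10.0) -> list[str]:
    """Ideas whose average historical return clears the bar, best first."""
    avgs = {idea: sum(r) / len(r) for idea, r in experience.items()}
    return sorted((i for i, a in avgs.items() if a >= min_avg_return),
                  key=lambda i: -avgs[i])

record_run("covered calls on MAG7", 14.0)
record_run("covered calls on MAG7", 18.0)
record_run("naked puts, weekly expiry", -30.0)
print(winning_ideas())  # ['covered calls on MAG7']
```

At prompt-construction time, the harness would prepend `winning_ideas()` to the agent’s context, which is what makes each new run start closer to a proven solution.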
This, in essence, creates a network effect: the more agents run, the better the system gets. The better it gets, the more people decide to use it and explore other ideas. This compounds into a flywheel that everyone who uses NexusTrade benefits from.
The end result of this process is a set of trading strategies. We know exactly how they’ve fared in the past and can deploy them for real-time paper-trading.