I want to be clear about what this article is and what it is not. This is not a "how to get rich with AI trading bots" piece. It is not a tutorial on which signals to trade or how to beat the market. It is an honest engineering post-mortem on a system we built internally, what broke, what we learned, and what the system actually does today, which is meaningfully different from what we originally designed it to do.

We built this because manual portfolio monitoring was taking an unreasonable amount of time and producing inconsistent decisions. It was not just my time: the cognitive overhead of tracking positions, watching for rebalancing signals, and responding to market anomalies was creating a constant background drain on focus. The goal was not to get rich faster. The goal was to free up attention by automating the monitoring and execution of a well-defined strategy, so that the strategy would be followed consistently regardless of whether I was in a meeting, asleep, or focused on client work.

What This Is

This is an internal DevThing lab project, built to sharpen our understanding of autonomous AI systems operating in high-stakes, real-time environments. The lessons here directly inform how we design AI agents for clients in finance, operations, and decision support. We publish it because honest engineering narratives are more useful than polished case studies.

The Problem We Were Solving

The specific pain point was this: a diversified portfolio across equities, ETFs, and fixed income requires periodic rebalancing, and that rebalancing should happen based on conditions, not calendars. Quarterly rebalancing on a fixed schedule ignores the actual state of the market. You want to rebalance when allocations drift beyond tolerance, when volatility regimes shift, or when specific signals indicate a change in your thesis. But monitoring for those conditions continuously, while running a consulting business, is not feasible as a manual process.
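As a rough illustration of condition-based rebalancing, the drift check can be sketched as below. The function name `drifted_assets`, the 5% tolerance, and the sample holdings are assumptions made for this example, not values from our system.

```python
def drifted_assets(values, targets, tolerance=0.05):
    """Return {asset: drift} for assets whose weight is off target by more than tolerance.

    values:  asset -> current market value
    targets: asset -> target weight (fractions that sum to 1)
    """
    total = sum(values.values())
    if total <= 0:
        raise ValueError("portfolio value must be positive")
    flagged = {}
    for asset, target in targets.items():
        weight = values.get(asset, 0.0) / total
        drift = weight - target  # positive means overweight
        if abs(drift) > tolerance:
            flagged[asset] = round(drift, 4)
    return flagged


# Hypothetical portfolio: equities have run up, fixed income has lagged.
holdings = {"equities": 70_000.0, "etfs": 20_000.0, "fixed_income": 10_000.0}
targets = {"equities": 0.60, "etfs": 0.20, "fixed_income": 0.20}
print(drifted_assets(holdings, targets))
```

A check like this runs on every price update rather than on a calendar, so a rebalance is only considered when an allocation has actually moved.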

The secondary problem was discipline. Manual trading introduces behavioral biases. You hold losers too long because you don't want to admit the loss. You take profits too early on winners because gains feel good. You panic in drawdowns. Every one of those behaviors costs money over time, and every one of them is eliminated when a system executes rules mechanically. The bot doesn't feel anything about the positions it holds. That is its primary advantage.

Architecture Decisions

The system has four layers. The data feed layer ingests market data from multiple providers: price feeds, options flow, volume, volatility surfaces, and macro indicators. We use multiple providers with cross-validation because a single data feed with an error will cause a model trained to trust it to execute bad trades with full confidence. Feed reliability is not optional.
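One simple way to cross-validate providers is to compare each quote against the median and refuse to publish a price when too few sources agree. This is a minimal sketch under that assumption; the provider names, the 0.5% deviation limit, and the two-source minimum are hypothetical.

```python
from statistics import median


def validate_quote(quotes, max_deviation=0.005, min_sources=2):
    """Cross-check one symbol's price across providers.

    quotes: provider name -> last price (assumed positive).
    Returns (consensus_price, suspect_providers). The consensus is None,
    and every provider is listed as suspect, when fewer than min_sources agree.
    """
    if len(quotes) < min_sources:
        return None, sorted(quotes)
    consensus = median(quotes.values())
    suspects = sorted(
        name
        for name, price in quotes.items()
        if abs(price - consensus) / consensus > max_deviation
    )
    if len(quotes) - len(suspects) < min_sources:
        return None, sorted(quotes)
    return consensus, suspects


# Hypothetical quotes: feed_c is reporting a bad tick.
price, suspects = validate_quote({"feed_a": 100.02, "feed_b": 100.00, "feed_c": 104.50})
print(price, suspects)
```

Returning no price at all when the sources disagree is deliberate: downstream layers can treat a missing price as a feed anomaly instead of trading on a guess.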

The signal generation layer is where most of the interesting engineering lives. We run a set of signal generators (trend-following, mean-reversion, volatility breakout, and a macro regime classifier) and combine their outputs through a weighting model. The weighting model is not static; it adjusts based on which signals have been predictive in the current market regime. This is one of the places we got burned early, which I will come to in a moment.
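To make the combination step concrete, here is a minimal sketch of a weighted blend of signal scores. It assumes each generator emits a score in [-1, 1] and uses fixed, equal weights; the names and numbers are illustrative and say nothing about how our actual weighting model is built.

```python
def combine_signals(signals, weights):
    """Blend signal scores into one composite score in [-1, 1].

    signals: signal name -> score in [-1, 1]
    weights: signal name -> non-negative weight (normalized here)
    """
    total_weight = sum(weights[name] for name in signals)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    score = sum(signals[name] * weights[name] for name in signals) / total_weight
    # Clamp so a bad upstream score cannot push the composite out of range.
    return max(-1.0, min(1.0, score))


# Hypothetical scores from the four generators, equally weighted.
signals = {"trend": 0.8, "mean_reversion": -0.4, "vol_breakout": 0.2, "macro_regime": 0.0}
weights = {name: 0.25 for name in signals}
print(round(combine_signals(signals, weights), 4))
```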

The risk management layer runs before every execution decision. It enforces position limits, maximum drawdown thresholds, correlation limits (so the portfolio does not become accidentally concentrated in a single macro bet expressed through different instruments), and, critically, circuit breakers that halt all new position-opening during specific market conditions. The circuit breakers were not in the original design. We added them after an incident that I will describe honestly.

The execution layer routes orders through broker APIs with time-in-force logic, slippage limits, and retry logic for failed executions. It also maintains a complete audit trail of every decision, the signals that drove it, and the execution outcome. This audit trail is the most valuable part of the entire system, not for compliance (this is an internal system) but for diagnosing what went wrong when something goes wrong.
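An audit trail of this kind can be as simple as an append-only file with one JSON object per decision. The record below is a sketch; the field names and the JSON Lines format are assumptions chosen for the example, not a description of our actual schema.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AuditRecord:
    symbol: str
    action: str          # e.g. "buy", "sell", "skip"
    signals: dict        # signal name -> score at decision time
    risk_checks: dict    # check name -> True if it passed
    outcome: str         # e.g. "filled", "partial", "rejected"
    fill_price: Optional[float] = None
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def to_json_line(record):
    """Serialize one record as a single JSON line."""
    return json.dumps(asdict(record), sort_keys=True)


def append_record(path, record):
    """Append the record to a JSON Lines file; existing lines are never rewritten."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(to_json_line(record) + "\n")


record = AuditRecord(
    symbol="SPY",
    action="buy",
    signals={"trend": 0.8, "mean_reversion": -0.4},
    risk_checks={"position_limit": True, "drawdown": True},
    outcome="filled",
    fill_price=450.25,
)
print(to_json_line(record))
```

Storing the signals and risk-check results alongside the outcome is what makes the log useful for diagnosis: each line answers both what happened and why the system thought it should.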

The Hard Lessons

Overfitting is not an abstract risk. It is a certainty if you are not actively fighting it. The first version of our signal weighting model was trained on three years of historical data and backtested beautifully. In live operation, it underperformed a simple index buy-and-hold by a meaningful margin for the first four months. When we dug into why, we found that the model had learned to be very good at explaining the historical data it was trained on, including some patterns that were artifacts of the specific period rather than durable market dynamics. Every backtest result that looks too good should be treated as a red flag, not a green light. We now run walk-forward validation as a matter of course and intentionally limit model complexity.
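For readers unfamiliar with walk-forward validation, the core idea is to train only on data that precedes each test window and then roll both windows forward. The generator below is a minimal sketch; the window sizes are arbitrary and the function name is our own for this example.

```python
def walk_forward_splits(n_samples, train_size, test_size):
    """Yield (train_indices, test_indices) ranges that roll forward through time.

    Each test window starts immediately after its training window, so the
    model is never evaluated on data that precedes what it was fit on.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # advance by one test window


# Ten observations, train on four, test on the next two.
for train, test in walk_forward_splits(10, train_size=4, test_size=2):
    print(list(train), list(test))
```

Performance is then judged on the concatenated test windows only, which is a much less flattering number than a single in-sample backtest.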

Flash crash behavior is a real problem for automated systems. In our third month of live operation, we experienced what we internally called the "cascade event." A micro-cap position moved sharply on low volume in a way our volatility model classified as a valid breakout signal. The system added to the position. The position moved further, the model added more, and within about forty minutes we had a meaningful allocation concentrated in something that was behaving like a pump-and-dump on a thin market. We had no circuit breaker for "adding to a position that has moved more than X% in the past Y minutes." That circuit breaker now exists. The position was eventually closed at a loss. The loss was not catastrophic but was entirely avoidable.

The Circuit Breaker Lesson

Every autonomous system operating in a real-time environment with real stakes needs circuit breakers that halt execution under defined adverse conditions. This is not a nice-to-have. It is the most important safety feature in the architecture. Our circuit breakers now halt new position-opening when: daily drawdown exceeds a threshold, position velocity exceeds a threshold, market-wide volatility exceeds a threshold, or any data feed anomaly is detected. Design the circuit breakers before you deploy the rest of the system.
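The four halt conditions listed above can be sketched as a single pre-trade check. Every threshold here is a hypothetical placeholder, and `position_move` stands in for "how far this position has moved in the past Y minutes"; the real values and lookback windows are not given in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BreakerLimits:
    # Placeholder thresholds for illustration only.
    max_daily_drawdown: float = 0.02  # fraction of portfolio value
    max_position_move: float = 0.08   # fractional move inside the lookback window
    max_market_vol: float = 35.0      # level of a market-wide volatility index


def tripped_breakers(daily_drawdown, position_move, market_vol, feed_anomaly,
                     limits=BreakerLimits()):
    """Return the names of all tripped breakers.

    An empty list means new positions may be opened; anything else halts them.
    """
    tripped = []
    if daily_drawdown > limits.max_daily_drawdown:
        tripped.append("daily_drawdown")
    if abs(position_move) > limits.max_position_move:
        tripped.append("position_velocity")
    if market_vol > limits.max_market_vol:
        tripped.append("market_volatility")
    if feed_anomaly:
        tripped.append("feed_anomaly")
    return tripped


# A position that has moved 12% inside the window trips the velocity breaker.
print(tripped_breakers(0.01, 0.12, 20.0, False))
```

Returning the list of tripped breakers, instead of a bare boolean, means the reason for every halt can be written straight into the audit trail.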

Slippage compounds in ways that kill strategies that look viable in backtesting. A strategy that works on closing prices in a backtest will often fail in live execution because real orders move markets, especially in less liquid instruments. We learned to run live paper trading for at least six weeks on any new strategy before committing capital, specifically to measure actual slippage versus modeled slippage. The gap is usually unflattering.
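Measuring actual versus modeled slippage comes down to comparing each fill with the price the strategy expected. The sketch below expresses that in basis points; the function names, the sample fills, and the 5 bps modeled figure are assumptions for illustration.

```python
def slippage_bps(side, expected_price, fill_price):
    """Signed slippage in basis points; positive means the fill was worse than expected."""
    if side not in ("buy", "sell"):
        raise ValueError("side must be 'buy' or 'sell'")
    direction = 1 if side == "buy" else -1  # paying more hurts a buy, receiving less hurts a sell
    return direction * (fill_price - expected_price) / expected_price * 10_000


def mean_slippage_gap(fills, modeled_bps):
    """Average realized slippage minus the modeled figure, in basis points."""
    realized = [slippage_bps(side, expected, fill) for side, expected, fill in fills]
    return sum(realized) / len(realized) - modeled_bps


# Hypothetical paper-trading fills: (side, expected price, fill price).
fills = [("buy", 50.00, 50.05), ("sell", 20.00, 19.97)]
print(round(mean_slippage_gap(fills, modeled_bps=5.0), 2))
```

A persistently positive gap over the paper-trading period is the signal that the backtest's cost assumptions were too generous.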

The dynamic signal weighting was too clever. Our adaptive weighting model, which adjusted signal weights based on recent predictiveness, introduced a feedback loop we had not anticipated. In trending markets, it would increase the weight on trend-following signals, causing the portfolio to become increasingly directional at exactly the moment when trend-following positions were most extended and most vulnerable to reversal. We reverted to simpler, more static weightings. The performance was slightly lower in trending markets but meaningfully better in choppy markets where the adaptive model had been hurting us most.

What It Actually Does Now

The current version of the system is considerably more conservative than the original design. It operates on a defined universe of liquid instruments โ€” index ETFs, sector ETFs, and a small selection of individual positions with genuine fundamental bases. It does not trade options, does not short individual names, and does not operate in pre-market or after-hours sessions.

Every morning it generates a daily P&L summary and sends it to a dashboard. It monitors all open positions against target allocations and flags any that have drifted beyond tolerance thresholds. It executes rebalancing trades automatically when conditions are met, within the risk limits the system enforces. It sends anomaly alerts when any position or market condition falls outside normal parameters, not to require immediate action but to ensure the state of the portfolio is visible.

~3 hrs/week
active attention previously required for manual monitoring and execution, now reduced to roughly 15 minutes reviewing daily summaries
Measured over 12 months of operation

Position sizing is fully automated, using a Kelly-fraction based approach with a conservative fractional multiplier. The system will not open a position that would make the portfolio meaningfully more correlated, will not add to a losing position beyond defined limits, and will not hold any single position beyond a maximum allocation regardless of how strong the signal is.
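As a worked illustration of fractional Kelly sizing, the sketch below uses the textbook formula for a binary bet, f* = p - (1 - p) / b, where p is the win probability and b is the ratio of average win to average loss. Whether our system estimates its inputs this way is not stated here; the quarter-Kelly multiplier and the 10% cap are placeholder values.

```python
def fractional_kelly(win_prob, win_loss_ratio, fraction=0.25, max_allocation=0.10):
    """Position size as a fraction of portfolio value.

    Applies the binary-bet Kelly formula, scales it by a conservative
    fraction, floors it at zero, and caps it at max_allocation.
    """
    if not 0.0 <= win_prob <= 1.0:
        raise ValueError("win_prob must be between 0 and 1")
    if win_loss_ratio <= 0:
        raise ValueError("win_loss_ratio must be positive")
    full_kelly = win_prob - (1.0 - win_prob) / win_loss_ratio
    sized = full_kelly * fraction
    return max(0.0, min(sized, max_allocation))


# 55% win rate, wins 1.5x the size of losses: full Kelly is 0.25, quarter Kelly is 0.0625.
print(round(fractional_kelly(0.55, 1.5), 4))
```

The cap is applied after the Kelly calculation, which matches the rule that no single position exceeds a maximum allocation no matter how strong the signal is.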

What We Would Do Differently

Build the circuit breakers first. Seriously: before you write a single line of signal generation code, design and implement every circuit breaker you can imagine needing. Then add more. The signal generation is the interesting work; the circuit breakers are the engineering discipline that keeps the interesting work from becoming expensive.

Start with paper trading and stay there longer than feels necessary. The temptation to move to live capital is strong once backtests look good. Resist it. Paper trading reveals execution realities that backtests never will: slippage, partial fills, and the gap between the price you see and the price you get.

Simpler models tend to be more durable. Every time we increased model complexity in pursuit of better backtest performance, we got worse live performance. The models that have aged best are the ones we could explain simply and that rely on durable market dynamics rather than historical pattern-matching.

The audit trail is the feature. Build it first, make it comprehensive, and make sure it is available for human review at any time. When something goes wrong (and something will go wrong), the audit trail is the only tool you have for understanding why. We have learned more from reviewing our own audit logs than from any other source.

"Autonomous systems operating with real stakes require more discipline in their constraints than in their capabilities. The circuit breaker that prevents a bad trade is worth more than the signal that finds a good one."

Fred Lackey, DevThing LLC

The system works. It does what it was designed to do: it frees up cognitive overhead, it executes strategy consistently without behavioral bias, and it keeps the portfolio within its intended risk parameters. What it does not do, and was never designed to do, is beat the market through clever signal discovery. That is a much harder problem, and most of the people claiming to solve it are selling something.