Drawdown as a throttle, not a brake

Why the long sleeve cuts risk at −8% — well before the −15% target band — and why the cut is partial rather than total. The math, the philosophy, and the edge cases that broke our first three designs.

Mikhail Savchenko

The long-sleeve target drawdown band is −10 to −15%. The drawdown throttle fires at −8%. The first thing every operator notices when they read the spec is that those two numbers don’t agree, and they want to know why we’d cut risk before we reach our own tolerance.

The answer is that the band is what we’re willing to accept across the cycle, and the throttle is what we do at any given moment to keep the band intact. Acting at the band itself is acting too late. The throttle is the thing that makes the band achievable — without it, the band is a hope, not a budget.

Our first throttle was a hard switch and it worked beautifully in backtest. Equity below −12% from the rolling peak meant: cut risk sleeves to half-weight, hold there until drawdown closed back to −6%, then re-engage. The problem was that backtest doesn’t have path-dependence, and production does. A slow grind from −9% to −12%, where the regime is genuinely deteriorating, was being treated identically to a single fast leg down, where the regime hasn’t actually changed and a mean-reversion is more likely than a continuation. The first case wants risk cut. The second case wants risk held — possibly added to. A binary switch on level can’t distinguish.

Re-engagement timing was the second thing that broke. Cutting at −12% and re-engaging at −6% means the sleeve sits half-sized through the entire bear-to-bull transition, which is exactly where most of the recovery happens. We measured the cost on three years of live backtest data; it was roughly 220 bps of expected annual return left on the floor because of a quirk in where the threshold was set.

The third thing was a discontinuity problem we should have seen coming. A binary switch creates a discontinuity in allocation at the trigger price. Around the threshold, tiny price noise generates whole-sleeve flips in and out. Turnover spiked, transaction costs ate the savings, and the sleeve started to feel jittery in a way no amount of operator reassurance could fix.
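For reference, the first design described above can be sketched as a two-state switch. This is a minimal illustration, not the production code: the function name, the use of fractional drawdowns, and the inclusive comparisons at the two thresholds are assumptions; only the −12% cut level, the −6% re-engagement level, and the half-weight response come from the description.

```python
def hard_switch_states(drawdowns, cut_at=-0.12, reengage_at=-0.06):
    """Return the risk-sleeve scale (1.0 or 0.5) for each drawdown observation.

    drawdowns are fractions from the rolling peak, e.g. -0.12 for -12%.
    Inclusive comparisons at the thresholds are an assumption.
    """
    scale = 1.0
    states = []
    for dd in drawdowns:
        if scale == 1.0 and dd <= cut_at:
            scale = 0.5   # cut risk sleeves to half-weight
        elif scale == 0.5 and dd >= reengage_at:
            scale = 1.0   # re-engage only once drawdown closes to -6%
        states.append(scale)
    return states


# Hypothetical path: a slide to -12% followed by a recovery.
path = [-0.05, -0.10, -0.12, -0.11, -0.09, -0.07, -0.06, -0.03]
print(hard_switch_states(path))
# -> [1.0, 1.0, 0.5, 0.5, 0.5, 0.5, 1.0, 1.0]
```

The output shows the re-engagement cost in miniature: the sleeve stays half-sized for the whole climb from −12% to −6%. The switch also depends only on the drawdown level, so a slow grind and a single fast leg to the same level produce the same state.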

The current throttle addresses all three failure modes with the same move: it makes the response continuous. The trigger reads a 7-day exponentially weighted moving average of the peak-to-trough distance rather than the spot drawdown, which handles the path problem: a single fast leg down doesn’t trip the throttle unless the drawdown is sustained. The cut itself scales linearly: once the smoothed drawdown breaches −8%, the risk sleeves (BTC, SPY, GLD) scale toward their bear-regime floor as the smoothed drawdown deepens. At −8% you’ve cut by ~25%. At −12%, ~75%. At −15%, the band edge, you’re at floor. SHY, the bond floor, absorbs the displacement. Re-engagement is the same function in reverse: as the smoothed drawdown closes back toward zero, the cut unwinds on the same gradient. There is no “wait for −6%” gate.
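The smoothing step can be sketched as follows. The note specifies only a 7-day exponentially weighted average of the peak-to-trough distance, so several details here are assumptions: the smoothing factor uses the common span convention alpha = 2 / (span + 1), the peak is a running maximum over the whole series, and the function name is hypothetical.

```python
def smoothed_drawdown(equity, span=7):
    """EWMA of the spot drawdown from the running peak.

    equity is a list of daily equity values. Returns one smoothed
    drawdown (a non-positive fraction) per observation.
    alpha = 2 / (span + 1) is an assumed convention, not from the spec.
    """
    alpha = 2.0 / (span + 1)
    peak = equity[0]
    ewma = 0.0
    out = []
    for i, value in enumerate(equity):
        peak = max(peak, value)        # running peak (assumed: full history)
        spot = value / peak - 1.0      # spot peak-to-trough distance
        ewma = spot if i == 0 else alpha * spot + (1.0 - alpha) * ewma
        out.append(ewma)
    return out


# Hypothetical path: a single-day 12% drop that then persists.
equity = [100, 100, 88, 88, 88, 88]
for day, dd in enumerate(smoothed_drawdown(equity)):
    print(day, round(dd, 4))
```

Under these assumptions the spot drawdown hits −12% on day 2, but the smoothed value is only −3% that day and does not cross −8% until day 5. A one-day leg that reverses would never reach the trigger.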

The whole thing is one differentiable function mapping smoothed drawdown to a throttle multiplier between 0.0 (full bear floor) and 1.0 (no cut). The multiplier is applied to the regime-conditional sleeve weight after the regime gate has already set it, so the throttle compounds with the regime rather than fighting it. A bear regime plus a deep drawdown ends at the bear floor; a bear regime plus a shallow drawdown stays at bear-regime weight, because the throttle multiplier is near 1 when the drawdown is shallower than −8%.
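The note quotes points on the curve (a ~25% cut at −8%, ~75% at −12%, floor at −15%) but not its closed form. As a rough sketch, the simplest curve consistent with those numbers is a clamped linear ramp: the straight line through the two quoted points reaches zero cut at −6% and full cut at −14%. Those two ramp endpoints are inferred, not taken from the spec, and a clamped ramp has kinks at both ends, so a smooth variant would be needed to match the "differentiable" description. The sleeve weights, floors, and function names below are hypothetical, as is the reading of "toward the bear floor" as interpolation between the regime weight and the floor.

```python
def throttle_multiplier(smoothed_dd, ramp_start=-0.06, ramp_end=-0.14):
    """Map a smoothed drawdown (negative fraction) to a multiplier in [0, 1].

    1.0 means no cut; 0.0 means the sleeve sits at its bear-regime floor.
    ramp_start and ramp_end are inferred from the quoted points.
    """
    cut = (ramp_start - smoothed_dd) / (ramp_start - ramp_end)
    cut = min(1.0, max(0.0, cut))
    return 1.0 - cut


def throttled_weights(regime_weights, bear_floor, smoothed_dd):
    """Scale each risk sleeve toward its floor; SHY absorbs the remainder."""
    m = throttle_multiplier(smoothed_dd)
    weights = {
        sleeve: bear_floor[sleeve] + m * (w - bear_floor[sleeve])
        for sleeve, w in regime_weights.items()
    }
    weights["SHY"] = 1.0 - sum(weights.values())
    return weights


# Hypothetical regime-gate output and bear-regime floors.
regime = {"BTC": 0.20, "SPY": 0.40, "GLD": 0.15}
floor = {"BTC": 0.05, "SPY": 0.10, "GLD": 0.05}
print(throttled_weights(regime, floor, smoothed_dd=-0.12))
```

Because the multiplier only interpolates between the regime weight and the floor, it cannot lift a sleeve above what the regime gate set or push it below the floor. The same function run on a shrinking drawdown unwinds the cut, so no separate re-engagement gate is needed.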

The choice of −8% deserves its own paragraph because it’s the bit that gets second-guessed most often. The throttle has to fire before the band, not at it; if it fires at the band, the tolerance budget is already spent. We picked −8% because a smoothed drawdown of that depth has historically (1990 onward, across BTC and SPY composite history) been the level past which the conditional probability of the next 5% move being further down exceeds 50%. At drawdowns shallower than −8%, the next 5% move is roughly a coin flip. At drawdowns deeper than −8%, it tilts toward further down by 4 to 7 percentage points depending on the regime. Acting on a tilt is what the throttle is for. The exact threshold is re-fit yearly; it has been −7 to −9% every year for the last decade and we don’t expect that to stop being true any time soon.
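One way to estimate that conditional probability is sketched below. It is an illustration under stated assumptions, not the calibration actually used: "the next 5% move" is read here as the first time price moves 5% up or down from its current level, observations whose move never resolves are dropped, and the function names are hypothetical. Overlapping start dates are not independent samples, so a real fit would need to account for that.

```python
def next_move_direction(prices, start, size=0.05):
    """Direction of the first +/-size move from prices[start], or None."""
    base = prices[start]
    for p in prices[start + 1:]:
        if p <= base * (1.0 - size):
            return "down"
        if p >= base * (1.0 + size):
            return "up"
    return None  # move did not resolve before the data ended


def prob_next_move_down(prices, smoothed_dd, threshold=-0.08, size=0.05):
    """Share of resolved moves that go down, given smoothed_dd < threshold."""
    directions = [
        next_move_direction(prices, i, size)
        for i, dd in enumerate(smoothed_dd)
        if dd < threshold
    ]
    resolved = [d for d in directions if d is not None]
    if not resolved:
        return None
    return resolved.count("down") / len(resolved)


# Toy data only; these numbers are made up to exercise the function.
prices = [100, 96, 94, 99, 105]
smoothed = [-0.09, -0.09, -0.02, -0.09, -0.01]
print(prob_next_move_down(prices, smoothed))
```

Sweeping `threshold` over a grid and finding where the estimate crosses 0.5 would be one way to re-fit the trigger level each year.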

A couple of design choices around the throttle were not obvious going in. One is that when the throttle cuts, the freed weight goes to SHY rather than to cash. SHY has a non-zero expected return, and getting the sleeve back into risk from cash carries a transaction-cost step-function we can avoid by parking the rotated weight in a liquid bond proxy. Cash sounds safer; in this kind of system it’s actually worse. Another is that the throttle is strictly reactive — it reads what already happened. We tested predictive throttles (“forecast a drawdown, cut ahead of time”) for two quarters. They cut too often, missed the recoveries, and the prediction error overwhelmed the saved drawdown. A good reactive throttle beats a bad predictive one by a wide margin.

The throttle also doesn’t override the regime model — it compounds with it, multiplicatively. If regime is in bear, weights are already low; the throttle then multiplies those low weights down further toward the bear floor. There is no scenario where the throttle pulls the sleeve above its regime ceiling, and there is no scenario where the regime forces the sleeve above the throttle’s floor. The two systems are commutative in this sense, and that’s not an accident — it took a rewrite to get there, after one quarter of regime and throttle fighting each other.

What we still want to fix is the symmetry across sleeves. The current throttle treats BTC and SPY drawdowns as a single composite, which is mostly fine when both are correlated and badly wrong when one of them is leading. A BTC-led drawdown should cut BTC harder and leave SPY closer to its band; a SPY-led drawdown is the opposite. The implementation is straightforward — a sleeve-conditional multiplier — but the calibration is the hard part, because sleeve-specific historical drawdown distributions are sparser than the composite. We’re collecting another quarter of data and will post the results when the calibration converges.

— inite team
