Options Backtesting Metrics Explained: Sharpe Ratio, Max Drawdown & More

You've run a backtest. Numbers fill the screen: Sharpe Ratio 1.23, Max Drawdown -18.4%, Profit Factor 1.67, CAGR 8.2%. But what do these numbers *mean*? Are they good? Bad? Should you trade this strategy or not?

Most backtesting tutorials skip the most important part: how to read the results. They show you how to set up the test but leave you staring at a dashboard of metrics you don't fully understand.

This guide explains every metric that OptionsPilot's backtester displays, what "good" looks like for each one, and how to use them together to make informed trading decisions.

---

The Metrics Dashboard: A Complete Reference

When you run a backtest on OptionsPilot, you'll see these metrics. Let's break down each one.

---

1. Win Rate

What it measures: The percentage of trades that were profitable.

Formula: (Winning Trades ÷ Total Trades) × 100

Example: 95 winning trades out of 120 total = 79.2% win rate

Why different strategies have different benchmarks: Premium selling strategies (iron condors, credit spreads) are designed to win frequently. If they don't, something is wrong. Premium buying strategies (long straddles, debit spreads) are designed to have large winners that compensate for frequent small losses.

Common mistake: Judging all strategies by the same win rate benchmark. A 45% win rate on a long straddle is excellent. A 45% win rate on an iron condor is a disaster.

OptionsPilot tip: Win rate is displayed prominently in the results panel, but always check it alongside expectancy and profit factor.

---

2. Sharpe Ratio

What it measures: Risk-adjusted return. How much return you're earning per unit of risk (volatility).

Formula: (Strategy Return - Risk-Free Rate) ÷ Standard Deviation of Returns

In plain English: If two strategies both return 10% per year, but one has wild swings and the other is steady, the steady one has a higher Sharpe ratio. It's earning the same return with less risk.
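To make the comparison concrete, here is a minimal Python sketch of the formula above. The two monthly return series are made up for illustration, and annualizing by the square root of the number of periods is a common convention assumed here, not necessarily how OptionsPilot computes its figure.

```python
import statistics

def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=1):
    """Annualized Sharpe ratio from per-period returns given as decimals.

    risk_free_rate is an annual rate, spread evenly across the periods.
    """
    rf_per_period = risk_free_rate / periods_per_year
    excess = [r - rf_per_period for r in returns]
    mean_excess = statistics.mean(excess)
    stdev = statistics.stdev(excess)  # sample standard deviation
    if stdev == 0:
        raise ValueError("returns have zero volatility; Sharpe is undefined")
    # Assumed convention: scale by sqrt(periods per year) to annualize
    return (mean_excess / stdev) * periods_per_year ** 0.5

# Hypothetical strategies with the same average monthly return (1%)
steady = [0.01, 0.02, 0.00, 0.01, 0.02, 0.00]
wild = [0.08, -0.06, 0.07, -0.05, 0.06, -0.04]

steady_sharpe = sharpe_ratio(steady, risk_free_rate=0.03, periods_per_year=12)
wild_sharpe = sharpe_ratio(wild, risk_free_rate=0.03, periods_per_year=12)
print(f"steady: {steady_sharpe:.2f}, wild: {wild_sharpe:.2f}")
```

Both series average the same return, so the gap between the two results comes entirely from the volatility term in the denominator.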

| Rating | Value | Interpretation |
| --- | --- | --- |
| Bad | <0.5 | Return doesn't justify the risk |
| Average | 0.5–1.0 | Acceptable for most strategies |
| Good | 1.0–2.0 | Strong risk-adjusted performance |
| Excellent | >2.0 | Exceptional (rare for options) |

What a Sharpe of 1.0 means: For every unit of volatility (risk) you take on, you earn 1 unit of return above the risk-free rate. This is generally considered the minimum threshold for a strategy worth trading.

What a Sharpe of 2.0 means: Exceptional risk-adjusted returns. Hedge funds with sustained Sharpe ratios above 2.0 are considered elite. For a retail options strategy, this usually indicates either genuine edge or insufficient data (too short a test period).

Context for options strategies:

Options strategies tend to have skewed return distributions (many small wins and occasional large losses for premium selling; many small losses and occasional large wins for premium buying). Because the Sharpe ratio treats all volatility the same and ignores that skew, it can be misleading:

Premium selling strategies often show artificially high Sharpe ratios during calm periods because volatility of returns is low. Then a crash hits and the Sharpe collapses.

Premium buying strategies often show lower Sharpe ratios because returns are lumpy. But those lumps can be very profitable.

Best practice: Evaluate Sharpe ratio over a period that includes at least one major drawdown event. A Sharpe calculated from a mostly calm stretch like 2016–2019 is nearly meaningless. One calculated from 2015–2025 (including COVID) is much more informative.

---

3. Max Drawdown

What it measures: The largest peak-to-trough decline in your account value during the backtest period.

Formula: (Trough Value - Peak Value) ÷ Peak Value × 100

Example: Your account grew to $15,000 (peak) then dropped to $12,000 (trough). Max drawdown = ($12,000 - $15,000) ÷ $15,000 = -20%.
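If you want to check the number yourself, a running-peak loop is enough. This is a minimal sketch; the equity values are hypothetical and chosen to match the $15,000 peak and $12,000 trough in the example.

```python
def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, returned as a negative fraction."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)            # highest value seen so far
        drawdown = (value - peak) / peak   # 0 at a new high, negative below it
        worst = min(worst, drawdown)
    return worst

# Hypothetical account values over time
equity = [10_000, 12_500, 15_000, 13_200, 12_000, 14_100, 15_600]
print(f"{max_drawdown(equity):.1%}")  # -20.0%
```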

Why max drawdown is arguably the most important metric:

Every trader *thinks* they can handle a 30% drawdown until they're actually in one. At -30%, you need a +43% gain just to break even. At -50%, you need +100%.
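The recovery math follows from one line of algebra: after losing a fraction d, you hold 1 - d of the peak, so you need a gain of 1 / (1 - d) - 1 to get back. A small sketch (the function name is just for illustration):

```python
def gain_to_recover(drawdown_pct):
    """Percent gain needed to regain the prior peak after a drawdown.

    drawdown_pct is a positive number, e.g. 30 for a 30% loss.
    """
    if not 0 <= drawdown_pct < 100:
        raise ValueError("drawdown_pct must be in [0, 100)")
    remaining = 1 - drawdown_pct / 100   # fraction of the peak still held
    return (1 / remaining - 1) * 100

for dd in (10, 20, 30, 50):
    print(f"-{dd}% drawdown -> +{gain_to_recover(dd):.1f}% to break even")
```

The required gain grows faster than the loss, which is why deep drawdowns are so hard to climb out of.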

Max drawdown tells you the worst period you'll need to endure if you trade this strategy. If you can't psychologically handle the max drawdown, the strategy's return doesn't matter — you'll quit during the drawdown and lock in losses.

What to look for in the equity curve:

When you see the equity curve on OptionsPilot's results page, pay special attention to the drawdown periods:

How long did recovery take? A -15% drawdown that recovers in 2 months is very different from one that takes 12 months.

How many drawdowns occurred? A strategy with one bad drawdown and steady returns otherwise is different from one with constant drawdowns.

When did drawdowns happen? If every drawdown corresponds to a VIX spike (2018, 2020, 2022), you can consider adding VIX filters to reduce them.
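If you export the equity curve, the three questions above can be answered programmatically. The sketch below is one possible approach, not OptionsPilot's implementation: it splits a curve into drawdown episodes and reports where each one started, bottomed, and recovered. The sample values are hypothetical.

```python
def drawdown_episodes(equity_curve):
    """List drawdowns as (peak_index, trough_index, recovery_index, depth).

    recovery_index is None if the curve never regains the prior peak.
    depth is a negative fraction of the peak value.
    """
    episodes = []
    peak_idx = 0
    trough_idx = None
    for i, value in enumerate(equity_curve):
        peak = equity_curve[peak_idx]
        if value >= peak:
            if trough_idx is not None:  # a drawdown just ended here
                depth = (equity_curve[trough_idx] - peak) / peak
                episodes.append((peak_idx, trough_idx, i, depth))
                trough_idx = None
            peak_idx = i
        elif trough_idx is None or value < equity_curve[trough_idx]:
            trough_idx = i  # new low within the current drawdown
    if trough_idx is not None:  # still below the peak at the end of the data
        peak = equity_curve[peak_idx]
        depth = (equity_curve[trough_idx] - peak) / peak
        episodes.append((peak_idx, trough_idx, None, depth))
    return episodes

# Hypothetical month-end account values
curve = [100, 110, 99, 104, 112, 120, 102, 108, 115]
episodes = drawdown_episodes(curve)
for peak_i, trough_i, rec_i, depth in episodes:
    status = f"recovered at index {rec_i}" if rec_i is not None else "not yet recovered"
    print(f"peak {peak_i}, trough {trough_i}, depth {depth:.1%}, {status}")
```

The number of episodes answers "how many", the gap between peak and recovery index answers "how long", and the peak index tells you which dates to compare against VIX spikes.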

---

4. Profit Factor

What it measures: The ratio of gross profits to gross losses. How many dollars you made for every dollar you lost.

Formula: Sum of All Winning Trades ÷ |Sum of All Losing Trades|

Example: Total winning trades = $18,000, total losing trades = $12,000. Profit factor = 18,000 ÷ 12,000 = 1.50.

Why profit factor is more useful than win rate:

Profit factor combines win rate AND win/loss magnitude into a single number. A strategy with a 90% win rate and a 0.9 profit factor is losing money. A strategy with a 40% win rate and a 2.5 profit factor is crushing it.
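A quick sketch makes the contrast visible. The two trade lists below are invented so that they reproduce the 90%/0.9 and 40%/2.5 combinations mentioned above.

```python
def win_rate(pnls):
    """Fraction of trades with a positive P&L."""
    return sum(1 for p in pnls if p > 0) / len(pnls)

def profit_factor(pnls):
    """Gross profit divided by the absolute value of gross loss."""
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    if gross_loss == 0:
        return float("inf")  # no losing trades in the sample
    return gross_profit / gross_loss

# Hypothetical per-trade P&L in dollars
high_win_rate = [100] * 9 + [-1000]     # wins often, loses big
low_win_rate = [375] * 4 + [-100] * 6   # wins less often, wins big

for name, pnls in (("high win rate", high_win_rate), ("low win rate", low_win_rate)):
    print(f"{name}: win rate {win_rate(pnls):.0%}, "
          f"profit factor {profit_factor(pnls):.2f}, net ${sum(pnls)}")
```

The first list wins nine times out of ten and still ends down $100; the second wins four times out of ten and ends up $900.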

Profit Factor = 1.0 → Breakeven (you're making exactly what you're losing)

Profit Factor = 1.5 → For every $1 lost, you make $1.50. Solid.

Profit Factor = 2.0 → For every $1 lost, you make $2. Strong.

Benchmark by strategy type:

| Strategy | Typical Profit Factor |
| --- | --- |
| Iron Condor | 1.3–1.8 |
| Vertical Spread | 1.2–1.6 |
| Covered Call | 1.3–1.7 |
| Cash Secured Put | 1.3–1.8 |
| Long Straddle | 1.1–1.8 |
| Butterfly | 1.4–2.5 |

---

5. CAGR (Compound Annual Growth Rate)

What it measures: Your annualized return, accounting for compounding.

Formula: (Ending Value ÷ Beginning Value)^(1/Years) - 1

Example: $10,000 grew to $16,500 over 10 years. CAGR = ($16,500 / $10,000)^(1/10) - 1 ≈ 5.14%.
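The same calculation as a small Python helper, in case you want to run it on your own start and end values (the function name is illustrative):

```python
def cagr(beginning_value, ending_value, years):
    """Compound annual growth rate as a decimal (0.05 means 5% per year)."""
    if beginning_value <= 0 or years <= 0:
        raise ValueError("beginning_value and years must be positive")
    return (ending_value / beginning_value) ** (1 / years) - 1

rate = cagr(10_000, 16_500, 10)
print(f"CAGR: {rate:.2%}")

# Sanity check: compounding at that rate for 10 years returns the ending value
print(f"Check: ${10_000 * (1 + rate) ** 10:,.0f}")
```

Note that a simple average (65% total gain divided by 10 years = 6.5%) overstates the growth rate because it ignores compounding.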

| Rating | Value | Context |
| --- | --- | --- |
| Bad | <2% | Below risk-free rate |
| Average | 2–5% | Modest, typical for conservative strategies |
| Good | 5–15% | Strong for defined-risk options |
| Excellent | >15% | Exceptional (verify it's not over-fit) |

Important context: SPY itself has averaged roughly 10% CAGR over long periods. An options strategy returning 5% CAGR on *allocated capital* might seem low — but remember:

Most options strategies only use 10–30% of your total capital per trade

The unallocated capital can be invested elsewhere

A 5% return on 20% of capital + 8% return on the other 80% = 7.4% blended return with lower overall risk
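The blended figure is just a weighted average. A minimal sketch, using the 20%/80% split from the example (the helper name and the tuple layout are assumptions for illustration):

```python
def blended_return(allocations):
    """Weighted-average return for (weight, annual_return) pairs.

    Weights are fractions of total capital and must sum to 1.
    """
    total_weight = sum(weight for weight, _ in allocations)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weight * ret for weight, ret in allocations)

# 20% of capital in the options strategy at 5%, 80% elsewhere at 8%
portfolio = [(0.20, 0.05), (0.80, 0.08)]
print(f"{blended_return(portfolio):.1%}")  # 7.4%
```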

Red flag: If your backtest shows >25% CAGR, be suspicious. Either the strategy is taking extreme risk (check max drawdown), the test period is too short, or the parameters are over-optimized.

---

6. Expectancy (Average P&L Per Trade)

What it measures: The average dollar amount you can expect to make (or lose) on each trade.

Formula: (Win Rate × Average Win) - (Loss Rate × Average Loss)

Example: 80% win rate, $150 avg win, $400 avg loss. Expectancy = (0.80 × $150) - (0.20 × $400) = $120 - $80 = $40 per trade.
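Here is the formula as code, along with the break-even win rate it implies. The break-even helper is not one of the dashboard metrics; it is derived here by setting expectancy to zero and solving for the win rate.

```python
def expectancy(win_rate, avg_win, avg_loss):
    """Average P&L per trade. avg_loss is a positive dollar amount."""
    loss_rate = 1 - win_rate
    return win_rate * avg_win - loss_rate * avg_loss

def breakeven_win_rate(avg_win, avg_loss):
    """Win rate at which expectancy is exactly zero."""
    return avg_loss / (avg_win + avg_loss)

print(f"Expectancy: ${expectancy(0.80, 150, 400):.2f} per trade")
print(f"Break-even win rate: {breakeven_win_rate(150, 400):.1%}")
```

With a $150 average win and a $400 average loss, the strategy needs to win roughly 73% of the time just to break even, so an 80% win rate leaves a thinner margin than it first appears.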

| Rating | Value | Interpretation |
| --- | --- | --- |
| Bad | Negative | Losing money per trade |
| Breakeven | $0 | No edge |
| Average | $1–$30 | Marginal |
| Good | $30–$100 | Solid edge |
| Excellent | >$100 | Strong edge |

Why expectancy is the king of metrics:

Expectancy tells you the cold, hard truth: on average, is each trade making or losing money? It strips away the emotional distortion of win rate and gives you a single number.

If your expectancy is positive, you have an edge. If it's negative, you don't — no matter how impressive your win rate looks.

How to use expectancy to estimate annual returns:

Once you know your expectancy and trade frequency, you can estimate annual returns:

> Estimated Annual P&L = Expectancy × Trades Per Year

If your expectancy is $50/trade and you make 12 trades/year: $50 × 12 = $600/year on $10,000 capital = 6% return.
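As a quick sketch of that estimate (a simple, non-compounding approximation; the function name is illustrative):

```python
def estimated_annual_return(expectancy_per_trade, trades_per_year, capital):
    """Return (annual P&L in dollars, return on capital as a decimal).

    Assumes every trade earns the average expectancy and nothing is
    reinvested, so treat the result as a rough estimate only.
    """
    annual_pnl = expectancy_per_trade * trades_per_year
    return annual_pnl, annual_pnl / capital

pnl, ret = estimated_annual_return(50, 12, 10_000)
print(f"${pnl:,.0f} per year, {ret:.1%} on capital")
```

Changing `trades_per_year` shows how sensitive the estimate is to trade frequency: the same $50 expectancy at 24 trades a year doubles the figure, before accounting for extra commissions and slippage.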

---

7. Average Days Held

What it measures: How long the typical trade stays open before being closed (by profit target, stop loss, DTE exit, or expiration).

| Strategy | Typical Average | Faster Means... | Slower Means... |
| --- | --- | --- | --- |
| Iron Condor | 15–30 days | Profit target hit quickly | Trades lingering, more risk |
| Vertical Spread | 10–25 days | Works well in trending markets | Sideways markets |
| Covered Call | 20–35 days | Premium captured quickly | Stock moved against you |
| Straddle | 5–20 days | Volatility event occurred | Waiting for a move |

Why it matters:

Capital efficiency. Shorter holds mean your capital is free to be redeployed sooner.

Risk exposure. Every day you're in a trade is a day something can go wrong. Shorter holds = less exposure.

Trade frequency. If the average hold is 15 days and you open one trade a month, you're in a position roughly half the time. At 25–30 days, you're nearly always in one.

What to watch for: If your average days held is close to the longest hold your DTE exit allows (e.g., 38 days held on trades opened at 45 DTE with a 7-DTE exit), most trades aren't hitting profit targets; they're being forced out. This suggests your profit target might be too aggressive.
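If you can export a trade log, you can measure this directly instead of inferring it from the average. The sketch below assumes each trade is recorded as a hypothetical `(entry_dte, days_held)` pair; a real export will likely use different field names.

```python
def forced_exit_share(trades, exit_dte):
    """Fraction of trades held for the full window before the DTE exit.

    Each trade is an (entry_dte, days_held) pair. The longest possible
    hold is entry_dte - exit_dte, so reaching it means no profit target
    or stop loss closed the trade earlier.
    """
    forced = sum(
        1 for entry_dte, days_held in trades
        if days_held >= entry_dte - exit_dte
    )
    return forced / len(trades)

# Hypothetical log: five trades opened at 45 DTE with a 7-DTE exit rule
trades = [(45, 38), (45, 38), (45, 21), (45, 38), (45, 17)]
share = forced_exit_share(trades, exit_dte=7)
avg_days = sum(days for _, days in trades) / len(trades)
print(f"average days held: {avg_days:.1f}, forced exits: {share:.0%}")
```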

---

8. Trades Per Month (or Total Trades)

What it measures: How frequently the backtester found and executed trades matching your criteria.

Why it matters:

Statistical significance. With only 20 trades over 10 years, your results are unreliable. You need 50+ trades minimum, ideally 100+.

Transaction costs. More trades = more commissions and slippage.

Compounding frequency. More trades allow faster compounding of gains.
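To see why trade count matters for statistical significance, it helps to put a confidence interval around the win rate. The sketch below uses the Wilson score interval, which is one standard choice and an assumption here, not something the backtester reports. It also assumes trades are independent, which is optimistic for overlapping options positions.

```python
import math

def wilson_interval(wins, total, z=1.96):
    """Approximate 95% confidence interval for the true win rate."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = wins / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / total + z ** 2 / (4 * total ** 2)
    ) / denom
    return center - half_width, center + half_width

# Roughly the same observed win rate, very different sample sizes
for wins, total in ((16, 20), (102, 128)):
    low, high = wilson_interval(wins, total)
    print(f"{wins}/{total} wins: true win rate plausibly {low:.0%} to {high:.0%}")
```

With 20 trades the plausible range is wide enough to include both a healthy premium-selling strategy and a losing one; with 128 trades it narrows considerably.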

---

How to Read Metrics Together (Not in Isolation)

Individual metrics tell you pieces of the story. Together, they paint the full picture. Here's a framework:

The "Green Light" Profile (Trade This Strategy)

Win Rate: Appropriate for strategy type (>70% for selling, >35% for buying)

Sharpe Ratio: >1.0

Max Drawdown: <20%

Profit Factor: >1.3

CAGR: >3% (on allocated capital)

Expectancy: Positive, >$30/trade

Total Trades: >50

The "Yellow Light" Profile (Proceed with Caution)

Win Rate: Slightly below benchmark

Sharpe Ratio: 0.5–1.0

Max Drawdown: 20–30%

Profit Factor: 1.0–1.3

CAGR: 1–3%

Expectancy: Positive but <$30/trade

Total Trades: 30–50

The "Red Light" Profile (Don't Trade This)

Win Rate: Well below benchmark

Sharpe Ratio: <0.5

Max Drawdown: >30%

Profit Factor: <1.0

CAGR: <1% or negative

Expectancy: Negative or near zero

Total Trades: <30 (insufficient data)
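The three profiles can be turned into a simple checklist. This is a sketch under a few assumptions: win rate is left out because its benchmark depends on the strategy type, "near zero" expectancy is treated as anything below one cent, the overall verdict is taken to be the worst individual light, and the dictionary keys are made up for illustration.

```python
def rate(value, yellow_floor, green_floor):
    """Rate a higher-is-better metric against two thresholds."""
    if value > green_floor:
        return "green"
    if value >= yellow_floor:
        return "yellow"
    return "red"

def classify(metrics):
    """Return (overall light, per-metric lights) for a backtest summary."""
    lights = {
        "sharpe": rate(metrics["sharpe"], 0.5, 1.0),
        # Drawdown is lower-is-better, so rate its negated magnitude
        "max_drawdown_pct": rate(-abs(metrics["max_drawdown_pct"]), -30, -20),
        "profit_factor": rate(metrics["profit_factor"], 1.0, 1.3),
        "cagr_pct": rate(metrics["cagr_pct"], 1.0, 3.0),
        # Assumption: expectancy under one cent counts as "near zero"
        "expectancy": rate(metrics["expectancy"], 0.01, 30),
        "total_trades": rate(metrics["total_trades"], 30, 50),
    }
    severity = {"green": 0, "yellow": 1, "red": 2}
    overall = max(lights.values(), key=severity.get)  # worst light wins
    return overall, lights

# Hypothetical backtest summaries
solid = {"sharpe": 1.2, "max_drawdown_pct": -15.0, "profit_factor": 1.6,
         "cagr_pct": 6.0, "expectancy": 55.0, "total_trades": 140}
risky = {"sharpe": 0.7, "max_drawdown_pct": -34.0, "profit_factor": 1.25,
         "cagr_pct": 4.0, "expectancy": 20.0, "total_trades": 45}

for name, summary in (("solid", solid), ("risky", risky)):
    overall, lights = classify(summary)
    print(name, overall, lights)
```

Taking the worst light as the verdict is deliberately conservative: a single red metric, such as a drawdown beyond 30%, overrides an otherwise acceptable profile.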

---

Putting It Into Practice

Here's a real example from OptionsPilot's backtester:

Strategy: Iron Condor on SPY, 30–45 DTE, 16-delta, 50% profit target, 200% stop loss

Results (2015–2025):

| Metric | Value | Rating |
| --- | --- | --- |
| Win Rate | 83% | Good |
| Sharpe Ratio | 1.05 | Good |
| Max Drawdown | -17.3% | Good |
| Profit Factor | 1.52 | Good |
| CAGR | 5.1% | Average–Good |
| Expectancy | $48/trade | Good |
| Total Trades | 128 | Sufficient |

Verdict: Green light. Every metric is in the acceptable-to-good range. The max drawdown is manageable, the Sharpe is above 1.0, and we have enough trades for statistical confidence.

Now, what if we remove the stop loss?

| Metric | With Stop | Without Stop | Impact |
| --- | --- | --- | --- |
| Win Rate | 83% | 86% | +3% |
| Max Drawdown | -17.3% | -34.8% | -17.5% (much worse) |
| Profit Factor | 1.52 | 1.28 | -0.24 |
| Sharpe Ratio | 1.05 | 0.62 | -0.43 |
| CAGR | 5.1% | 4.7% | -0.4% |

Removing the stop loss raises the win rate by 3 percentage points but doubles max drawdown and tanks the Sharpe ratio. The higher win rate is a mirage: you're winning more trades but exposing yourself to catastrophic risk.

This is exactly the kind of insight that reading metrics holistically provides.

---

Frequently Asked Questions

What's a good Sharpe ratio for options trading?

For retail options strategies, a Sharpe ratio between 1.0 and 2.0 is considered good. Above 2.0 is exceptional and should be scrutinized for over-fitting. Below 0.5 suggests the risk isn't worth the return. Professional quant funds target Sharpe ratios of 2.0–3.0, but they have access to leverage, market-making advantages, and sophisticated hedging that retail traders don't.

What max drawdown should I accept?

Most traders should limit strategies to a max drawdown of 20% or less. Beyond 20%, the psychological pressure to quit becomes intense and the math of recovery works against you (a 25% loss requires a 33% gain to recover). If you're backtesting and see a 30%+ drawdown, consider adding stop losses or VIX filters.

Is profit factor better than win rate?

Yes, for evaluating overall strategy quality. Profit factor accounts for both how often you win AND how much you win/lose, while win rate only tells you frequency. A profit factor above 1.3 is generally more meaningful than a high win rate with poor risk/reward.

How many trades do I need for reliable backtesting results?

Treat 30 trades as the bare minimum for even a rough read, and 50+ as the working minimum. For statistical confidence, 100+ trades is recommended. This is one reason to use long backtest periods (10+ years). With 120+ trades over 10 years, you can be reasonably confident that the results reflect a real edge rather than luck. OptionsPilot's 30-year data range makes it easy to generate sufficient trade count.

Should I optimize metrics individually?

No. Optimizing for win rate alone leads to poor risk/reward. Optimizing for CAGR alone leads to excessive risk. The best approach is to find parameters that produce acceptable values across *all* metrics simultaneously. If you have to sacrifice one metric, sacrifice CAGR before max drawdown.

---

Start Reading Your Backtests Like a Pro

The metrics aren't just numbers — they're a complete health check of your trading strategy. Now that you understand what each one means and what "good" looks like, you can make data-driven decisions about which strategies to trade and which to discard.

Run a Backtest and Apply These Metrics →