Did your backtest catch its own lies?
Most don't. We'll tell you whether yours did — using the same validation rig that killed our own model repeatedly across eighteen months of work.
Null Hypothesis Labs is a backtest audit service for retail quants and small CTAs. We run your strategy through a rigorous statistical pipeline and deliver a written verdict, typically within one to three days, for a flat two thousand dollars.
Three slots per month. Async delivery, no live calls.
Methodology
What we actually do
You hand over your backtest — code, returns, or both. We run it through the same validation rig we built for our own trading research. Five tests, run in sequence, each one capable of killing the verdict on its own.
- Leakage audit. Fourteen-item checklist. Future information bleeding into past decisions. Train-test contamination. Look-ahead in feature construction. The kinds of mistakes that look like alpha.
- Cross-validation correctness. Purged k-fold or Combinatorial Purged CV (CPCV) — verified against the methodology Lopez de Prado published in 2017. Multi-seed if applicable. Embargo bands sized to your label horizons.
- Deflated Sharpe Ratio. Bailey & Lopez de Prado's correction for selection bias and non-normality. Tells you whether the Sharpe you're seeing would survive correction for the number of trials you actually ran.
- Probability of Backtest Overfitting (PBO). Bailey, Borwein, Lopez de Prado, Zhu's framework. Estimates the probability that the strategy you selected will underperform the median of trials out of sample.
- Randomization-test baseline. Shuffle your labels, refit, measure performance. If shuffled labels perform as well as real labels, or better, your model isn't catching direction. It's catching artifacts.
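For readers who want to see the shape of the randomization test, here is a minimal Python sketch. It is an illustration under stated assumptions, not the rig's code: `label_shuffle_pvalue` and `toy_fit_and_score` are hypothetical names, and the toy scorer is a stand-in for whatever refit-and-evaluate routine your strategy actually uses.

```python
import random
from typing import Callable, Sequence


def label_shuffle_pvalue(
    features: Sequence[float],
    labels: Sequence[int],
    fit_and_score: Callable[[Sequence[float], Sequence[int]], float],
    n_shuffles: int = 200,
    seed: int = 0,
) -> float:
    """Share of shuffled-label runs that score at least as well as the real run."""
    rng = random.Random(seed)
    real_score = fit_and_score(features, labels)
    at_least_as_good = 0
    for _ in range(n_shuffles):
        shuffled = list(labels)
        rng.shuffle(shuffled)  # break any true link between features and labels
        if fit_and_score(features, shuffled) >= real_score:
            at_least_as_good += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_good + 1) / (n_shuffles + 1)


def toy_fit_and_score(features: Sequence[float], labels: Sequence[int]) -> float:
    """Hypothetical stand-in model: a sign rule fit on the first half, scored on the second."""
    half = len(features) // 2
    # "Fit": choose the sign convention that is right more often on the first half.
    hits = sum((x > 0) == bool(y) for x, y in zip(features[:half], labels[:half]))
    flip = hits < half / 2
    # "Score": accuracy of that rule on the held-out second half.
    correct = sum(
        ((x > 0) != flip) == bool(y)
        for x, y in zip(features[half:], labels[half:])
    )
    return correct / (len(features) - half)
```

A small returned value means shuffled labels rarely match the real score. A large one means the model does about as well on noise, which is the failure this test is meant to catch.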
If your strategy clears all five, you have something real. If it doesn't, we tell you exactly which test it failed and why.
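To make "embargo bands sized to your label horizons" concrete, here is a rough Python sketch of a purged k-fold split. It assumes contiguous folds, a fixed label horizon (the label for sample `i` uses data through index `i + label_horizon`), and an embargo counted in samples. The function name and parameters are ours for illustration and do not come from any library.

```python
from typing import Iterator, List, Tuple


def purged_kfold_indices(
    n_samples: int,
    n_folds: int,
    label_horizon: int,
    embargo: int,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists with purging and an embargo around each test fold."""
    for k in range(n_folds):
        test_start = k * n_samples // n_folds
        test_end = (k + 1) * n_samples // n_folds  # exclusive
        test_idx = list(range(test_start, test_end))
        # Purge before the test fold: a training sample j is safe only if its
        # label window [j, j + label_horizon] ends before the test fold starts.
        keep_before = test_start - label_horizon
        # Purge after the test fold: test label windows reach index
        # test_end - 1 + label_horizon, so training resumes after that point,
        # pushed back further by the embargo.
        keep_from = test_end + label_horizon + embargo
        train_idx = [
            j for j in range(n_samples) if j < keep_before or j >= keep_from
        ]
        yield train_idx, test_idx
```

With 100 samples, 5 folds, a horizon of 3, and an embargo of 2, the fold that tests on indices 20 to 39 trains only on indices 0 to 16 and 45 to 99. Variable-length label windows would need per-sample start and end times instead of a single horizon.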
Deliverables
What you get
One Verdict Report. Delivered within seven days of engagement start, most often within three.
Verdict Report PDF
- Executive summary, one page, plain language. Pass / partial / fail per test.
- Methodology section: every test we ran, with parameters and assumptions stated.
- Detailed statistical results: numbers, charts, comparisons.
- Findings: what's risky, what's wrong, what's defensible. Severity rated.
- Plain-language walkthrough of findings: what each test result means in context, where the verdict is solid, where the trade-offs sit. Written for an engineer who didn't run the audit.
- Validation rig version stamp on every numerical result. Every number is traceable.
- Required disclaimer per the engagement contract, reproduced verbatim.
No live calls. No Zoom. No "let's hop on a quick one." Two written Q&A rounds are included — that's how questions get answered.
Pricing
$2,000 flat
One backtest. Up to three days of work.
Verdict Report + two written Q&A rounds included.
No hourly billing. No surprise invoices. No upsell.
Three slots per month. First-come, first-paid. If we're full, you'll get a clear capacity-status reply within forty-eight hours of inquiring; we won't string you along.
What's not included
- Strategy design or alpha generation. We audit what you built. We don't build strategies for you.
- Code remediation. We tell you what's wrong; fixing it is on you (or your engineer).
- Trade sizing. Your sizing is part of your strategy. We measure how it performed; we don't redesign it.
- Live trading recommendations. We never tell you to deploy. The deployment decision is yours and should involve a licensed advisor.
That last bullet is non-negotiable and legally required.
Process
How it works
- Inquiry. You fill out the intake form below. Five minutes.
- Scope and contract. Typically within one business day, we send back a Statement of Work confirming scope and price. You sign the master engagement contract and the SOW; we counter-sign; payment is due before work starts.
- Materials. You hand over your backtest: code (preferred, in any language we can run) or returns and trade logs (acceptable). We typically confirm receipt within one business day. Client materials are reviewed only for the audit. They are never used as input to our own trading research, and they are walled off from our research workspace.
- Audit. We run the validation rig. One to three days for most engagements. We don't ask follow-up questions during this phase unless something blocks the work.
- Delivery. Verdict Report PDF arrives by email.
- Two follow-up rounds. You get two written Q&A rounds included in the price. Async only, in the same email thread. Turnaround is typically two business days per round.
- Closeout. Your code and data are deleted from our active workspace within thirty days of closeout, or sooner if you request. Engagement records (brief, report, email log) are retained for three years.
Background
Who's behind this
Null Hypothesis Labs is a Colorado LLC. The validation infrastructure we sell here is the same rig we built — and use — for our own algorithmic trading research.
We don't claim to be profitable traders. The opposite is true. The rig has refused our own strategy multiple times across eighteen months. It caught a methodology bug in our own scoring framework. It found the actual deployable cell of our model and then told us not to deploy because the yield was inadequate. It pre-tested a projected yield improvement we'd written into our roadmap and refuted it before we spent another dollar on it.
These moments where the rig refused to confirm what we wanted to believe — that's the discipline you're hiring. We don't sell our trading strategy: profitable quants trade their tools, they don't sell them. What we sell is the rig — the part that knows how to refuse.
Methodology lineage
The rig implements published academic methodology: Bailey & Lopez de Prado on Deflated Sharpe (2014); Bailey, Borwein, Lopez de Prado, Zhu on PBO (2014); Phillips, Wu & Yu (2011) and Phillips, Shi & Yu (2015) on supremum ADF; Lopez de Prado on purged cross-validation (2017). We did not invent these methods. We productized their honest application to backtest auditing.
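As one example of that lineage, the Deflated Sharpe calculation can be sketched in a few lines of Python. This is a simplified reading of the Bailey & Lopez de Prado formula, not the rig's implementation. It assumes per-period (non-annualized) Sharpe ratios, and it assumes the caller supplies `sharpe_std`, the standard deviation of Sharpe estimates across the trials that were run. Function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Sequence

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def expected_max_sharpe(n_trials: int, sharpe_std: float) -> float:
    """Approximate expected best Sharpe among n_trials trials with no real skill."""
    if n_trials < 2:
        return 0.0  # a single trial carries no selection bias to correct
    z = NormalDist()
    return sharpe_std * (
        (1 - EULER_GAMMA) * z.inv_cdf(1 - 1 / n_trials)
        + EULER_GAMMA * z.inv_cdf(1 - 1 / (n_trials * math.e))
    )


def deflated_sharpe_ratio(
    returns: Sequence[float], n_trials: int, sharpe_std: float
) -> float:
    """Probability that the observed Sharpe exceeds the selection-bias benchmark."""
    t = len(returns)
    mean = sum(returns) / t
    dev = [r - mean for r in returns]
    std = math.sqrt(sum(d * d for d in dev) / t)
    sharpe = mean / std  # per-period, not annualized
    skew = sum(d ** 3 for d in dev) / t / std ** 3
    kurt = sum(d ** 4 for d in dev) / t / std ** 4  # normal returns give 3
    benchmark = expected_max_sharpe(n_trials, sharpe_std)
    # Standard error term adjusts for skewness and fat tails.
    denom = math.sqrt(1 - skew * sharpe + (kurt - 1) / 4 * sharpe ** 2)
    return NormalDist().cdf((sharpe - benchmark) * math.sqrt(t - 1) / denom)
```

Calling it twice on the same return series, once with `n_trials=1` and once with the real trial count, shows how much of the headline Sharpe is attributable to selection: the second value can only be lower or equal.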
Get started
Tell us about your backtest
Five minutes. We respond within forty-eight hours with capacity status, scope confirmation, and a Statement of Work.