Methodology · with receipts

How this measures up.

Every claim ships with the source data, the method, and a way to reproduce. We win on some horizons. We lose on others. Both numbers are published. The methodology is what stays.


What we publish that the field does not

Side by side.

Eight methodology dimensions a sophisticated reader would ask about. KTC and FantasyCalc publish one number per player. FantasyPros publishes tiers. Our model publishes the math, the cohort, the curves, and the receipts. The cells below are specific rather than yes or no.

Dimension	Dynasty General (this product)	KTC	FantasyCalc	FantasyPros
Public backtest: Spearman vs actual production, multiple horizons	✓ Published	✕ Not published	✕ Not published	✕ Internal only
Cohort calibration: real-roster percentile thresholds	✓ n=82, 5 leagues	✕ Single value	✕ Single value	◐ Tier breaks
Per-position age curves: empirically refit peak bands	✓ 4 positions, refit 2026-05-12	◐ Generic decay	✕ Implicit	◐ Tiered
Lane identity model: multi-attribute roster classification	✓ 11 lanes, 3 axes	✕ None	✕ None	✕ None
Inflection scorecards: bimodal forecasts with signal scorecard + comparators	✓ Bimodal + signals	✕ None	✕ None	✕ None
Tunable model: user-controlled engine weights	✓ 8 dials	✕ None	◐ Format toggles	✕ Expert blend
Opponent dossiers: per-league trade fingerprints + pick patterns	✓ Per opponent	✕ None	✕ None	✕ None
Reproducibility: source CSV + scripts published	✓ CSV + methodology	✕ Closed	◐ Public API	✕ Paid only

Receipt 1 · backtest

We beat KeepTradeCut on every horizon.

We win convincingly on 1-year prediction. We trail FantasyPros ADP on 2-year and both FantasyPros ECR and ADP on 3-year. We publish the wins, the losses, the loss function, and the snapshot dates. Nothing cherry-picked.

Backtest, by horizon

Spearman rank correlation between each source's preseason ranking and actual cumulative fantasy points scored over the post-prediction window. Higher is better. Top-100 dynasty pool.

Horizon	Predicted	Dynasty General v1	FantasyPros ECR	FantasyPros ADP	KeepTradeCut	Result
1-year cumulative	2024	0.346	0.200	0.093	0.236	DG wins
2-year cumulative	2023	0.474	0.433	0.535	0.449	DG · 0.474 vs winner 0.535
3-year cumulative	2022	0.393	0.463	0.463	0.354	DG · 0.393 vs winner 0.463

The honest read: Engine v1 wins convincingly on 1-year. We beat KeepTradeCut on every horizon we tested. We trail FantasyPros ADP on 2-year (0.474 vs 0.535) and both FantasyPros ECR and ADP on 3-year (0.393 vs 0.463). Wins and losses are both published; we do not cherry-pick. Full methodology + per-row reproduction at the scoreboard.
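For readers who want to check a row by hand, here is a minimal sketch of Spearman rank correlation as described above. The function names are ours for illustration, and the scoreboard scripts may handle ties or missing players differently; this version gives tied values their average rank and then takes the Pearson correlation of the two rank lists.

```typescript
// Rank values so the largest gets rank 1; tied values share their average rank.
function averageRanks(values: number[]): number[] {
  const order = values
    .map((value, index) => ({ value, index }))
    .sort((a, b) => b.value - a.value);
  const ranks: number[] = values.map(() => 0);
  let start = 0;
  while (start < order.length) {
    let end = start;
    while (end + 1 < order.length && order[end + 1].value === order[start].value) {
      end += 1;
    }
    // Average of the 1-based positions start+1 .. end+1.
    const shared = (start + end) / 2 + 1;
    for (let k = start; k <= end; k += 1) {
      ranks[order[k].index] = shared;
    }
    start = end + 1;
  }
  return ranks;
}

// Pearson correlation of two equal-length series (NaN if either is constant).
function pearson(xs: number[], ys: number[]): number {
  const n = xs.length;
  const meanX = xs.reduce((sum, v) => sum + v, 0) / n;
  const meanY = ys.reduce((sum, v) => sum + v, 0) / n;
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i += 1) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  return cov / Math.sqrt(varX * varY);
}

// Spearman = Pearson correlation computed on ranks instead of raw values.
function spearman(preseasonValues: number[], actualPoints: number[]): number {
  if (preseasonValues.length !== actualPoints.length || preseasonValues.length < 2) {
    throw new Error("need two equal-length series with at least 2 players");
  }
  return pearson(averageRanks(preseasonValues), averageRanks(actualPoints));
}
```

A source that orders every player exactly as they finished scores 1; a fully reversed ordering scores -1.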

Receipt 2 · cohort

Calibrated against 82 real rosters.

The 11 lane-identity thresholds (Win-Now Floor, Balanced, Future Stock, RB Bellcow, WR Anchor, WR Stable, QB Stable, TE-Premium Lock, Trade Capital, Sustained Contender, Zero-RB) are tuned against the percentile distribution of this cohort, not picked by feel. Baked 2026-05-08 from 5 dynasty and keeper leagues.
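To make "tuned against the percentile distribution" concrete, here is a small sketch of reading a threshold off a cohort. The function name and the nearest-rank percentile method are illustrative assumptions; the page does not say which percentile each lane uses, and cohort-stats.ts may interpolate instead.

```typescript
// Nearest-rank percentile: the smallest cohort score such that at least
// `percentile` percent of the cohort is at or below it.
// Hypothetical helper, not the name used in the repository.
function thresholdAtPercentile(cohortScores: number[], percentile: number): number {
  if (cohortScores.length === 0) {
    throw new Error("cohort is empty");
  }
  if (percentile <= 0 || percentile > 100) {
    throw new Error("percentile must be in (0, 100]");
  }
  const sorted = cohortScores.slice().sort((a, b) => a - b);
  // 1-based nearest rank; multiply before dividing to limit rounding error.
  const rank = Math.ceil((percentile * sorted.length) / 100);
  return sorted[rank - 1];
}
```

With a five-roster toy cohort of 100, 200, 300, 400, 500, the 80th-percentile threshold under this method is 400. The same call over the 82 benchmark scores would give a lane's cutoff at whatever percentile the calibration picks.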

Build fit

Your roster compared to the shapes of 82 real dynasty teams we benchmarked the model against. Each build below is a value-stack shape we found repeating across those real rosters. Your bar tells you how close your shape is to each pattern.

82 benchmark rosters · 5 leagues

Reading your shape

Your roster fits 1 build.

Strong fits: Future Stock. Partway into 1 more (Win-Now Floor).

What to do with this. The builds you fit name your trade leverage. Opponents who need pieces at those builds' signature positions are your best deal partners. Coach can route specific trade angles around these fits.

NO FIT · PARTIAL · FITS

Future Stock

FITS

PARTIAL 200 · FIT 320

your score 340 · above 83% of benchmarks · 20 points past the FIT threshold

Win-Now Floor

PARTIAL

PARTIAL 200 · FIT 300

your score 265 · above 57% of benchmarks · 35 points to FIT
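The two readouts above follow from three small calculations, sketched here. The type and function names are illustrative, and treating a score exactly at a threshold as clearing it is an assumption.

```typescript
type FitLabel = "NO FIT" | "PARTIAL" | "FITS";

interface BuildThresholds {
  partial: number; // score needed for PARTIAL
  fit: number;     // score needed for FITS
}

// Label a roster score against one build's thresholds.
// Assumption: a score equal to a threshold counts as clearing it.
function classifyFit(score: number, thresholds: BuildThresholds): FitLabel {
  if (score >= thresholds.fit) return "FITS";
  if (score >= thresholds.partial) return "PARTIAL";
  return "NO FIT";
}

// Positive: points past the FIT threshold. Negative: points still needed.
function gapToFit(score: number, thresholds: BuildThresholds): number {
  return score - thresholds.fit;
}

// Percent of benchmark rosters scoring strictly below the given score.
function percentOfCohortBelow(score: number, cohortScores: number[]): number {
  if (cohortScores.length === 0) {
    throw new Error("cohort is empty");
  }
  const below = cohortScores.filter((s) => s < score).length;
  return (100 * below) / cohortScores.length;
}
```

Using the sample numbers: a Future Stock score of 340 against PARTIAL 200 and FIT 320 is labeled FITS with a gap of +20, and a Win-Now Floor score of 265 against PARTIAL 200 and FIT 300 is labeled PARTIAL with a gap of -35.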

Receipt 3 · inflection scorecards (only we publish this)

Bimodal forecasts, not single points.

When a player is in a high-variance moment (aging cliff, rookie debut, post-major-injury return), the conditional distribution of outcomes is bimodal. The mean lands in the valley between the two modes, where no actual player ends up. KTC and FantasyCalc collapse this into one number anyway. We refuse to. Each inflection window publishes both stories with calibrated probabilities, a signal scorecard with confidence labels, and named historical comparators.

Inflection scorecard · sample

Sample RB · age 28

window · aging cliff · RB

illustration · fictional player

Story A · 42% · Continued bellcow

Story B · 58% · Cliff inside one offseason

The model leans Story B (cliff) at 58% based on the 1,650 career carry load + team adding a young RB in the draft. YPC stability and pass-down share keep Story A alive at 42%. Single-point estimate would land in the valley.

Signal scorecard

Signal	Observation	Confidence	Points to
Career carries: cumulative regular-season + playoff carries	1,650 (top decile for age 28 RBs)	validated	→ Story B
YPC trend: yards per carry, last 2 seasons vs prime	4.6 → 4.4 (stable)	validated	→ Story A
Team RB draft: team drafted a young RB within last 18 months	Yes, pick 53 (round 2)	validated	→ Story B
Pass-down share: share of team passing-down RB snaps	62% (locked in for now)	partial	→ Story A
OL continuity: offensive line returning starters	4 of 5 returning	partial	→ Story A
Injury history: major-injury count past 3 seasons	data not in pipeline yet	weak	missing

Comparators · Continued bellcow

Adrian Peterson · 2015
rushing title at 30
Frank Gore · 2018
continued bellcow workload at 35

Comparators · Cliff inside one offseason

Le'Veon Bell · 2019
career fell apart at 27 post-holdout
DeMarco Murray · 2016
1,287 yards age 28, then cliff at 29

Per the calibration architecture: predicting a single point at a high-variance moment puts the estimate in the valley between modes, where no actual player ends up. The scorecard surfaces both stories, the signals that distinguish them with their confidence levels, and named historical comparators. The user reads the evidence and makes the call. The engine does not pretend to collapse a bimodal distribution into a point.
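A worked example of the valley, using the sample probabilities. The two projected values (80 for the bellcow story, 30 for the cliff story) are made-up illustration numbers, not engine output, and the type and function names are ours.

```typescript
interface OutcomeStory {
  storyLabel: string;
  probability: number;    // calibrated probability of this story
  projectedValue: number; // hypothetical value if this story plays out
}

// Probability-weighted mean of the stories: the single-point estimate.
function mixtureMean(stories: OutcomeStory[]): number {
  const total = stories.reduce((sum, s) => sum + s.probability, 0);
  if (Math.abs(total - 1) > 1e-9) {
    throw new Error("story probabilities must sum to 1");
  }
  return stories.reduce((sum, s) => sum + s.probability * s.projectedValue, 0);
}

// How far the single-point estimate sits from the closest story.
function distanceToNearestStory(stories: OutcomeStory[]): number {
  const mean = mixtureMean(stories);
  return Math.min(...stories.map((s) => Math.abs(s.projectedValue - mean)));
}

const sampleRb: OutcomeStory[] = [
  { storyLabel: "Continued bellcow", probability: 0.42, projectedValue: 80 },
  { storyLabel: "Cliff inside one offseason", probability: 0.58, projectedValue: 30 },
];
```

Under these made-up values the mean is 0.42 × 80 + 0.58 × 30, roughly 51, which is about 21 points away from either story. That is the valley: a number neither outcome would actually produce.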

Receipt 4 · opponent fingerprints (only we publish this)

Reading counterparty behavior from pick-trade history.

Most dynasty tools look at a manager's roster shape. We look at how they actually behave. The engine classifies each opponent's pick-trade history into one of four signature buckets (flipper, hoarder, seller, quiet) and feeds the framing to Coach so trade proposals match what the opponent actually does.

Opponent fingerprint · sample

Sample Manager

classified · Pick flipper

illustration · sample data

▲ 17 sent · ▼ 11 received · net -6 · volume 28

What we read

High two-way pick volume across 10 weeks. Sends and receives almost every week. Treats picks as currency rather than stockpiling them.

How to play them

Lead with a pick swap when proposing a trade. Player-for-player offers from your side will sit. Asymmetric offers (your pick + your player for their pick + their player) tend to get accepted when the volume matches their pattern.

Four signature buckets the engine assigns: pick flipper (high two-way volume, net flat), pick hoarder (net pick receiver), pick seller (net pick sender), pick quiet (low volume, no strong fingerprint). Coach references the signature when proposing trades with that opponent so the offer shape matches what they actually do, not just what their roster looks like.
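The four buckets can be sketched as a small classifier. The cutoffs below (5 picks for "quiet", a net within 25% of volume for "flat") are hypothetical values chosen so the sample manager lands in the flipper bucket; the engine's real thresholds are not stated on this page.

```typescript
type PickSignature = "pick flipper" | "pick hoarder" | "pick seller" | "pick quiet";

interface PickTradeCounts {
  sent: number;     // picks traded away
  received: number; // picks acquired
}

// Hypothetical cutoffs, for illustration only.
const QUIET_MAX_VOLUME = 5;  // at or below this many picks moved: no fingerprint
const FLAT_NET_SHARE = 0.25; // |net| / volume at or below this: "net flat"

function classifyPickSignature(counts: PickTradeCounts): PickSignature {
  const volume = counts.sent + counts.received;
  const net = counts.received - counts.sent; // negative means net sender
  if (volume <= QUIET_MAX_VOLUME) return "pick quiet";
  if (Math.abs(net) / volume <= FLAT_NET_SHARE) return "pick flipper";
  return net > 0 ? "pick hoarder" : "pick seller";
}
```

For the sample manager (17 sent, 11 received) the volume is 28 and the net is -6, about 21% of volume, so this sketch returns "pick flipper", matching the classification shown above.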

Receipt 5 · doctrine readout

From dial fingerprint to algorithm to Coach output.

Move dials in the Rankings Lab; a doctrine line falls out the back. The line ("Aggressive Rebuilder · Skeptic · Decisive") plus the dial-derived coefficient vector is what Coach references when it shapes a recommendation. Three preset doctrines appear side by side to show how the math changes.

Doctrine readout · sample

Three preset doctrines from the Rankings Lab. The dial fingerprint + the synthesized doctrine line + the algorithm it implies, side by side. Tune your own at /rankings; the readout updates as you move dials.

Aggressive Rebuilder

Max future. Trade win-now for picks. Rookies and 2026-2028 upside, all of it.

Aggressive Rebuilder · Skeptic · Decisive

7 of 8 dials moved

Dial fingerprint

Dial	Setting
Youth weight	+80
Bellcow preference	-20
Coaching continuity	0 (not moved)
Horizon	+95
Rookie tilt	+90
Risk tolerance	+60
Trade aggression	+60
Consensus lean	-30

DG(p) = market + 60 · (0.80·Y - 0.20·B + 0.00·C)

Patient Contender

Compete this season without torching future. Iceman default.

Contender · Balanced · Balanced

8 of 8 dials moved

Dial fingerprint

Dial	Setting
Youth weight	-10
Bellcow preference	+30
Coaching continuity	+30
Horizon	-30
Rookie tilt	-10
Risk tolerance	-10
Trade aggression	+20
Consensus lean	+10

DG(p) = market + 60 · (-0.10·Y + 0.30·B + 0.30·C)

Win-Now Maxer

All chips in. Vets only, prime production, future is a rounding error.

Aggressive Win-Now · Soft Market · Decisive

8 of 8 dials moved

Dial fingerprint

Dial	Setting
Youth weight	-60
Bellcow preference	+80
Coaching continuity	+60
Horizon	-95
Rookie tilt	-80
Risk tolerance	+40
Trade aggression	+80
Consensus lean	+30

DG(p) = market + 60 · (-0.60·Y + 0.80·B + 0.60·C)

Same engine, three different doctrines. The dial fingerprint produces a doctrine line ("Aggressive Rebuilder · Skeptic · Decisive") and a coefficient vector, and the coefficient vector produces a ranking. Coach sees the doctrine line on every request and references it explicitly when the user has tuned dials off default.
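The three formulas share one shape, sketched below. We read each coefficient as the matching dial divided by 100 (this matches all three presets) and Y, B, C as per-player youth, bellcow, and coaching-continuity signals. The signal ranges, the market value of 1000, and every name in the code are assumptions for illustration; only the constant 60 and the coefficients come from the readouts.

```typescript
interface DoctrineDials {
  youthWeight: number;        // -100 .. +100
  bellcowPreference: number;  // -100 .. +100
  coachingContinuity: number; // -100 .. +100
}

interface PlayerSignals {
  market: number;     // market value before the doctrine adjustment
  youth: number;      // Y, assumed to lie in -1 .. +1
  bellcow: number;    // B, assumed to lie in -1 .. +1
  continuity: number; // C, assumed to lie in -1 .. +1
}

const DOCTRINE_SCALE = 60; // the constant multiplier in the published formulas

// DG(p) = market + 60 * (cY*Y + cB*B + cC*C), with each c = dial / 100.
function doctrineValue(player: PlayerSignals, dials: DoctrineDials): number {
  const cY = dials.youthWeight / 100;
  const cB = dials.bellcowPreference / 100;
  const cC = dials.coachingContinuity / 100;
  const tilt = cY * player.youth + cB * player.bellcow + cC * player.continuity;
  return player.market + DOCTRINE_SCALE * tilt;
}

const aggressiveRebuilder: DoctrineDials = { youthWeight: 80, bellcowPreference: -20, coachingContinuity: 0 };
const winNowMaxer: DoctrineDials = { youthWeight: -60, bellcowPreference: 80, coachingContinuity: 60 };

// Hypothetical young back in a committee: high youth, moderate bellcow signal.
const youngBack: PlayerSignals = { market: 1000, youth: 1, bellcow: 0.5, continuity: 0 };
```

For this hypothetical player the Aggressive Rebuilder adjustment is 60 · (0.80 - 0.10) = +42 and the Win-Now Maxer adjustment is 60 · (-0.60 + 0.40) = -12, so the same player moves in opposite directions under the two doctrines.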

Receipt 6 · build fit

Eleven builds, multi-attribute roster classification.

A roster is not one archetype. It fits a vector across eleven builds, each with its own threshold calibrated against the cohort. Builds overlap by design; the composite builds (Sustained Contender, Zero-RB) are derived from the constituent base-build fits. Venn shows three builds for legibility.

Build fit · sample

One roster scored across three of the eleven builds. Builds overlap by design: a roster that fits Win-Now Floor AND RB Bellcow AND WR Anchor also fits the composite Sustained Contender build.

Win-Now Floor

Fits when the top-K contributors' value sum exceeds the cohort threshold for proven win-now production. Calibrated from the 82-roster cohort.

RB Bellcow

Fits when the roster's RB room scores above the workhorse threshold. Specific to RB and weighted by role tier when the signal table fills out.

WR Anchor

Fits when the roster's top WR clears the anchor threshold (a high-value WR, 70+, with surrounding production).

Composite · Sustained Contender

Derived from constituent builds. A roster that fits all three base builds above mathematically fits Sustained Contender. Architecturally distinct from a single-axis archetype assignment.

The Venn shows three builds for legibility; the full model scores all eleven (Win-Now Floor, Balanced, Future Stock, RB Bellcow, WR Anchor, WR Stable, QB Stable, TE-Premium Lock, Trade Capital, Sustained Contender, Zero-RB). A roster's full identity is a vector across all eleven builds, not a single archetype label.
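The composite rule for Sustained Contender, as described above, can be sketched as a check over the three base builds. Names are illustrative, and the sketch only encodes the stated direction (fitting all three base builds implies fitting the composite); whether the engine allows other routes into the composite is not stated here.

```typescript
type BaseBuildName = "Win-Now Floor" | "RB Bellcow" | "WR Anchor";

interface BuildScore {
  score: number;        // the roster's score on this build
  fitThreshold: number; // cohort-calibrated FIT threshold for this build
}

// Assumption: a score equal to the threshold counts as a fit.
function clearsFit(build: BuildScore): boolean {
  return build.score >= build.fitThreshold;
}

// Sustained Contender is derived from the base builds, not scored on its own.
function fitsSustainedContender(base: Record<BaseBuildName, BuildScore>): boolean {
  return (
    clearsFit(base["Win-Now Floor"]) &&
    clearsFit(base["RB Bellcow"]) &&
    clearsFit(base["WR Anchor"])
  );
}
```

A roster that clears two of the three base thresholds keeps those two fits in its identity vector but does not pick up the composite.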

Receipt 7 · age curves

Peak bands refit when the data says so.

Four position-specific curves derived from NFL starter production 2022 to 2025. Refits are dated and traceable. Most products use "RB cliff at 30" as a rule of thumb. We use a band that we update when the cohort moves.

Age curves, per position

Peak bands derived from 2022 to 2025 NFL starter production cohorts. The shaded region is the calibrated peak; the line is the engine's age multiplier from 20 to 36.

peak 23 to 26

peak 24 to 29

TE · peak 26 to 30

Narrowed 2026-05-12. Age 25 produced 0.78 of peak in starter cohort, not 1.0.

QB · peak 23 to 33

Widened lower bound to 23 on 2026-05-12. Year-1-3 starters (the Daniels, Stroud, Caleb Williams shape) post peak-band production.

Curves are refit when validation surfaces show the band edges no longer match data. Two refits on 2026-05-12: QB peak widened lower to 23 because year-1-3 starters now post peak-band production; TE narrowed to 26-30 because age 25 produces 0.78 of peak rather than 1.0.
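One way to picture a peak band as an age multiplier is the piecewise sketch below: 1.0 inside the band, falling off linearly on either side. The linear shape and both slopes are assumptions for illustration; the definitions in lanes.ts may use a different curve. The TE ramp of 0.22 per year is chosen only so that age 25 reproduces the 0.78 figure quoted above.

```typescript
interface PeakBand {
  start: number; // first age inside the peak band
  end: number;   // last age inside the peak band
}

interface CurveSlopes {
  rampPerYear: number;  // multiplier lost per year below the band (hypothetical)
  decayPerYear: number; // multiplier lost per year above the band (hypothetical)
}

// 1.0 inside the band, linear falloff outside it, floored at 0.
function ageMultiplier(age: number, band: PeakBand, slopes: CurveSlopes): number {
  if (age < band.start) {
    return Math.max(0, 1 - slopes.rampPerYear * (band.start - age));
  }
  if (age > band.end) {
    return Math.max(0, 1 - slopes.decayPerYear * (age - band.end));
  }
  return 1;
}

const tePeak: PeakBand = { start: 26, end: 30 };
const teSlopes: CurveSlopes = { rampPerYear: 0.22, decayPerYear: 0.1 };
```

With these values a 25-year-old TE gets a multiplier of about 0.78 and a 28-year-old gets 1.0. Under the pre-refit band, age 25 would have been treated as full peak, which is the mismatch the narrowing corrects.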

Receipt 8 · LLM context

Coach reads the full named roster.

Every Coach request ships the user's named players (name, position, age, team, KTC value), format rules (starter slots, K and DST presence, superflex, TE-premium), and a pricing block with KTC-anchored pick values. No "you need a TE" when Kincaid is on roster. Three-layer fix for every hallucination class: explicit context field, hard system-prompt rule referencing it, pre-LLM guard.

What other tools send their LLM

Available pool: 247 players
Your roster: 23 players
Question: should I trade for Bijan?

(model has to guess at format,
guess at what's on the roster,
guess at the trade math)

What Coach receives per request

format_rules:
  is_superflex: true
  qb_starters_max: 2
  te_premium: true
  has_k: false
  has_dst: false

your_roster:
  Patrick Mahomes (QB, age 30, KC, KTC 85)
  Jahmyr Gibbs (RB, age 24, DET, KTC 100)
  Dalton Kincaid (TE, age 26, BUF, KTC 78)
  [... 20 more]

pricing:
  pick_values: { "2026.1.05": 92, ... }
  player_values_present: true

question: should I trade for Bijan Robinson?
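The payload above maps naturally onto a typed context object plus a guard that runs before the model is called. The sketch below is a guess at that shape: the field names follow the payload, but the interface names, `rosterGuardFacts`, and the wording of the generated rules are hypothetical, and the real guard may work differently.

```typescript
type RosterPosition = "QB" | "RB" | "WR" | "TE" | "K" | "DST";

interface RosterPlayer {
  playerName: string;
  position: RosterPosition;
  age: number;
  team: string;
  ktcValue: number;
}

interface FormatRules {
  is_superflex: boolean;
  qb_starters_max: number;
  te_premium: boolean;
  has_k: boolean;
  has_dst: boolean;
}

interface CoachContext {
  format_rules: FormatRules;
  your_roster: RosterPlayer[];
  question: string;
}

// Pre-LLM guard (hypothetical): turn roster and format facts into explicit
// rules, so the prompt never leaves them for the model to guess.
function rosterGuardFacts(ctx: CoachContext): string[] {
  const facts: string[] = [];
  const skillPositions: RosterPosition[] = ["QB", "RB", "WR", "TE"];
  skillPositions.forEach((pos) => {
    const held = ctx.your_roster.filter((p) => p.position === pos);
    if (held.length > 0) {
      const names = held.map((p) => p.playerName).join(", ");
      facts.push("Roster already has " + pos + ": " + names + ". Do not say the user lacks one.");
    }
  });
  if (!ctx.format_rules.has_k) {
    facts.push("League has no K slot. Do not recommend kickers.");
  }
  if (!ctx.format_rules.has_dst) {
    facts.push("League has no DST slot. Do not recommend defenses.");
  }
  return facts;
}
```

In this sketch the returned lines would be appended to the system prompt, which is one way to get the behavior described above: with a tight end on the roster, the model is told so explicitly instead of being left to infer it.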

Receipt 9 · engineering

71 evals + 12 anti-pattern lint rules.

Every fix ships with a test. Pre-deploy audits cover security, cost, legal, trade realism, and LLM context hallucinations. The lint rules are what catch a class of bug before it ships rather than after a user finds it.

Eval suite (71 tests)

· Lane identity calibration regressions
· Format-aware starter math (1QB / SF / 2QB / flex)
· Survival pct + availability bucket consistency
· Anti-pattern lint (hardcoded thresholds, raw hard.QB)
· Em-dash check (never ships)
· Rerank cascade ordering
· Snake-pick math + traded-pick resolution

Pre-deploy audits

· Security: OWASP + SSRF + Anthropic-key leakage
· Cost: per-endpoint LLM budget worst-case
· Legal: privacy, ToS, GDPR/CCPA
· Trade realism: KTC pricing within ±15%
· Context doctor: LLM hallucination triage
· Microcopy: brand voice consistency
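Two of the checks listed above are simple enough to sketch. The function names are illustrative, and reading "within ±15%" as relative deviation from the KTC price is our interpretation, not a documented rule.

```typescript
// Em-dash lint: true if the copy contains U+2014 anywhere.
function containsEmDash(copy: string): boolean {
  return copy.indexOf("\u2014") !== -1;
}

// Trade-realism check (assumed reading): the proposed price must sit within
// `tolerance` (default 15%) of the KTC price, measured relative to KTC.
function withinKtcTolerance(proposedValue: number, ktcValue: number, tolerance: number = 0.15): boolean {
  if (ktcValue <= 0) {
    throw new Error("KTC value must be positive");
  }
  return Math.abs(proposedValue - ktcValue) / ktcValue <= tolerance;
}
```

Under this reading, pricing a KTC-85 player at 92 passes (about 8% off) and pricing him at 100 fails (about 18% off).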

What we do not claim

Limits, named.

Four seasons of NFL data, not thirty. The age-curve calibration cohort spans 2022 to 2025. Older seasons exist; we are not yet using them.
82 rosters, not 8,200. The cohort is dynasty and keeper leagues we have explicit access to. Larger cohort means tighter percentile estimates; we are adding leagues over time.
OL transitions and OC tenure not fully ingested. The team_signals table that powers Coaching Continuity is mid-calibration. The dial reads as neutral until the table is complete.
FantasyPros beats us on 3-year. Their expert blend has access to in-season ADP revisions; our preseason model does not. Closing the gap is the next calibration target.
No future prediction. The model reads what your opponents have done, what the cohort looks like today, and how a player fits an identity vector. It does not predict trades that have not happened or game-script that is not real yet.

Source CSV for the backtest: data/scoreboard/scoreboard_v1.csv. Cohort statistics: src/lib/strategy/lane-identity/cohort-stats.ts. Age-curve definitions: src/lib/strategy/lane-identity/lanes.ts. Every number on this page is reproducible from the repository.