The Numbers
We shipped v4.0 of our projection engine last week. Here's the headline: 13.55% lower error across all stat categories, measured against our v3.8 production baseline.
That's not a cherry-picked number. It's the weighted Mean Absolute Error across every player, every stat, every game in our validation set.
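To make "weighted Mean Absolute Error" concrete, here is a minimal sketch of how a volume-weighted MAE can be aggregated across stat categories. The category names, residuals, and counts are illustrative, not our actual data:

```python
import numpy as np

def weighted_mae(errors_by_cat, counts_by_cat):
    """Aggregate per-category MAE into one number, weighted by sample volume."""
    cats = list(errors_by_cat)
    maes = np.array([np.mean(np.abs(errors_by_cat[c])) for c in cats])
    weights = np.array([counts_by_cat[c] for c in cats], dtype=float)
    return float(np.sum(maes * weights) / np.sum(weights))

# Hypothetical residuals (prediction - actual) for two categories
errors = {"points": [1.5, -2.0, 0.5], "rebounds": [0.8, -1.2, 0.4]}
counts = {"points": 3, "rebounds": 3}
print(round(weighted_mae(errors, counts), 3))  # -> 1.067
```

Weighting by volume keeps high-frequency categories like points from being drowned out by rarer ones like blocks.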
- MAE Improvement (weighted): 13.55%
- Features per prediction: 187
- Models in production: 42
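The recency effect described above is easy to demonstrate with pandas' built-in exponentially weighted mean. The span value here is illustrative; our production decay parameters are tuned per stat:

```python
import pandas as pd

# Two players with identical simple averages, opposite recency profiles
recent_hot = pd.Series([20, 22, 24, 20, 35])   # scored 35 last night
recent_cold = pd.Series([35, 20, 24, 22, 20])  # scored 35 five games ago

span = 3  # illustrative; alpha = 2 / (span + 1)
hot_ewma = recent_hot.ewm(span=span).mean().iloc[-1]
cold_ewma = recent_cold.ewm(span=span).mean().iloc[-1]

print(recent_hot.mean() == recent_cold.mean())  # True: simple averages tie
print(hot_ewma > cold_ewma)                     # True: EWMA rewards recency
```

The same five numbers, reordered, produce a meaningfully different projection input once recency is weighted.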
What Changed
Three things drove the improvement. Each one contributed independently, and they compound when combined.
1. Exponentially Weighted Moving Averages (EWMA)
The old model used simple rolling averages: last 3, 5, and 10 games, weighted equally. But a game from 10 days ago shouldn't count as much as last night's game.
EWMA solves this by giving more weight to recent games. A player who scored 35 last night and 20 five games ago gets a higher projection than one who scored 20 last night and 35 five games ago -- even though their simple averages are identical.
We added 30 new EWMA features covering every stat category, each with tuned decay parameters.
2. Huber Loss (Outlier Robustness)
Standard Mean Squared Error penalizes outliers heavily. If a player averages 25 points but drops 45 in a blowout, MSE treats that as a massive error and warps the model toward extreme predictions.
Huber loss uses squared error for small residuals but switches to linear error for large ones. Translation: the model learns from normal games without getting distracted by statistical anomalies.
Combined with adaptive outlier weighting during training, this made predictions more stable without sacrificing accuracy on high-variance players.
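The standard Huber loss is simple enough to sketch directly. This is the generic textbook form, using the same delta=1.35 cited in the architecture section below:

```python
import numpy as np

def huber(residuals, delta=1.35):
    """Huber loss: quadratic for |r| <= delta, linear beyond it."""
    r = np.abs(residuals)
    quadratic = 0.5 * r ** 2
    linear = delta * (r - 0.5 * delta)
    return np.where(r <= delta, quadratic, linear)

# A 20-point blowout residual: squared error would score it 200,
# Huber scores it ~26.1, so one anomaly can't dominate training.
print(huber(np.array([0.5, 20.0])))
```

Past the delta threshold the penalty grows linearly, so a 45-point outburst shifts the model far less than it would under MSE.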
3. Defense vs Position (DVP) Features
We already tracked opponent defensive rating, but v4.0 adds granular position-specific defense data. How many points do the Lakers allow to point guards? How many rebounds do the Celtics give up to centers?
This matters because defensive matchups are position-dependent. A team might be elite at defending guards but below average against big men. The old model treated defense as a single number. Now it's 15 features capturing how each opponent defends each position.
We also added home/away splits to these features, since some teams defend differently at home.
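A DVP feature is, at heart, a groupby over opponent game logs. This sketch uses a hypothetical schema; the column names and numbers are illustrative:

```python
import pandas as pd

# Hypothetical game-log rows: what each opponent allowed, by position
logs = pd.DataFrame({
    "opponent":       ["LAL", "LAL", "BOS", "BOS"],
    "position":       ["PG",  "C",   "PG",  "C"],
    "is_home":        [True,  False, True,  False],
    "points_allowed": [28,    18,    20,    25],
})

# DVP: average stat allowed by each opponent to each position,
# split by home/away as described above.
dvp = (logs.groupby(["opponent", "position", "is_home"])["points_allowed"]
           .mean()
           .rename("dvp_points"))
print(dvp)
```

Joining a table like this back onto each upcoming matchup turns "defense" from one number into a position- and venue-specific signal.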
Results by Category
Not all stats improved equally. The biggest gains came where it matters most.
[Chart: MAE Improvement by Stat Category]
Blocks saw the largest improvement (+16.1%). This makes sense -- blocks are highly dependent on matchup (guards don't block much against big lineups) and the DVP features capture this directly.
Rebounds improved 15.2%, driven by EWMA's ability to capture short-term trends in rebounding opportunity (e.g., when a teammate is injured and rebounding volume shifts).
Steals improved the least (+9.8%), consistent with steals being the most random NBA stat. Even with better features, there's a floor on how predictable a steal is.
The Architecture
Model Type: LightGBM (gradient boosting)
Training Data: 3 seasons of NBA player game logs (~150K rows)
Feature Count: 187 total (142 carried over from v3.8 + 30 EWMA + 15 DVP/home-away variants)
Model Structure: 42 models total -- one per stat per tier
For each of 14 stat categories:
- Elite tier model (high-volume players)
- Solid tier model (mid-volume players)
- Low tier model (low-volume / role players)
= 42 production models
Loss Function: Huber loss (delta=1.35)
Regularization: L1 + L2, min_child_samples=20, max_depth=6
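Expressed as a LightGBM parameter dict, one tier/stat model looks roughly like this. Only the hyperparameters listed above come from our config; the learning rate and regularization strengths are illustrative placeholders:

```python
# Sketch of one per-stat, per-tier model configuration.
# LightGBM exposes the Huber delta via the `alpha` parameter.
params = {
    "objective": "huber",
    "alpha": 1.35,            # Huber delta, as listed above
    "max_depth": 6,
    "min_child_samples": 20,
    "lambda_l1": 0.1,         # L1 regularization (illustrative strength)
    "lambda_l2": 0.1,         # L2 regularization (illustrative strength)
    "learning_rate": 0.05,    # illustrative
}
# A dict like this is passed to lightgbm.train() once per stat per tier,
# yielding the 42 production models.
```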
What This Means for Betting
Better projections directly translate to better prop bet identification.
When our projection for a player is 25.8 points and the book has the line at 23.5, that 2.3-point edge is more reliable with a 13.55% lower error rate. Fewer false positives mean more consistent +EV identification.
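The edge check itself is simple to sketch. The 1.5-point minimum threshold here is illustrative, not our actual recommendation rule:

```python
MIN_EDGE = 1.5  # illustrative threshold, not our production cutoff

def find_edge(projection, book_line):
    """Return 'over'/'under' when the projection diverges enough from the line."""
    edge = projection - book_line
    if edge >= MIN_EDGE:
        return "over"
    if edge <= -MIN_EDGE:
        return "under"
    return None  # edge too small to act on

print(find_edge(25.8, 23.5))  # 2.3-point edge -> "over"
```

The projection quality, not this comparison, is where the hard work lives: the threshold only pays off if the 25.8 is trustworthy.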
Since deploying v4.0:
- Prop recommendations have shown improved hit rates in early tracking
- Edge sizes are more stable game-to-game
- Fewer "surprise" misses on high-confidence picks
Bottom line: a 13.55% MAE reduction (v4.0 vs v3.8) across all 42 production models.
How We Validate
Every improvement claim is backed by rigorous validation:
- Holdout test set: Models never see the test data during training
- Walk-forward validation: We simulate real deployment conditions (train on past, predict future)
- Per-tier comparison: We check that each tier improves, not just the aggregate
- A/B logging: v4.0 ran in shadow mode alongside v3.8 for a week before promotion
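Walk-forward validation can be sketched as a generator of train/test splits that never leak future data. Window sizes here are illustrative:

```python
import pandas as pd

def walk_forward_splits(dates, train_days=60, test_days=7):
    """Yield (train_mask, test_mask) pairs: train on the past, test on
    the next window, then roll the cutoff forward."""
    dates = pd.to_datetime(pd.Series(dates))
    end = dates.max()
    cursor = dates.min() + pd.Timedelta(days=train_days)
    while cursor + pd.Timedelta(days=test_days) <= end:
        train_mask = dates < cursor
        test_mask = (dates >= cursor) & (dates < cursor + pd.Timedelta(days=test_days))
        yield train_mask, test_mask
        cursor += pd.Timedelta(days=test_days)

# Every test window sits strictly after its training window
dates = pd.date_range("2024-01-01", periods=100)
for train_mask, test_mask in walk_forward_splits(dates):
    assert dates[train_mask.values].max() < dates[test_mask.values].min()
```

Unlike a random shuffle, this mirrors deployment: the model only ever sees games that had already been played.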
We don't cherry-pick validation windows. The 13.55% number is the weighted average across all tiers and all stat categories.
What's Next
v4.0 is a foundation, not a finish line. We're currently experimenting with:
- Lineup-aware features: Adjusting projections based on confirmed starting lineups and rotation patterns
- In-game momentum: Using first-half performance to update second-half projections in real time
- Cross-stat correlations: If a player's assists are high, their points might be lower (playmaking vs scoring mode)
- Expanded validation: Tracking hit rates against actual prop lines, not just raw stat accuracy
Every experiment gets the same rigorous validation before production. Most ideas fail. The ones that work become the next version.
Why We Publish This
Transparency is our edge
Most prop betting tools show you a number and say "trust us." We show you the methodology, the validation, and the limitations. If our models are wrong, we want to know why -- and we want you to know too. That accountability makes us better.
See it in action: View today's projections powered by v4.0
Track our accuracy: Public accuracy dashboard -- updated daily with real results