Why We're Sharing This
Most betting analytics platforms treat their models as black boxes. They show you projections but never explain how those numbers were generated.
We think that's backwards. If you're trusting our projections to inform your betting decisions, you deserve to know:
- What data we use
- How we process it
- Why certain players get certain projections
- Where our models perform well (and where they struggle)
This isn't marketing speak. This is the technical reality of how THE LINEUP works.
The Core Problem We're Solving
Player performance prediction is hard because:
- High variance: Even elite players have bad games
- Context matters: Matchups, pace, home/away, rest days all affect output
- Line movements: Props move based on information we may not have
- Injuries: Missing teammates change usage patterns
Books set lines based on these factors. To find positive expected value (+EV), we need to model them better than the books do.
Our Approach: Stat-Specific Tiered Models
The key insight: A player's tier for points is different from their tier for rebounds.
LeBron is an elite scorer AND an elite rebounder. But a defensive center averaging 8 rebounds might only score 6 points. Using minutes played to group players fails because it treats all stats the same.
Our solution: Separate tier classifications for each stat.
How Tiers Are Assigned
| Stat | Elite Threshold | Solid Threshold | What Elite Means |
|------|-----------------|-----------------|------------------|
| Points | 22+ PPG | 12+ PPG | Primary scorers |
| Rebounds | 8+ RPG | 5+ RPG | Dominant rebounders |
| Assists | 6+ APG | 3+ APG | Primary playmakers |
| Steals | 1.3+ SPG | 0.8+ SPG | Perimeter disruptors |
| Blocks | 1.2+ BPG | 0.6+ BPG | Rim protectors |
| 3PM | 2.5+ per game | 1.2+ per game | Volume shooters |
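In code, the table above boils down to a per-stat lookup. This is a minimal sketch, not our production code; the function and constant names are illustrative.

```python
# Thresholds from the table above: (elite_cutoff, solid_cutoff) per stat.
# These are the published cutoffs; everything else here is illustrative.
THRESHOLDS = {
    "points":   (22.0, 12.0),
    "rebounds": (8.0, 5.0),
    "assists":  (6.0, 3.0),
    "steals":   (1.3, 0.8),
    "blocks":   (1.2, 0.6),
    "threes":   (2.5, 1.2),
}

def classify_tier(stat: str, season_avg: float) -> str:
    """Return 'elite', 'solid', or 'base' for one stat, independently
    of how the player tiers out on any other stat."""
    elite, solid = THRESHOLDS[stat]
    if season_avg >= elite:
        return "elite"
    if season_avg >= solid:
        return "solid"
    return "base"
```

The point of stat-specific tiering is visible immediately: a 26 PPG / 4 RPG guard is `elite` for points but `base` for rebounds, so each stat routes to a different model.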
Why This Matters
The old approach under-predicted elite scorers by 25-30% because they got grouped with 15 PPG players in the same "stars" tier. Now, Shai Gilgeous-Alexander gets predicted using a model trained only on 22+ PPG scorers.
The 104 Features We Use
Every prediction uses 104 carefully selected features:
Recent Performance (Rolling Windows)
- Last 3 games average (L3)
- Last 5 games average (L5)
- Last 10 games average (L10)
- Season average
- Variance and consistency metrics
- Hot/cold form indicators
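A hedged sketch of how rolling-window features like these can be computed from a game log. The function name, the window choices shown, and the hot/cold definition (recent form minus season baseline) are illustrative assumptions, not our exact production features.

```python
from statistics import mean, pstdev

def rolling_features(game_log: list[float]) -> dict[str, float]:
    """Rolling-window features from a most-recent-first game log.
    Illustrative only; the real feature set is much larger."""
    feats = {
        "l3": mean(game_log[:3]),       # last 3 games
        "l5": mean(game_log[:5]),       # last 5 games
        "l10": mean(game_log[:10]),     # last 10 games
        "season": mean(game_log),
        "stdev": pstdev(game_log),      # consistency metric
    }
    # One possible hot/cold indicator: recent form vs. season baseline
    feats["hot"] = feats["l5"] - feats["season"]
    return feats

log = [31, 28, 25, 27, 29, 22, 26, 24, 28, 25, 23, 27]
f = rolling_features(log)
```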
Contextual Factors
- Opponent defensive rating vs. position
- Pace of play (possessions per game)
- Home/away splits
- Rest days since last game
- Back-to-back indicator
- Minutes analysis
Injury-Aware Features (v3.8)
- Teammate injury impact on usage
- Recovery trajectory tracking
- Minutes restriction indicators
- Lineup changes due to injuries
Line Movement Features (v3.8)
- Opening vs. current line difference
- Movement direction and magnitude
- Sharp money indicators
- Consensus vs. outlier lines
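Three of these line-movement features are simple arithmetic on the lines themselves. The sketch below shows one plausible way to compute them; the function name and feature names are assumptions for illustration (sharp-money detection is omitted because it depends on data we haven't described here).

```python
def line_movement_features(opening: float, current: float,
                           book_lines: list[float]) -> dict[str, float]:
    """Illustrative line-movement features for one prop."""
    move = current - opening
    consensus = sum(book_lines) / len(book_lines)
    return {
        "line_move": move,                    # signed magnitude of movement
        "move_dir": (move > 0) - (move < 0),  # -1 down, 0 flat, +1 up
        "vs_consensus": current - consensus,  # outlier vs. market average
    }

feats = line_movement_features(opening=25.5, current=26.5,
                               book_lines=[26.5, 26.0, 25.5])
```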
How a Projection Gets Made
Here's the actual flow for generating a projection:
Step 1: Feature Assembly
Player: Jayson Tatum
Game: vs. Lakers (away)
Features: L3=28.2, L5=27.8, L10=26.4, season=26.1
Context: LAL allows 24.3 PPG to SFs, pace=100.2, rest=1 day
Step 2: Tier Classification
PTS tier: Elite (26.1 > 22.0 threshold)
REB tier: Elite (8.5 > 8.0 threshold)
AST tier: Solid (4.8 > 3.0 but < 6.0)
Step 3: Model Selection
Points: Load pts_elite_v3.8_production.pkl
Rebounds: Load reb_elite_v3.8_production.pkl
Assists: Load ast_solid_v3.8_production.pkl
Step 4: Prediction Generation
Points projection: 25.8
Rebounds projection: 8.2
Assists projection: 4.6
Step 5: Confidence Interval
PTS range: 19.2 - 32.4 (model uncertainty + variance)
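One common way to build an interval like the Step 5 range is to combine model uncertainty with the player's own game-to-game variance in quadrature. This is a sketch of that idea, not our exact method; the two sigma values and the z multiplier below are made-up numbers chosen for illustration.

```python
from math import sqrt

def projection_interval(point_est: float, model_sigma: float,
                        player_sigma: float,
                        z: float = 1.28) -> tuple[float, float]:
    """Combine two independent uncertainty sources (added in
    quadrature) into a single range around the point estimate.
    z=1.28 (~80% normal coverage) is an assumption."""
    sigma = sqrt(model_sigma**2 + player_sigma**2)
    return (round(point_est - z * sigma, 1),
            round(point_est + z * sigma, 1))

lo, hi = projection_interval(25.8, model_sigma=2.0, player_sigma=4.75)
```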
The Algorithm: LightGBM
We use LightGBM (Light Gradient Boosting Machine), not because it's trendy, but because:
- Handles mixed feature types well - Numeric stats + categorical context
- Fast training and inference - We retrain daily, need speed
- Robust to overfitting - Regularization built-in
- Interpretable - We can see which features matter most
We evaluated random forests, XGBoost, and neural networks. LightGBM consistently performed best on our validation sets while being 3-5x faster to train.
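For readers unfamiliar with LightGBM, a regression setup of this kind typically looks like the parameter dictionary below. These are standard LightGBM parameter names, but the values are illustrative; our actual production hyperparameters are not published here.

```python
# Illustrative LightGBM configuration for a player-stat regression.
# Values are examples, not our tuned production settings.
params = {
    "objective": "regression",
    "metric": "mae",              # optimize for MAE, our headline metric
    "learning_rate": 0.05,
    "num_leaves": 31,
    "feature_fraction": 0.8,      # per-tree feature subsampling
    "lambda_l1": 0.1,             # built-in L1 regularization
    "lambda_l2": 0.1,             # built-in L2 regularization
}
# model = lightgbm.train(params, train_set, valid_sets=[val_set])
```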
Where We Struggle (Honest Assessment)
No model is perfect. Here's where ours has known weaknesses:
1. Blowouts
- If a game becomes lopsided, starters sit in the 4th quarter
- We don't predict game flow (yet), so our minutes assumptions can be wrong
2. Breaking News
- Trade deadline, unexpected injuries, load management
- Our features lag by at least one game
3. Low-Volume Stats
- Steals and blocks are highly variable game-to-game
- Even 60-70% hit rates mean 30-40% misses
4. First Games Back
- Players returning from injury have unpredictable minutes
- We weight recent form heavily, and pre-injury form may be stale
How We Measure Accuracy
We track our accuracy publicly on a live dashboard. Key metrics:
- MAE (Mean Absolute Error): Average prediction miss in raw stat units
- Hit Rate: Percentage of over/under calls that were correct
- Calibration: How well our confidence ranges match reality
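The first two metrics are simple enough to define precisely in a few lines. This sketch uses hypothetical function names; the grading logic for a hit is "our projection and the actual result landed on the same side of the line."

```python
def mae(preds: list[float], actuals: list[float]) -> float:
    """Mean Absolute Error: average miss in raw stat units."""
    return sum(abs(p - a) for p, a in zip(preds, actuals)) / len(preds)

def hit_rate(preds: list[float], lines: list[float],
             actuals: list[float]) -> float:
    """Fraction of over/under calls that were correct: the projection
    and the actual result fell on the same side of the prop line."""
    hits = sum((p > l) == (a > l)
               for p, l, a in zip(preds, lines, actuals))
    return hits / len(preds)

preds   = [25.8, 8.2, 4.6]
lines   = [24.5, 8.5, 4.5]
actuals = [28.0, 9.0, 3.0]
```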
We don't cherry-pick results. Every projection we make gets tracked and graded automatically when games complete.
The Daily Pipeline
Every day, our system:
- 6:00 AM - Sync overnight game results
- 6:15 AM - Update player rolling averages
- 6:30 AM - Recalculate tier assignments
- 7:00 AM - Generate projections for upcoming games
- 7:30 AM - Compare projections to current prop lines
- 8:00 AM - Identify +EV opportunities
- 11:00 PM - Auto-settle yesterday's picks, calculate CLV (closing line value)
This runs automatically. No human intervention unless something breaks.
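Conceptually, the schedule above is just a time-ordered task table handed to a scheduler. This sketch is an assumption about structure, not our actual orchestration code (which would run under something like cron or a workflow engine); all names are illustrative.

```python
# Hypothetical representation of the daily pipeline as (time, task) pairs.
PIPELINE = [
    ("06:00", "sync_results"),
    ("06:15", "update_rolling_averages"),
    ("06:30", "recalculate_tiers"),
    ("07:00", "generate_projections"),
    ("07:30", "compare_to_prop_lines"),
    ("08:00", "flag_ev_opportunities"),
    ("23:00", "settle_picks_and_clv"),
]

def tasks_due(now_hhmm: str) -> list[str]:
    """Names of tasks scheduled at or before the given HH:MM time.
    Zero-padded 24-hour strings compare correctly lexicographically."""
    return [task for t, task in PIPELINE if t <= now_hhmm]
```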
What's Next: Continuous Improvement
We're continuously improving. Current experiments:
- Lineup-aware projections - Adjust for specific 5-man rotations
- Real-time game state - Factor in current score and time
- Expanded sports - NFL and NHL using similar architecture
- Opponent player props - How do defenders affect specific stats?
We validate everything before production. Most experiments fail. The ones that work become features.
Why Transparency Matters
Our Philosophy
If we can't explain why a projection is what it is, we shouldn't be confident in it. Transparency isn't just good ethics - it's good modeling practice. By showing our work, we invite scrutiny that makes us better.
Other platforms might have good models. But if they won't tell you how they work, how can you trust them when they're wrong?
We believe in showing our hit rates, explaining our methodology, and being honest about our limitations. That's what makes THE LINEUP different.
See our accuracy in action: View the public accuracy dashboard - updated daily with real results.
Ready to try it?: Get today's projections and see our ML models in action.