From ’80 to Today: Modeling NBA Win Chances

Predicting Team Success Through Statistical Analysis

Sports Analytics · Regression Modeling · Data Visualization · Statistical Analysis

Executive Summary

This project analyzes 40 years of NBA data (1979–2019) to identify which statistical factors most influence team success, and builds a predictive model that estimates a team's wins from its performance metrics.

Key Achievement

Developed a linear regression model that predicts NBA team wins from key performance statistics. Shooting efficiency, offensive rebounds, and turnovers emerged as the most influential factors.

Years of Data: 40 (comprehensive historical analysis)

Key Variables: 9 (statistical factors in final model)

Focus Period: 1979+ (post three-point line introduction)

Key Findings

Most Influential Factors

The analysis revealed several key statistics that have the strongest impact on an NBA team's win total:

  • 2-Point Shooting Percentage — Efficiency in traditional field goals remains crucial
  • Offensive Rebounds — Second-chance opportunities significantly impact wins
  • Free Throw Percentage — Converting from the line translates directly to wins
  • Turnovers — Teams that protect the ball win more games

[Figure: Statistical importance chart. Relative importance of statistical factors in predicting wins.]

Evolution of the Game

While basketball has evolved, fundamentals like rebounds, steals, and assists have remained relatively stable. Three-point attempts and makes have increased dramatically (especially from 2011 to 2019), while two-point attempts have declined.

[Figure: Three-point evolution chart. Evolution of three-point shooting in the NBA (1979–2019).]

Methodology Overview

This project followed a structured analytical approach to develop a reliable predictive model:

  1. Data Collection — NBA team statistics from 1979–2019
  2. Data Preparation — Cleaned and standardized metrics across seasons
  3. Exploratory Analysis — Identified trends and relationships
  4. Model Development — Built and refined multiple regressions
  5. Statistical Testing — Multicollinearity and heteroskedasticity checks
  6. Model Validation — Residual diagnostics and fit

[Figure: Model accuracy chart. Normal probability plot of residuals indicating model validity.]
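
A normal probability plot pairs each sorted residual with the quantile a standard normal distribution would place at the same rank; points that fall close to a straight line suggest approximately normal residuals. The sketch below computes those pairs using only the Python standard library. The residual values are invented for illustration, and the (i - 0.5)/n plotting positions are one common convention, not necessarily the one used in this project.

```python
from statistics import NormalDist


def normal_probability_points(residuals):
    """Pair each sorted residual with its theoretical standard-normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting positions: (i - 0.5) / n for ranks i = 1..n.
    quantiles = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))


def pearson_r(pairs):
    """Pearson correlation of the plotted pairs, a rough straightness measure."""
    xs, ys = zip(*pairs)
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical residuals in wins, invented for illustration only.
example_residuals = [-4.1, 2.3, 0.5, -1.2, 3.8, -0.7, 1.1, -2.6, 0.2, 0.9]
points = normal_probability_points(example_residuals)
straightness = pearson_r(points)
```

A `straightness` value close to 1 corresponds to points hugging the reference line; it is a quick numeric companion to the plot, not a formal normality test.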

Model Refinement Process

We removed collinear variables (e.g., raw points), which reduced the mean VIF to 1.98. The final model showed no evidence of heteroskedasticity: with p = 0.5167, the null hypothesis of constant residual variance is not rejected.

[Table: VIF analysis. Variance Inflation Factor (VIF) analysis showing low multicollinearity.]
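
For reference, the VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors. The sketch below implements that definition in plain Python to show why a near-redundant variable inflates the values. It is not the project's code: the eight rows of data are invented, and the `pts` column is deliberately constructed as an almost exact linear combination of the other columns to mimic a collinear variable.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def r_squared(y, predictors):
    """R^2 of an OLS fit of y on the predictors (centering stands in for the intercept)."""
    n = len(y)

    def centered(values):
        mean = sum(values) / n
        return [v - mean for v in values]

    yc = centered(y)
    cols = [centered(p) for p in predictors]
    k = len(cols)
    xtx = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(a * b for a, b in zip(cols[i], yc)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(beta[j] * cols[j][i] for j in range(k)) for i in range(n)]
    ss_res = sum((a - f) ** 2 for a, f in zip(yc, fitted))
    ss_tot = sum(a * a for a in yc)
    return 1.0 - ss_res / ss_tot


def variance_inflation_factors(columns):
    """VIF_j = 1 / (1 - R_j^2), regressing each column on all the others."""
    vifs = {}
    for name, values in columns.items():
        others = [v for other, v in columns.items() if other != name]
        vifs[name] = 1.0 / (1.0 - r_squared(values, others))
    return vifs


# Invented per-game figures for eight hypothetical teams.
fg2 = [0.47, 0.49, 0.50, 0.52, 0.48, 0.51, 0.53, 0.46]
orb = [10.2, 11.5, 9.8, 12.1, 10.9, 9.5, 11.0, 12.4]
tov = [14.8, 13.9, 15.2, 13.1, 14.3, 15.6, 12.8, 14.0]
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05]
# Assumption for illustration: pts is almost a linear combination of the rest.
pts = [100 + 200 * f + 0.8 * o - 0.5 * t + e
       for f, o, t, e in zip(fg2, orb, tov, noise)]

with_points = variance_inflation_factors(
    {"fg2": fg2, "orb": orb, "tov": tov, "pts": pts})
without_points = variance_inflation_factors(
    {"fg2": fg2, "orb": orb, "tov": tov})
mean_with = sum(with_points.values()) / len(with_points)
mean_without = sum(without_points.values()) / len(without_points)
```

On this toy data, `mean_with` is far larger than `mean_without`, which is the pattern that motivates dropping a redundant variable.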

Final Prediction Equation

ŵ = -169.712 + 260.7082·2P% + 48.4078·3P% + 34.8791·FT% + 2.0234·ORB + 0.6295·DRB + 0.3581·AST + 2.0269·STL + 2.7048·BLK - 2.6861·TOV
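
To make the equation concrete, the sketch below evaluates it for a single team. The coefficients are copied from the equation above; everything else is an assumption. The input units are not stated here, so the example assumes shooting percentages are entered as fractions (0.50 rather than 50) and counting statistics as per-game averages, and the team's stat line is invented.

```python
INTERCEPT = -169.712
COEFFICIENTS = {
    "2P%": 260.7082,
    "3P%": 48.4078,
    "FT%": 34.8791,
    "ORB": 2.0234,
    "DRB": 0.6295,
    "AST": 0.3581,
    "STL": 2.0269,
    "BLK": 2.7048,
    "TOV": -2.6861,
}


def predict_wins(stats):
    """Apply the fitted linear equation to one team's stat line."""
    return INTERCEPT + sum(coef * stats[name] for name, coef in COEFFICIENTS.items())


# Hypothetical team: fractions for percentages, per-game counts (assumed units).
hypothetical_team = {
    "2P%": 0.50, "3P%": 0.35, "FT%": 0.75,
    "ORB": 11.0, "DRB": 31.0, "AST": 23.0,
    "STL": 8.0, "BLK": 5.0, "TOV": 15.0,
}

baseline = predict_wins(hypothetical_team)
# Same team with one fewer turnover per game.
fewer_turnovers = predict_wins({**hypothetical_team, "TOV": 14.0})
```

With these invented inputs the equation gives roughly 43.2 wins, and trimming one turnover per game raises the estimate by 2.6861, the magnitude of the TOV coefficient. Because the model is linear, each coefficient is the predicted change in wins per one-unit change in that statistic with the others held fixed.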

Business Applications

Team Management

Evaluate roster changes and prioritize the stats that most move wins.

Sports Betting

Leverage early-season stats to form more accurate performance priors.

Coaching Strategy

Emphasize shooting efficiency, offensive rebounding, and ball security.

Strategic Insight

Prioritize efficiency over volume, hunt second-chance points, and reduce turnovers.

Conclusion & Future Work

The model highlights the statistical levers that most reliably translate to wins, offering a data-driven lens for decisions.