Predicting NBA MVP
Published:
By Max Rotblut
Predicting the MVP of a season using Machine Learning (Sklearn) with data from 2019-20 season to 2023-24 season for all players and data from the 1955-56 season for MVP’s
I did this using Machine learning, the model I created first applies a Standard Scalar transformation then a Linear Regression.
A standard scaler centers the data around 0 then makes the variance of the data 1, allowing the machine to easily understand the data is comparison to other columns. The following transformation is applied:
\[x_{scaled} = X - \frac{\mu}{\sigma}\]After the data is made uniform, a linear regression is applied, this assumes there is a relationship between the data and finds the relationship using ordinary least squares linear regression. Sklearn forms the following formula:
\[y = Xw + b\]Where
$y$ = the dependant variable (Whether or not a player is MVP)
$X$ = the matrix of input features
$w$ = the vector of coefficients
$b$ = the intercept term
This formula can then be used on new data to predict the dependant variable.
The model is 76% correct at predicting past MVP’s, and can be used to get an idea on who will be MVP for a season. It can’t be perfect because MVP is the opinion of a panel of sportswriters and broadcasters throughout the United States and Canada, therefore the model is trying to predict human decisions based on opinion without any information about the people.