Absolute Locks
March Madness AI Brackets
2024 AI Model - an ensemble model combining a Random Forest and a Logistic Regression model. The ensemble was trained on every NCAA tournament game from 2007-2023 (>1000 games). The Logistic Regression model has an accuracy of 75% and is used to predict games with low seed variance (e.g. [8 vs 9] and [7 vs 10]), where the Random Forest does not perform as well. The Random Forest model picks at 70% accuracy overall and is better at identifying upset candidates in the [6 vs 11] to [4 vs 13] range.
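The split described above, Logistic Regression for close seeds and Random Forest elsewhere, can be sketched as a simple router. This is a minimal illustration, not the project's actual code: the function name `ensemble_predict`, the cutoff `LOW_VARIANCE_MAX_GAP = 3` (chosen only because it covers the [8 vs 9] and [7 vs 10] examples), and the stand-in models are all assumptions.

```python
from typing import Callable, Dict

# Hypothetical cutoff: the text only cites [8 vs 9] and [7 vs 10] as
# "low seed variance" examples, so a seed gap of 3 or less is an assumption.
LOW_VARIANCE_MAX_GAP = 3

# A predictor takes a feature dict and returns P(team 1 wins).
Predictor = Callable[[Dict[str, float]], float]


def ensemble_predict(
    features: Dict[str, float],
    logistic_model: Predictor,
    forest_model: Predictor,
    max_gap: int = LOW_VARIANCE_MAX_GAP,
) -> float:
    """Route close-seed games to the logistic model, all others to the forest."""
    gap = abs(features["seed_diff"])
    model = logistic_model if gap <= max_gap else forest_model
    return model(features)


# Stand-in models with fixed outputs so the sketch runs without trained estimators.
def fake_logistic(features: Dict[str, float]) -> float:
    return 0.55


def fake_forest(features: Dict[str, float]) -> float:
    return 0.30


close_game = {"seed_diff": -1.0}   # e.g. 8 vs 9
upset_range = {"seed_diff": -5.0}  # e.g. 6 vs 11

print(ensemble_predict(close_game, fake_logistic, fake_forest))
print(ensemble_predict(upset_range, fake_logistic, fake_forest))
```

With trained estimators, each stand-in would be replaced by a thin wrapper around the model's probability output. How the real ensemble combines the two models may differ from this hard switch.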
Using the 2006 NCAA Tournament, which falls outside the 2007-2023 training range, as a test, the ensemble method correctly picked 25/32 games in the Round of 64 (78%), 11/16 in the Round of 32 (69%), 7/8 in the Sweet Sixteen (88%), 2/4 in the Elite Eight (50%), and 2/2 in the Final Four (100%). It correctly predicted a Florida vs UCLA championship game and chose Florida as the national champion!
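The per-round percentages above can be reproduced, and rolled up into one overall figure, with a few lines of arithmetic. The round-by-round counts come from the text. Counting the title game as 1/1 is an inference from the correct Florida pick, and the overall total is derived here rather than reported by the author.

```python
# (correct picks, games played) per round of the 2006 test bracket.
rounds = {
    "Round of 64": (25, 32),
    "Round of 32": (11, 16),
    "Sweet Sixteen": (7, 8),
    "Elite Eight": (2, 4),
    "Final Four": (2, 2),
    "Championship": (1, 1),  # inferred from the correct Florida pick
}

for name, (correct, total) in rounds.items():
    print(f"{name}: {correct}/{total} = {correct / total:.0%}")

total_correct = sum(correct for correct, _ in rounds.values())
total_games = sum(total for _, total in rounds.values())
print(f"Overall: {total_correct}/{total_games} = {total_correct / total_games:.1%}")
```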
Metrics Used:
Seed Differential - a simple difference in seed (e.g. 1 - 16 = -15, 8 - 9 = -1)
Difference in the Number of Top 100 offensively rated players used in over 20% of team possessions - This parameter counts the number of efficient, highly rated offensive players a team consistently uses, and is a proxy measure of the offensive balance that championship teams need. The 2023 UConn team had 2 (Sanogo and Hawkins), the 2022 Kansas team had 1 (McCormack), and the 2021 Baylor team had 3 (Butler, Teague, Mitchell).
Team 1's # of Rated Players - Team 2's # of Rated Players
Difference in the Number of "Shot Heavy" Players - I call this the "Purdue" Factor. This parameter counts the number of players on a team who take over 30% of their team's shots and is used to identify teams overly dependent on 1 or 2 players. 2023 Purdue had Zach Edey take 32.2% of shots, 2022 Purdue had Edey take 33% and Williams take 32.2%, and 2021 Purdue had Williams take 36.7%. In each of those three years, Purdue was upset by a double-digit seed. Fun note: 2024 Purdue does not have a player over 30%.
Team 1's # of Shot Heavy Players - Team 2's # of Shot Heavy Players
Difference in Team's Kenpom Adjusted Offensive Efficiency (AdjOE)
Team 1's AdjOE - Team 2's AdjOE
Difference in Team's Kenpom Adjusted Defensive Efficiency (AdjDE)
Team 1's AdjDE - Team 2's AdjDE
Difference in Team's Kenpom Adjusted Tempo (AdjT)
Team 1's AdjT - Team 2's AdjT
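The six metrics above are all "Team 1 minus Team 2" differences, so one matchup reduces to a single feature row. The sketch below shows one way to build that row. The `Player` and `Team` classes, the field names, and every number in the example are hypothetical; only the 20% possession threshold, the Top 100 cutoff, and the 30% shot threshold come from the text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Player:
    offensive_rank: int      # national offensive-rating rank (1 = best); assumed field
    possession_share: float  # fraction of team possessions used
    shot_share: float        # fraction of team shots taken


@dataclass
class Team:
    seed: int
    adj_oe: float  # Kenpom adjusted offensive efficiency
    adj_de: float  # Kenpom adjusted defensive efficiency
    adj_t: float   # Kenpom adjusted tempo
    players: List[Player]


def rated_players(team: Team) -> int:
    """Count Top 100 offensive players used in over 20% of possessions."""
    return sum(
        1 for p in team.players
        if p.offensive_rank <= 100 and p.possession_share > 0.20
    )


def shot_heavy_players(team: Team) -> int:
    """Count players taking over 30% of their team's shots."""
    return sum(1 for p in team.players if p.shot_share > 0.30)


def matchup_features(team1: Team, team2: Team) -> Dict[str, float]:
    """Build the Team 1 minus Team 2 feature row for one game."""
    return {
        "seed_diff": team1.seed - team2.seed,
        "rated_players_diff": rated_players(team1) - rated_players(team2),
        "shot_heavy_diff": shot_heavy_players(team1) - shot_heavy_players(team2),
        "adj_oe_diff": team1.adj_oe - team2.adj_oe,
        "adj_de_diff": team1.adj_de - team2.adj_de,
        "adj_t_diff": team1.adj_t - team2.adj_t,
    }


# Made-up teams purely for illustration; these are not real statistics.
team_a = Team(
    seed=1, adj_oe=120.0, adj_de=92.0, adj_t=68.0,
    players=[Player(40, 0.25, 0.24), Player(85, 0.22, 0.21), Player(300, 0.15, 0.12)],
)
team_b = Team(
    seed=16, adj_oe=100.0, adj_de=104.0, adj_t=66.0,
    players=[Player(150, 0.30, 0.33), Player(500, 0.18, 0.20)],
)

features = matchup_features(team_a, team_b)
print(features)
```

Because every feature is a signed difference, swapping the two teams flips the sign of each value, which is worth keeping in mind when labeling which side won.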
2023 AI Model - a Random Forest "moneyline" model trained on 2,500 regular-season games from the 2021-2022 season. It picked at a 69% rate against a set of training data from the 2021-2022 season.
Metrics Used:
Kenpom Adjusted Offensive Efficiency (AdjOE)
Kenpom Adjusted Defensive Efficiency (AdjDE)
Kenpom Adjusted Tempo (AdjT)