Projections using Hypertuned model through XGboost
All data is from FanGraphs. I have no affiliation with FanGraphs, but please consider contributing to their website if you found this project informative.
This project is designed to showcase how Using a Percentile Based Worth System values Fantasy Baseball Players through a Inning Pitched (IP) weighted projection
The Categories used for prediction valuation are year-end rankings for the following metrics:
First we need to load the packages that R needs to run the analysis
library(sqldf) #SQL in R
library(skimr) #Summaries and useful for removing low % data
library(ggplot2) #Plotting Functions
library(plyr) #slightly deprecated data cleaning
library(dplyr) #slightly updated data cleaning
library(tidyverse) #tidyverse data cleaning universe
library(caret) #wrapper for creating, tuning and validating models
library(xgboost) #package for creating regression tree model
library(vtreat) # useful package for treating data before modeling
library(Matrix) #creating matricies for xgboost
library(mgcv)
library(moments) #for measuring skewness
library(data.table) #alternative to dplyr we use to create lags
library(pdp) #partial dependence graphs
library(vip) #variable importance
library(grid) #put multiple plots on one grid
library(gridExtra) #additional grid functionality
library(janitor) #one function used to clean transposed data set
library(ggpubr) #for qq plot
library(owmr) #Removing Prefixes
library(kableExtra) # formatting HTML Tables
library(formattable) # formatting HTML TablesThe # comments generally explain what additional functionality each library adds to R
All data is downloaded from Fan Graphs from this location. The data is also available on my Github here. There are player level and team data sets
#data read-in
pitcher_data <- read_csv("FanGraphs Leaderboard_Pitching20IP.csv")#Team datasets
FDG_Team = read_csv("FanGraphs Leaderboard_Team.csv")#Create a prefix for all team stats that starts with T_
FDG_Team2 <- FDG_Team %>%
rename_with( ~ paste0("T_", .x))str give information about an object, while
skim provides a customizable summary
#Output not shown for space
#str(FDG_Team2)
skim(FDG_Team2) %>%
tibble::as_tibble() #Remove this option for a normal HTML tableskim let’s us see how the data was imported into R.
Documentation can be found here
#Full Dataset dimensions
skimr::skim(pitcher_data) %>%
tibble::as_tibble() %>% #Remove this option for a normal HTML table
select(skim_type,skim_variable,complete_rate) %>%
filter(complete_rate >0.30) #250 Variables
#skim_type - character or numeric
#skim_variable - name of variable
#complete_rate - % of data that is not missing
#filter - only keep variables that have 30% of data populatedAdditionally let’s look at how variables vary by year to see if there are any discrepancies there
#It looks like one year, there were fewer games played, and there is a clear drop off in home runs
pitcher_data_dist =
pitcher_data %>%
group_by(Season) %>%
summarize (Max_Games = max(G),
Avg_W= mean(W)
)
pitcher_data_dist
#Plot Win Data by Year
ggplot(pitcher_data_dist, aes(Season, Avg_W)) +
geom_col()+
ggtitle("Average Wins by Year")+
theme(plot.title = element_text(hjust = 0.5,size = 22,color ="steel blue"))What are some issues with the data?
Many of Variables, such as K%, are being read in as characters
There is spotty data coverage in some of the variables (~Variables have less than 30% Coverage)
2020 Data only includes 60 games worth of data
Team Data needs to be appended to pitcher Data by Team Name
There are several ways to do this, we will identify the variables we
want to change that are mis-identified. parse_number can be
used to pull numbers from these variables. Additional ways to tackle
this can be found here.
#Select Column names that are characters but not Team or Name, These should be percentages
pitcher_data_chars_to_convert <- pitcher_data %>%
select_if(is.character)%>% select(-Team,-Name) %>%
mutate_all (function(x) as.numeric(readr::parse_number(x))/100)
#Note : There are additional ways to do this, this is just one solution
#We can exclude the variables we converted and reintroduce them
pitcher_data_num <- pitcher_data %>% select(-colnames(pitcher_data_chars_to_convert))
pitcher_data2 = cbind(pitcher_data_num,pitcher_data_chars_to_convert) %>%
select (colnames(pitcher_data)) %>% #preserve original order
dplyr::rename(flyball_perc = `FB%...50`,fastball_perc = `FB%...74`) #rename two ambiguous columns
skim(pitcher_data2) %>%
as_tibble() %>%
group_by(skim_type) %>%
count()
#Logical variables are R's best guess, in our case they are all NA's and will be removed at a later stepThe same can be done for the Team Data that is loaded
#Select Column names that are characters but not Team or Name, These should be percentages
FDG_Team2_chars_to_convert <- FDG_Team2 %>%
select_if(is.character)%>% select(-T_Team) %>%
mutate_all (function(x) as.numeric(readr::parse_number(x))/100)
#Keep in mind, parse number may make actual characters into numerical variables so carefully check your data before using
#We can exclude the variables we converted and reintroduce them
FDG_Team2_num <- FDG_Team2 %>% select(-colnames(FDG_Team2_chars_to_convert))
FDG_Team3 = cbind(FDG_Team2_num,FDG_Team2_chars_to_convert) %>%
select (colnames(FDG_Team2)) %>% #preserve original order
dplyr::rename(T_flyball_perc = `T_FB%...45`,T_fastball_perc = `T_FB%...72`) #rename two ambiguous columns
skim(FDG_Team3) %>%
as_tibble() %>%
group_by(skim_type) %>%
count()I choose 30% coverage of data necessary but this can be adjusted up
or down. This will also get rid of columns that are all
NA.
# Keep variables with enough values (Need 30% data coverage rate here)
Player_cols_to_keep =
skim(pitcher_data2) %>%
dplyr::select(skim_type, skim_variable, complete_rate) %>%
filter (complete_rate > 0.30)
#Transpose Rows to get column names as skim melts the data
Player_cols_to_keep_transpose = t(Player_cols_to_keep)
#extract the colnames we would like to keep
Player_cols_to_keep = colnames(janitor::row_to_names(Player_cols_to_keep_transpose,row_number = 2))
#Only keep the columns designated to have over 30% of their data populated or greater
pitcher_data3 = pitcher_data2 %>%
select(one_of(Player_cols_to_keep)) Repeat the process for Team Variables
Team_cols_to_keep =
skim(FDG_Team3) %>%
dplyr::select(skim_type, skim_variable, complete_rate) %>%
filter (complete_rate > 0.30)
#Transpose Rows to get column names as skim melts the data
Team_cols_to_keep_transpose = t(Team_cols_to_keep)
#extract the colnames we would like to keep
Team_cols_to_keep = colnames(janitor::row_to_names(Team_cols_to_keep_transpose,row_number = 2))
#Only keep the columns designated to have over 30% of their data populated or greater
FDG_Team4 = FDG_Team3 %>%
select(one_of(Team_cols_to_keep)) Some Variables will need to be normalized by Innings_Pitched (IP) if they aren’t a percentage already. Remaining Variables are percentages or indices so will not need to be transformed. The full data dictionary for these variables can be found on FanGraph’s website here. for pitching variables and here. for hitting variables.
pitcher_data4 = pitcher_data3 %>%
mutate( #create new variables based on existing variables
W_IP = W/IP,
L_IP = L/IP,
ShO_IP = ShO/IP,
SV_IP = SV/IP,
BS_IP = BS/IP,
TBF_IP = TBF/IP,
H_IP = H/IP,
R_IP = R/IP,
ER_IP = ER/IP,
HR_IP=HR/IP,
BB_IP=BB/IP,
IBB_IP=IBB/IP,
HBP_IP=HBP/IP,
WP_IP= WP/IP,
BK_IP=BK/IP,
SO_IP=SO/IP,
GB_IP = GB/IP, #Groundballs
FB_IP = FB/IP, #FlyBalls
LD_IP = LD/IP, #LineDrives
IFFB_IP = IFFB/IP, #Infield Fly balls
Balls_IP= Balls/IP,
Strikes_IP= Strikes/IP,
Pitches_IP= Pitches/IP,
RS_IP= RS/IP,
IFH_IP= IFH/IP,
BU_IP= BU/IP,
BUH_IP= BUH/IP,
Pulls_IP= Pulls/IP,
HLD_IP= HLD/IP,
SD_IP= SD/IP,
MD_IP= MD/IP,
Barrels_IP= Barrels/IP,
HardHits_IP= HardHit/IP
) %>% select(-L,-G,-IP,-ShO,-BS,-(TBF:BK),-(GB:BUH),-Pulls,-(SD:MD),-Barrels,-HardHit,-Events)
#will be removed after data is lagged -FIP,-(RAR:WPA),,-(wFB:wCH),-(`ERA-`:`xFIP-`),-SIERA,-(`RA9-WAR`:`Age Rng`),-kwERA,-`wCH (pi)`:`wSL (pi)`,`K/9+`:`HR/FB%+`)
#skim(pitcher_data4) %>% as_tibble()Repeat the process for Team Variables
FDG_Team5 = FDG_Team4 %>%
mutate( #create new variables based on existing variables
T_H_T_PA = T_H/T_PA,
T_x1B_T_PA = T_1B/T_PA, #note: R can't have variables start with a number
T_x2b_T_PA = T_2B/T_PA,
T_x3b_T_PA = T_3B/T_PA,
T_HR_T_PA = T_HR/T_PA,
T_R_T_PA = T_R/T_PA,
T_RBI_T_PA = T_RBI/T_PA,
T_BB_T_PA = T_BB/T_PA,
T_IBB_T_PA = T_IBB/T_PA,
T_SO_T_PA=T_SO/T_PA,
T_HBP_T_PA=T_HBP/T_PA,
T_SF_T_PA=T_SF/T_PA,
T_SH_T_PA=T_SH/T_PA,
T_GDP_T_PA= T_GDP/T_PA,#ground into double play
T_SB_T_PA=T_SB/T_PA,
T_CS_T_PA=T_CS/T_PA,
T_GB_T_PA = T_GB/T_PA, #Groundballs
T_FB_T_PA = T_FB/T_PA, #FlyBalls
T_LD_T_PA = T_LD/T_PA, #LineDrives
T_IFFB_T_PA = T_IFFB/T_PA, #Infield Fly balls
T_Pitches_T_PA= T_Pitches/T_PA,
T_Balls_T_PA= T_Balls/T_PA,
T_Strikes_T_PA= T_Strikes/T_PA,
T_IFH_T_PA= T_IFH/T_PA,
T_BU_T_PA= T_BU/T_PA,
T_BUH_T_PA= T_BUH/T_PA,
T_PH_T_PA= T_PH/T_PA,
T_Barrels_T_PA= T_Barrels/T_PA,
T_HardHits_T_PA= T_HardHit/T_PA
) %>% select(-(T_H:T_CS),-(T_GB:T_BUH),-T_PH,-T_Barrels,-T_HardHit,-T_Events) #Drop the old variables
#skim(FDG_Team5) %>% as_tibble()There are several ways to lag a dataset BY
GROUP.
* Dplyr way is here..
* The data.table (the method used below) is here.
#Note we will only be lagging the player level data, as the previous year's team performance shouldn't impact current performance
#Order the dataset by lag columns
pitcher_data5 = arrange(pitcher_data4, playerid,Season) #playerid is the Fangraph id assigned to each player
# Convert dataframe to data.table format
DT_pitcher = data.table(pitcher_data5)
#designate columns to lag - which is all of them
cols1 = colnames(pitcher_data5)
anscols = paste("lag", cols1, sep="_")
DT_pitcher[, (anscols) := data.table::shift(.SD, 1, NA, "lag"),by ='playerid', .SDcols=cols1] #Create 1 period lags by year
pitcher_data6 = as.data.frame(DT_pitcher) %>% select(-lag_playerid, -lag_Team, -lag_Season, -lag_Age,-lag_Name)
ncol(pitcher_data5) #251 - no lags[1] 251
ncol(pitcher_data6) #497 - lagged data ~ (251 * 2)-5[1] 497
We can use either the merge function or the SQL
functionality provided by the sqldf package to join the
lagged player level data to the Team level data
df_pitching_init = sqldf(
"
select a.*, b.*
from pitcher_data6 a
left join FDG_Team5 b
on a.Team = b.T_Team and a.Season = b.T_Season
"
) %>% select(-T_Team,-T_Season,-T_Age,-T_G,-T_AB)# Unncessary Team Variables
nrow(df_pitching_init) - nrow(pitcher_data6) #check if any rows are duplicated[1] 0
We can use Percentile based ranking to get rankings for players from the 2021 season.
Each player goes from a 0% to 100% on each percentile stat that is
used for creating a scoring opportunity. Data is not normalized by IP as
certain stats such as Wins will be worth more when we do.
#Categories I include are:
#Wins, Saves, WHIP, ERA, SOs, Holds
df_pitching_init2 = df_pitching_init %>%
# arrange(player_id,year) %>%
group_by(Season) %>%
mutate(
Wins_share = order(order(rank(W_IP,ties.method = 'average'),decreasing = FALSE))/n(),
SO_share = order(order(rank(SO_IP,ties.method = 'average'),decreasing = FALSE))/n(),
SV_share = order(order(rank(SV_IP,ties.method = 'average'),decreasing = FALSE))/n(),
WHIP_share = order(order(rank(WHIP,ties.method = 'average'),decreasing = FALSE))/n(),
ERA_share = order(order(rank(ERA,ties.method = 'average'),decreasing = FALSE))/n(),
HLD_share = 0,
Worth = Wins_share+SO_share+SV_share+WHIP_share+ERA_share+HLD_share
) %>%
ungroup() Chart of the Distribution of initial percentiles
As the chart below shows, the data is roughly normal.
skewness((df_pitching_init2$Worth))[1] 0.086
ggplot2::qplot(df_pitching_init2$Worth, main="Total Pitching Worth Dataset") + geom_histogram(colour="black", fill="steelblue")`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
min(df_pitching_init2$Worth)[1] 0.32
max(df_pitching_init2$Worth)[1] 4.6
ggpubr::ggqqplot(df_pitching_init2$Worth)
shapiro.test(df_pitching_init2$Worth)
Shapiro-Wilk normality test
data: df_pitching_init2$Worth
W = 1, p-value = 0.002
Total Rankings for the players (Using 5x5 Scoring) can be found here. While it looks like many of the top players have low worth scores, it is because we haven’t applied a modifier for IP yet. Wins are harder to come by relative to any other stat and require more innings pitched.
df_pitching_init2_raw = df_pitching_init %>%
# arrange(player_id,year) %>%
group_by(Season) %>%
mutate(
Wins_share_raw = order(order(rank(W,ties.method = 'average'),decreasing = FALSE))/n(),
SO_share_raw = order(order(rank(SO,ties.method = 'average'),decreasing = FALSE))/n(),
SV_share_raw = order(order(rank(SV,ties.method = 'average'),decreasing = FALSE))/n(),
WHIP_share = order(order(rank(WHIP,ties.method = 'average'),decreasing = FALSE))/n(),
ERA_share = order(order(rank(ERA,ties.method = 'average'),decreasing = FALSE))/n(),
HLD_share_raw = 0,
Worth = Wins_share_raw+SO_share_raw+SV_share_raw+WHIP_share+ERA_share+HLD_share_raw
) %>%
ungroup() %>%
select(-W,-SO,-SV,-WHIP,-ERA,-HLD)
options(digits=2)
df_pitching_init2021_raw =
df_pitching_init2_raw %>%
group_by(Name) %>%
filter(Season == 2021) %>%
arrange(desc(Worth)) %>%
select(Name,Wins_share_raw,SO_share_raw,SV_share_raw,WHIP_share,ERA_share,Worth)
df_pitching_init2021_raw %>%
filter (Worth>3.5) %>%
kbl() %>%
kable_material(c("striped", "hover","condensed","responsive"),full_width = F,fixed_thead = T)| Name | Wins_share_raw | SO_share_raw | SV_share_raw | WHIP_share | ERA_share | Worth |
|---|---|---|---|---|---|---|
| Daniel Bard | 0.79 | 0.69 | 0.97 | 0.87 | 0.75 | 4.0 |
| Garrett Richards | 0.79 | 0.84 | 0.86 | 0.87 | 0.68 | 4.0 |
| Jesus Luzardo | 0.77 | 0.78 | 0.55 | 0.89 | 0.92 | 3.9 |
| Brady Singer | 0.72 | 0.89 | 0.66 | 0.82 | 0.69 | 3.8 |
| Brad Keller | 0.86 | 0.85 | 0.31 | 0.92 | 0.77 | 3.7 |
| Jose Alvarado | 0.82 | 0.60 | 0.90 | 0.87 | 0.51 | 3.7 |
| Mitch Keller | 0.69 | 0.76 | 0.40 | 0.96 | 0.87 | 3.7 |
| Justus Sheffield | 0.81 | 0.56 | 0.37 | 0.98 | 0.95 | 3.7 |
| Nick Pivetta | 0.89 | 0.95 | 0.72 | 0.51 | 0.60 | 3.7 |
| Rafael Montero | 0.65 | 0.34 | 0.91 | 0.81 | 0.91 | 3.6 |
| Josh Fleming | 0.93 | 0.58 | 0.78 | 0.59 | 0.73 | 3.6 |
| Alec Mills | 0.74 | 0.73 | 0.71 | 0.69 | 0.73 | 3.6 |
| Joe Jimenez | 0.75 | 0.49 | 0.73 | 0.80 | 0.85 | 3.6 |
| Paul Fry | 0.58 | 0.52 | 0.83 | 0.80 | 0.87 | 3.6 |
| Wil Crowe | 0.61 | 0.82 | 0.53 | 0.84 | 0.78 | 3.6 |
| Erick Fedde | 0.82 | 0.87 | 0.38 | 0.70 | 0.78 | 3.5 |
| Adam Ottavino | 0.77 | 0.62 | 0.93 | 0.70 | 0.51 | 3.5 |
| Bryan Garcia | 0.50 | 0.22 | 0.85 | 0.98 | 0.98 | 3.5 |
| Alex Reyes | 0.92 | 0.77 | 0.98 | 0.58 | 0.25 | 3.5 |
While it looks like many of the top players have low worth scores, it is because we haven’t applied a modifier for IP yet.
options(digits=2)
df_pitching_init2021 =
df_pitching_init2 %>%
group_by(Name) %>%
filter(Season == 2021) %>%
arrange(desc(Worth)) %>%
select(Name,Wins_share,SO_share,SV_share,WHIP_share,ERA_share,HLD_share,Worth)
df_pitching_init2021 %>%
filter (Worth>2.9) %>%
kbl() %>%
kable_material(c("striped", "hover","condensed","responsive"),full_width = F,fixed_thead = T)| Name | Wins_share | SO_share | SV_share | WHIP_share | ERA_share | HLD_share | Worth |
|---|---|---|---|---|---|---|---|
| Daniel Bard | 0.93 | 0.85 | 0.97 | 0.87 | 0.75 | 0 | 4.4 |
| Joe Jimenez | 0.98 | 0.86 | 0.76 | 0.80 | 0.85 | 0 | 4.2 |
| Paul Fry | 0.83 | 0.87 | 0.82 | 0.80 | 0.87 | 0 | 4.2 |
| Rafael Dolis | 0.63 | 0.84 | 0.89 | 0.95 | 0.81 | 0 | 4.1 |
| Ben Bowden | 0.83 | 0.82 | 0.57 | 0.97 | 0.92 | 0 | 4.1 |
| Jose Alvarado | 0.96 | 0.85 | 0.89 | 0.87 | 0.51 | 0 | 4.1 |
| Tanner Rainey | 0.22 | 0.93 | 0.89 | 0.93 | 0.97 | 0 | 3.9 |
| Sean Newcomb | 0.62 | 0.93 | 0.79 | 0.93 | 0.65 | 0 | 3.9 |
| Ryan Hendrix | 1.00 | 0.73 | 0.51 | 0.82 | 0.85 | 0 | 3.9 |
| Adam Ottavino | 0.95 | 0.76 | 0.94 | 0.70 | 0.51 | 0 | 3.9 |
| Rafael Montero | 0.91 | 0.26 | 0.91 | 0.81 | 0.91 | 0 | 3.8 |
| Enyel De Los Santos | 0.55 | 0.94 | 0.44 | 0.95 | 0.90 | 0 | 3.8 |
| Bryan Garcia | 0.76 | 0.21 | 0.85 | 0.98 | 0.98 | 0 | 3.8 |
| Tanner Scott | 0.87 | 0.90 | 0.40 | 0.84 | 0.74 | 0 | 3.8 |
| Aroldis Chapman | 0.93 | 1.00 | 0.99 | 0.52 | 0.27 | 0 | 3.7 |
| Sean Poppen | 0.39 | 0.79 | 0.83 | 0.96 | 0.74 | 0 | 3.7 |
| Alex Reyes | 0.98 | 0.90 | 0.98 | 0.58 | 0.25 | 0 | 3.7 |
| James Karinchak | 0.97 | 0.96 | 0.94 | 0.35 | 0.47 | 0 | 3.7 |
| Bryan Abreu | 0.83 | 0.52 | 0.78 | 0.73 | 0.82 | 0 | 3.7 |
| Sean Reid-Foley | 0.90 | 0.89 | 0.36 | 0.77 | 0.75 | 0 | 3.7 |
| Jeurys Familia | 0.99 | 0.84 | 0.72 | 0.66 | 0.45 | 0 | 3.7 |
| Rex Brothers | 0.55 | 0.96 | 0.73 | 0.68 | 0.75 | 0 | 3.7 |
| Paul Campbell | 0.76 | 0.49 | 0.63 | 0.86 | 0.91 | 0 | 3.6 |
| Trevor Megill | 0.35 | 0.90 | 0.40 | 0.98 | 0.99 | 0 | 3.6 |
| Lucas Sims | 0.92 | 0.99 | 0.92 | 0.20 | 0.56 | 0 | 3.6 |
| Pete Fairbanks | 0.71 | 0.92 | 0.90 | 0.69 | 0.35 | 0 | 3.6 |
| Jesus Luzardo | 0.63 | 0.57 | 0.55 | 0.89 | 0.92 | 0 | 3.6 |
| Gregory Soto | 0.89 | 0.83 | 0.96 | 0.59 | 0.28 | 0 | 3.6 |
| David Hess | 0.91 | 0.32 | 0.32 | 1.00 | 1.00 | 0 | 3.5 |
| Amir Garrett | 0.04 | 0.89 | 0.92 | 0.83 | 0.86 | 0 | 3.5 |
| Phil Maton | 0.87 | 0.88 | 0.42 | 0.72 | 0.65 | 0 | 3.5 |
| Jacob Barnes | 0.26 | 0.78 | 0.87 | 0.73 | 0.88 | 0 | 3.5 |
| Matt Barnes | 0.94 | 0.98 | 0.98 | 0.21 | 0.41 | 0 | 3.5 |
| Ryan Helsley | 0.97 | 0.52 | 0.75 | 0.67 | 0.61 | 0 | 3.5 |
| Cesar Valdez | 0.36 | 0.47 | 0.93 | 0.91 | 0.84 | 0 | 3.5 |
| Trevor May | 0.94 | 0.92 | 0.86 | 0.43 | 0.35 | 0 | 3.5 |
| Jacob Webb | 0.98 | 0.45 | 0.79 | 0.79 | 0.50 | 0 | 3.5 |
| Brad Brach | 0.24 | 0.69 | 0.81 | 0.87 | 0.89 | 0 | 3.5 |
| Adam Morgan | 0.80 | 0.72 | 0.88 | 0.56 | 0.53 | 0 | 3.5 |
| Luis Oviedo | 0.24 | 0.62 | 0.63 | 0.99 | 0.99 | 0 | 3.5 |
| Glenn Otto | 0.10 | 0.83 | 0.61 | 0.93 | 0.99 | 0 | 3.5 |
| Sam Howard | 0.66 | 0.92 | 0.33 | 0.73 | 0.80 | 0 | 3.4 |
| Reid Detmers | 0.44 | 0.39 | 0.67 | 0.97 | 0.97 | 0 | 3.4 |
| Michael Feliz | 0.02 | 0.70 | 0.85 | 0.91 | 0.96 | 0 | 3.4 |
| Aaron Ashby | 0.90 | 0.86 | 0.80 | 0.28 | 0.61 | 0 | 3.4 |
| Griffin Canning | 0.80 | 0.51 | 0.54 | 0.75 | 0.80 | 0 | 3.4 |
| Brett de Geus | 0.59 | 0.21 | 0.65 | 0.95 | 0.98 | 0 | 3.4 |
| J.B. Wendelken | 0.87 | 0.33 | 0.83 | 0.78 | 0.54 | 0 | 3.4 |
| Matt Foster | 0.49 | 0.55 | 0.77 | 0.70 | 0.85 | 0 | 3.4 |
| Aaron Bummer | 0.86 | 0.92 | 0.81 | 0.44 | 0.33 | 0 | 3.4 |
| Tyler Chatwood | 0.22 | 0.80 | 0.80 | 0.72 | 0.81 | 0 | 3.4 |
| Kyle Finnegan | 0.75 | 0.56 | 0.93 | 0.74 | 0.34 | 0 | 3.3 |
| Sean Doolittle | 0.60 | 0.66 | 0.74 | 0.72 | 0.60 | 0 | 3.3 |
| Justus Sheffield | 0.85 | 0.17 | 0.37 | 0.98 | 0.95 | 0 | 3.3 |
| Anthony Kay | 0.20 | 0.79 | 0.59 | 0.92 | 0.81 | 0 | 3.3 |
| Chris Stratton | 0.86 | 0.67 | 0.90 | 0.50 | 0.37 | 0 | 3.3 |
| Josh Sborz | 0.67 | 0.78 | 0.72 | 0.67 | 0.45 | 0 | 3.3 |
| Matt Andriese | 0.33 | 0.59 | 0.74 | 0.88 | 0.75 | 0 | 3.3 |
| Jake Diekman | 0.45 | 0.95 | 0.90 | 0.56 | 0.43 | 0 | 3.3 |
| Justin Garza | 0.71 | 0.56 | 0.53 | 0.84 | 0.65 | 0 | 3.3 |
| Cionel Perez | 0.34 | 0.59 | 0.51 | 0.93 | 0.90 | 0 | 3.3 |
| Shane Greene | 0.01 | 0.57 | 0.83 | 0.91 | 0.96 | 0 | 3.3 |
| Edwin Uceta | 0.10 | 0.86 | 0.59 | 0.80 | 0.93 | 0 | 3.3 |
| Michael Rucker | 0.08 | 0.63 | 0.81 | 0.80 | 0.95 | 0 | 3.3 |
| J.D. Hammer | 0.47 | 0.70 | 0.51 | 0.88 | 0.70 | 0 | 3.3 |
| Jeffrey Springs | 0.95 | 0.96 | 0.83 | 0.20 | 0.30 | 0 | 3.2 |
| Jake Brentz | 0.78 | 0.80 | 0.80 | 0.47 | 0.38 | 0 | 3.2 |
| Hirokazu Sawamura | 0.89 | 0.77 | 0.68 | 0.71 | 0.19 | 0 | 3.2 |
| Spencer Howard | 0.11 | 0.62 | 0.65 | 0.88 | 0.97 | 0 | 3.2 |
| Heath Hembree | 0.25 | 0.96 | 0.92 | 0.30 | 0.79 | 0 | 3.2 |
| Devin Williams | 0.99 | 0.99 | 0.86 | 0.31 | 0.09 | 0 | 3.2 |
| Seth Lugo | 0.84 | 0.82 | 0.75 | 0.49 | 0.32 | 0 | 3.2 |
| Zac Lowther | 0.24 | 0.56 | 0.57 | 0.91 | 0.93 | 0 | 3.2 |
| Hansel Robles | 0.36 | 0.70 | 0.95 | 0.62 | 0.58 | 0 | 3.2 |
| Paul Sewald | 0.99 | 0.99 | 0.93 | 0.10 | 0.19 | 0 | 3.2 |
| Mychal Givens | 0.79 | 0.62 | 0.92 | 0.61 | 0.27 | 0 | 3.2 |
| Edwin Diaz | 0.80 | 0.96 | 0.99 | 0.13 | 0.31 | 0 | 3.2 |
| Max Kranick | 0.50 | 0.23 | 0.63 | 0.94 | 0.89 | 0 | 3.2 |
| Mike Mayers | 0.66 | 0.82 | 0.77 | 0.49 | 0.43 | 0 | 3.2 |
| Trevor Stephan | 0.42 | 0.81 | 0.71 | 0.66 | 0.57 | 0 | 3.2 |
| Carlos Estevez | 0.44 | 0.48 | 0.94 | 0.76 | 0.56 | 0 | 3.2 |
| Anthony Castro | 0.32 | 0.91 | 0.82 | 0.44 | 0.66 | 0 | 3.2 |
| Jackson Kowar | 0.10 | 0.44 | 0.62 | 1.00 | 1.00 | 0 | 3.2 |
| Brad Hand | 0.88 | 0.41 | 0.97 | 0.45 | 0.44 | 0 | 3.1 |
| Daniel Norris | 0.26 | 0.54 | 0.72 | 0.76 | 0.87 | 0 | 3.1 |
| Camilo Doval | 1.00 | 0.94 | 0.90 | 0.13 | 0.17 | 0 | 3.1 |
| Kyle Zimmer | 0.74 | 0.26 | 0.81 | 0.65 | 0.68 | 0 | 3.1 |
| Nick Mears | 0.36 | 0.50 | 0.66 | 0.90 | 0.71 | 0 | 3.1 |
| Conner Greene | 0.30 | 0.58 | 0.35 | 0.95 | 0.95 | 0 | 3.1 |
| Greg Holland | 0.52 | 0.43 | 0.91 | 0.57 | 0.68 | 0 | 3.1 |
| Daniel Lynch | 0.58 | 0.20 | 0.62 | 0.90 | 0.81 | 0 | 3.1 |
| Demarcus Evans | 0.10 | 0.87 | 0.60 | 0.81 | 0.74 | 0 | 3.1 |
| Edward Cabrera | 0.10 | 0.64 | 0.62 | 0.90 | 0.83 | 0 | 3.1 |
| Eduardo Rodriguez | 0.82 | 0.79 | 0.19 | 0.64 | 0.66 | 0 | 3.1 |
| Kyle Keller | 0.21 | 0.67 | 0.47 | 0.83 | 0.92 | 0 | 3.1 |
| Nick Pivetta | 0.57 | 0.74 | 0.68 | 0.51 | 0.60 | 0 | 3.1 |
| Joely Rodriguez | 0.36 | 0.54 | 0.75 | 0.81 | 0.63 | 0 | 3.1 |
| Joakim Soria | 0.17 | 0.66 | 0.92 | 0.61 | 0.72 | 0 | 3.1 |
| Austin Voth | 0.70 | 0.57 | 0.29 | 0.74 | 0.76 | 0 | 3.1 |
| Aaron Slegers | 0.64 | 0.20 | 0.30 | 0.98 | 0.95 | 0 | 3.1 |
| Andrew Heaney | 0.62 | 0.77 | 0.30 | 0.53 | 0.84 | 0 | 3.1 |
| Victor Gonzalez | 0.84 | 0.39 | 0.78 | 0.70 | 0.35 | 0 | 3.0 |
| Kevin Ginkel | 0.09 | 0.70 | 0.54 | 0.82 | 0.89 | 0 | 3.0 |
| Dylan Cease | 0.79 | 0.94 | 0.45 | 0.42 | 0.44 | 0 | 3.0 |
| Brooks Raley | 0.32 | 0.92 | 0.82 | 0.32 | 0.67 | 0 | 3.0 |
| Mitch Keller | 0.46 | 0.35 | 0.40 | 0.96 | 0.87 | 0 | 3.0 |
| Stefan Crichton | 0.05 | 0.11 | 0.93 | 0.99 | 0.96 | 0 | 3.0 |
| Garrett Richards | 0.49 | 0.24 | 0.75 | 0.87 | 0.68 | 0 | 3.0 |
| Anthony Misiewicz | 0.87 | 0.47 | 0.43 | 0.64 | 0.63 | 0 | 3.0 |
| Archie Bradley | 0.98 | 0.17 | 0.82 | 0.68 | 0.39 | 0 | 3.0 |
| Alex Claudio | 0.22 | 0.38 | 0.79 | 0.86 | 0.79 | 0 | 3.0 |
| Yimi Garcia | 0.69 | 0.60 | 0.96 | 0.25 | 0.51 | 0 | 3.0 |
| Liam Hendriks | 0.95 | 0.99 | 0.99 | 0.01 | 0.09 | 0 | 3.0 |
| Jordan Romano | 0.94 | 0.93 | 0.97 | 0.13 | 0.04 | 0 | 3.0 |
| Cody Poteet | 0.66 | 0.62 | 0.47 | 0.57 | 0.71 | 0 | 3.0 |
| Alex Colome | 0.61 | 0.31 | 0.96 | 0.64 | 0.49 | 0 | 3.0 |
| Brady Singer | 0.29 | 0.54 | 0.66 | 0.82 | 0.69 | 0 | 3.0 |
| Raisel Iglesias | 0.91 | 0.98 | 0.99 | 0.04 | 0.10 | 0 | 3.0 |
| Sam Coonrod | 0.42 | 0.75 | 0.84 | 0.53 | 0.47 | 0 | 3.0 |
| Bruce Zimmermann | 0.62 | 0.29 | 0.58 | 0.79 | 0.72 | 0 | 3.0 |
| Darwinzon Hernandez | 0.47 | 0.93 | 0.53 | 0.77 | 0.28 | 0 | 3.0 |
| Junior Guerra | 0.76 | 0.38 | 0.05 | 0.94 | 0.86 | 0 | 3.0 |
| Codi Heuer | 0.92 | 0.22 | 0.79 | 0.53 | 0.54 | 0 | 3.0 |
| Andrew Wantz | 0.28 | 0.95 | 0.66 | 0.40 | 0.70 | 0 | 3.0 |
| Hector Neris | 0.51 | 0.91 | 0.93 | 0.26 | 0.37 | 0 | 3.0 |
| Sam Hentges | 0.12 | 0.51 | 0.45 | 0.96 | 0.93 | 0 | 3.0 |
| Miguel Sanchez | 0.77 | 0.31 | 0.56 | 0.85 | 0.49 | 0 | 3.0 |
| Rowan Wick | 0.04 | 0.86 | 0.95 | 0.58 | 0.54 | 0 | 3.0 |
| Humberto Mejia | 0.10 | 0.33 | 0.60 | 0.98 | 0.96 | 0 | 3.0 |
| Jose Cisnero | 0.64 | 0.53 | 0.87 | 0.54 | 0.38 | 0 | 3.0 |
| Richard Lovelady | 0.91 | 0.74 | 0.85 | 0.15 | 0.32 | 0 | 3.0 |
| Brad Boxberger | 0.78 | 0.90 | 0.86 | 0.15 | 0.26 | 0 | 3.0 |
| Shane McClanahan | 0.81 | 0.76 | 0.62 | 0.46 | 0.31 | 0 | 3.0 |
| Phillips Valdez | 0.47 | 0.29 | 0.76 | 0.58 | 0.84 | 0 | 2.9 |
| Josh Fleming | 0.89 | 0.04 | 0.69 | 0.59 | 0.73 | 0 | 2.9 |
| Brusdar Graterol | 0.87 | 0.20 | 0.58 | 0.66 | 0.62 | 0 | 2.9 |
| Jose Quintana | 0.02 | 0.93 | 0.13 | 0.94 | 0.91 | 0 | 2.9 |
| Kyle Funkhouser | 0.92 | 0.37 | 0.71 | 0.65 | 0.30 | 0 | 2.9 |
| Daniel Ponce de Leon | 0.21 | 0.10 | 0.86 | 0.89 | 0.87 | 0 | 2.9 |
| JC Mejia | 0.13 | 0.33 | 0.59 | 0.88 | 0.99 | 0 | 2.9 |
| Wandy Peralta | 0.90 | 0.24 | 0.89 | 0.62 | 0.27 | 0 | 2.9 |
| Michael Fulmer | 0.72 | 0.61 | 0.94 | 0.47 | 0.17 | 0 | 2.9 |
| Alex Lange | 0.19 | 0.71 | 0.78 | 0.77 | 0.47 | 0 | 2.9 |
| Brad Keller | 0.59 | 0.32 | 0.31 | 0.92 | 0.77 | 0 | 2.9 |
| Shawn Armstrong | 0.18 | 0.84 | 0.17 | 0.77 | 0.94 | 0 | 2.9 |
| Braxton Garrett | 0.20 | 0.40 | 0.63 | 0.97 | 0.71 | 0 | 2.9 |
| Diego Castillo | 0.84 | 0.89 | 0.96 | 0.08 | 0.13 | 0 | 2.9 |
| Chris Rodriguez | 0.68 | 0.50 | 0.64 | 0.71 | 0.38 | 0 | 2.9 |
| Josiah Gray | 0.19 | 0.67 | 0.66 | 0.60 | 0.78 | 0 | 2.9 |
| Brandon Bielak | 0.59 | 0.36 | 0.74 | 0.63 | 0.59 | 0 | 2.9 |
Not all variables can be used for predictive modeling. Variables that
go into the percentile ranking or are non-normalized metrics created
after the fact (such as WAR - Wins above Replacement or
RS - Raw Run Support) should be removed. However, metrics
that are normalized by a per pitch basis (such as wFB/C)
can remain as we expect pitchers to have similar performance in these
metrics one year out.
#Be careful about RS - Run Support and RS/9
#Creating a new dataset to keep original intact
df_pitching_init3 = df_pitching_init2 %>%
select (-Name)Lagged Percentile (_share) Variables can be used for
predictive modeling. However since these variables were created for the
Worth metric they must also be removed for modeling purposes.
#Order the dataset by lag columns
df_pitching_init4 = arrange(df_pitching_init3, playerid,Season) #playerid is the Fangraph id assigned to each player
# Convert dataframe to data.table format
DT_pitcher2 = data.table(df_pitching_init4)
#designate columns to lag - just the new shares
cols1 = (c('Wins_share','SO_share','SV_share', 'ERA_share','WHIP_share','HLD_share','Worth'))
anscols = paste("lag", cols1, sep="_")
DT_pitcher2[, (anscols) := data.table::shift(.SD, 1, NA, "lag"),by ='playerid', .SDcols=cols1] #Create 1 period lags by year
df_pitching_final = as.data.frame(DT_pitcher2) %>%
select(-c(Wins_share,SO_share,SV_share, ERA_share,WHIP_share,HLD_share))%>%
select(-FIP,-(RAR:WPA),-(wFB:wCH),-(`ERA-`:`xFIP-`),
-SIERA,-(`RA9-WAR`:`Age Rng`),-kwERA,-(`wCH (pi)`:`wSL (pi)`),-(`K/9+`:`HR/FB%+`)) %>% select(-W,-SO,-SV,-HLD,-W_IP,-SO_IP,-SV_IP,-WHIP,-ERA,-HLD_IP)We split the data into Training Data (which is used to create the model) and test data (which is used to validate the model)
set.seed(15674) # For reproducibility
# Create index for testing and training data
inTrain <- createDataPartition(y = df_pitching_final$Worth, p = 0.80, list = FALSE)
# subset pitching data for training
tr_2021 <- df_pitching_final[inTrain,]
# subset the rest to test and validate trained model
te_2021 <- df_pitching_final[-inTrain,]
nrow(tr_2021)/nrow(df_pitching_final) #check if split is 0.8[1] 0.8
Vtreat Package in R is excellent for treating data before using for modeling. Additional documentation can be found here.
treat_plan_2021 <- vtreat::designTreatmentsZ(
dframe = tr_2021, # training data
varlist = colnames(tr_2021) %>% .[. != "hitting_score1"], # input variables = all training data columns, except random
codeRestriction = c("clean", "isBAD", "lev"), # derived variables types (drop cat_P)
verbose = FALSE) # suppress messages
#clean stands for cleaned numerical variable, isBAD indicates that a value replacement has occurred (which indicates a missing value in this case), and lev is a binary indicator whether a particular value of that categorical variable was present.
#### Checking Scoreframe
score_frame <- treat_plan_2021$scoreFrame %>%
select(varName, origName, code)
head(score_frame)
tr_treated_2021 <- vtreat::prepare(treat_plan_2021, tr_2021)
te_treated_2021 <- vtreat::prepare(treat_plan_2021, te_2021)
treat_plan_2021 <- vtreat::designTreatmentsZ(
dframe = DT_pitcher2, # training data
varlist = colnames(DT_pitcher2) %>% .[. != "hitting_score1"], # input variables = all training data columns, except random
codeRestriction = c("clean", "isBAD", "lev"), # derived variables types (drop cat_P)
verbose = FALSE) # suppress messages
total_treated_2021_pitching <- vtreat::prepare(treat_plan_2021, DT_pitcher2)
#tr_treated = tr
#te_treated = te
dim(tr_treated_2021) #note there are dummies for each player and team[1] 3196 1415
The population used for Training should be indicative of Total Population
ggplot2::qplot(tr_treated_2021$Worth, main="Training Set") + geom_histogram(colour="black", fill="steelblue") + theme_bw()`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#The skewness is actually a bit better than the overall data set
skewness(tr_treated_2021$Worth) [1] 0.077
To keep things simple with modeling, we’ll turn the training data
into simple input variables for caret::train, dropping the
response variable and converting the data frame to a matrix.
Documentation for this approach to XGboost can be found here.
Break the data set into x and y inputs with x being a matrix.
"_isBAD" is a category created by the Vtreat
package in case you want to identify rows
input_x <- as.matrix(((tr_treated_2021))%>%
select(-Worth) %>%
select(!ends_with ("_isBAD")))
input_y <- tr_treated_2021$WorthXGBoost with Default Hyperparameters:
The Variable Importance
(caret::varImp(xgb_base_2021, scale = F) from the caret
package shows the contribution of each variable to the initial model.
Since this is untuned, we can expect the percentage imporantance to
change as the models iterate through potential hyperparameters.
XGBoost documentation can be found for more general models here.
#Defaults for xgboost model
grid_default <- expand.grid(
nrounds = 100,
max_depth = 6,
eta = 0.3,
gamma = 0,
colsample_bytree = 1,
min_child_weight = 1,
subsample = 1
)
#This is a blank train_control set, this will be updated after
train_control <- caret::trainControl(
method = "none",
verboseIter = FALSE, # no training log
allowParallel = TRUE # FALSE for reproducible results
)
xgb_base_2021 <- caret::train(
x = input_x,
y = input_y,
trControl = train_control,
tuneGrid = grid_default,
method = "xgbTree",
verbose = TRUE
)
caret::varImp(xgb_base_2021, scale = F )xgbTree variable importance
only 20 most important variables shown (out of 773)
A tune grid allows us to test a large amount of hyper-parameters and
find the model with the lowest RMSE for predictions.
However, The more values you want to test and the greater the amount of
Cross-Fold Validations (method = "cv"), the greater the
computational time it will take. More information on the specific
parameters can be found here.
# maximum number of trees
nrounds <- 1000
# note to start nrounds from 200, as smaller learning rates result in errors so
# big with lower starting points that they'll mess the scales
tune_grid <- expand.grid(
nrounds = seq(from = 100, to = nrounds, by = 50),
eta = c(0.01, 0.025, 0.05, 0.075, 0.1),
max_depth = c(2, 4, 6, 8, 10),
gamma = 0,
colsample_bytree = 1,
min_child_weight = 1,
subsample = 1
)
tune_control <- caret::trainControl(
method = "cv", # cross-validation
number = 5, # with n folds
## Note this was # out in the original code
#index = createFolds(tr_treated$Id_clean), # fix the folds
verboseIter = FALSE, # no training log
allowParallel = FALSE # FALSE for reproducible results
)Running the initial tuning model
#Note I will be timing these runs to give an estimate on how long this model takes to run
start_time <- Sys.time()
xgb_tune_2021 <- caret::train(
x = input_x,
y = input_y,
trControl = tune_control,
tuneGrid = tune_grid,
method = "xgbTree",
verbose = FALSE
,verbosity = 0
)
end_time <- Sys.time()
end_time - start_timeTime difference of 21 mins
Tuning Plot and Variable Importance
varImp(xgb_tune_2021, scale = F ) xgbTree variable importance
only 20 most important variables shown (out of 765)
# helper function for the plots
tuneplot <- function(x, probs = .90) {
ggplot(x) +
coord_cartesian(ylim = c(quantile(x$results$RMSE, probs = probs), min(x$results$RMSE))) +
theme_bw()
}
tuneplot(xgb_tune_2021)After fixing the learning rate to the best tune from the previous iteration and we’ll also set maximum depth to 3 +-1 (or +2 if max_depth == 2) to experiment a bit around the suggested best tune in previous step. Then, well fix maximum depth and minimum child weight.
tune_grid2 <- expand.grid(
nrounds = seq(from = 50, to = nrounds, by = 50),
eta = xgb_tune_2021$bestTune$eta,
max_depth = ifelse(xgb_tune_2021$bestTune$max_depth == 2,
c(xgb_tune_2021$bestTune$max_depth:4),
xgb_tune_2021$bestTune$max_depth - 1:xgb_tune_2021$bestTune$max_depth + 1),
gamma = 0,
colsample_bytree = 1,
min_child_weight = c(1, 2, 3),
subsample = 1
)
xgb_tune2_2021 <- caret::train(
x = input_x,
y = input_y,
trControl = tune_control,
tuneGrid = tune_grid2,
method = "xgbTree",
verbose = TRUE
)
tuneplot(xgb_tune2_2021)
xgb_tune2_2021$bestTune
varImp(xgb_tune2_2021, scale = F ) xgbTree variable importance
only 20 most important variables shown (out of 765)
tune_grid3 <- expand.grid(
nrounds = seq(from = 50, to = nrounds, by = 50),
eta = xgb_tune_2021$bestTune$eta,
max_depth = xgb_tune2_2021$bestTune$max_depth,
gamma = 0,
colsample_bytree = c(0.4, 0.6, 0.8, 1.0),
min_child_weight = xgb_tune2_2021$bestTune$min_child_weight,
subsample = c(0.5, 0.75, 1.0)
)
xgb_tune3_2021 <- caret::train(
x = input_x,
y = input_y,
trControl = tune_control,
tuneGrid = tune_grid3,
method = "xgbTree",
verbose = TRUE
)
tuneplot(xgb_tune3_2021, probs = .95)
xgb_tune3_2021$bestTune
varImp(xgb_tune3_2021, scale = F ) xgbTree variable importance
only 20 most important variables shown (out of 765)
Next, we again pick the best values from previous step, and now will see whether changing the gamma has any effect on the model fit:
tune_grid4 <- expand.grid(
nrounds = seq(from = 50, to = nrounds, by = 50),
eta = xgb_tune_2021$bestTune$eta,
max_depth = xgb_tune2_2021$bestTune$max_depth,
gamma = c(0, 0.05,0.1, 0.2,0.4, 0.5, 0.7, 0.9, 1.0),
colsample_bytree = xgb_tune3_2021$bestTune$colsample_bytree,
min_child_weight = xgb_tune2_2021$bestTune$min_child_weight,
subsample = xgb_tune3_2021$bestTune$subsample
)
xgb_tune4_2021 <- caret::train(
x = input_x,
y = input_y,
trControl = tune_control,
tuneGrid = tune_grid4,
method = "xgbTree",
verbose = TRUE
)
tuneplot(xgb_tune4_2021)Warning: The shape palette can deal with a maximum of 6 discrete values because more than 6 becomes difficult
to discriminate; you have 9. Consider specifying shapes manually if you must have them.
Warning: Removed 60 rows containing missing values (geom_point).
xgb_tune4_2021$bestTune
varImp(xgb_tune4_2021, scale = F ) xgbTree variable importance
only 20 most important variables shown (out of 765)
Now, we have tuned the hyperparameters and can start reducing the learning rate to get to the final model:
start_time <- Sys.time()
tune_grid5 <- expand.grid(
nrounds = seq(from = 100, to = 10000, by = 75),
eta = c(0.01, 0.015, 0.025,0.035, 0.05,0.75, 0.1),
max_depth = xgb_tune2_2021$bestTune$max_depth,
gamma = xgb_tune4_2021$bestTune$gamma,
colsample_bytree = xgb_tune3_2021$bestTune$colsample_bytree,
min_child_weight = xgb_tune2_2021$bestTune$min_child_weight,
subsample = xgb_tune3_2021$bestTune$subsample
)
xgb_tune5_2021 <- caret::train(
x = input_x,
y = input_y,
trControl = tune_control,
tuneGrid = tune_grid5,
method = "xgbTree",
verbose = TRUE
)
#tuneplot(xgb_tune5_2021)
end_time <- Sys.time()
end_time - start_timeTime difference of 28 mins
xgb_tune5_2021$bestTune
varImp(xgb_tune5_2021, scale = F ) xgbTree variable importance
only 20 most important variables shown (out of 765)
(final_grid_2021 <- expand.grid(
nrounds = xgb_tune5_2021$bestTune$nrounds,
eta = xgb_tune5_2021$bestTune$eta,
max_depth = xgb_tune5_2021$bestTune$max_depth,
gamma = xgb_tune5_2021$bestTune$gamma,
colsample_bytree = xgb_tune5_2021$bestTune$colsample_bytree,
min_child_weight = xgb_tune5_2021$bestTune$min_child_weight,
subsample = xgb_tune5_2021$bestTune$subsample
))
(xgb_model_2021 <- caret::train(
x = input_x,
y = input_y,
trControl = train_control,
tuneGrid = final_grid_2021,
method = "xgbTree",
verbose = TRUE
))eXtreme Gradient Boosting
3196 samples
765 predictor
No pre-processing
Resampling: None
varImp(xgb_model_2021, scale = F ) xgbTree variable importance
only 20 most important variables shown (out of 765)
We don’t need to look too closely at are training data as Xgboost will heavily overfit the model based on that data. The more important part is how the model performs on in predicting our Test Sample that was not included.
y_pred_test <- predict(xgb_model_2021, data.matrix(te_treated_2021))
test_stats= cbind((te_treated_2021$Worth),y_pred_test)
test_statsR2 = cor(test_stats[,1],test_stats[,2])^2
print(test_statsR2)[1] 0.71
y_pred_train <- predict(xgb_model_2021, data.matrix(tr_treated_2021))
train_stats = cbind((tr_treated_2021$Worth),y_pred_train)
train_statsR2 = cor(train_stats[,1],train_stats[,2])^2
print(train_statsR2)[1] 0.98
#test dataset
x <- select(te_treated_2021, -Worth)
y <- (te_treated_2021$Worth)
(xgb_model_rmse <- ModelMetrics::rmse(y, predict(xgb_model_2021, newdata = x)))[1] 0.34
holdout_x <- select(tr_treated_2021, -Worth)
holdout_y <- tr_treated_2021$Worth
(xgb_model_rmse <- ModelMetrics::rmse(holdout_y, predict(xgb_model_2021, newdata = holdout_x)))[1] 0.081
ggplot2::ggplot() +
aes(x = test_stats[,1], y = test_stats[,2]) +
geom_jitter() +
xlab("Predicted Values") +
ylab("Actual Values") +
ggtitle("Results of Pitching Model on Test Data")+
theme(plot.title = element_text(hjust = 0.5,size = 22,color ="steel blue"))+
geom_smooth(method = "lm")`geom_smooth()` using formula 'y ~ x'
Now that we have an acceptable model, we can use it to create projections for how well we think players should do in 2022 based on their hitting statistics in 2021. First let’s reduce
Step 1: Only keep variables with high enough importance in model
vip(xgb_model_2021, num_features = 30) # 10 is the default, 30 gives a visual on the top 30 most important features of the model
unscalevi = vi(xgb_model_2021, method="model") #shows the numbers behind the plot
unscalevi$Importance_perc = with(unscalevi,Importance/sum(Importance)) #adds percentages
unscalevi # importance by variables
variables_to_keep_2021 = subset(unscalevi, Importance_perc > 0.0010) %>% select(Variable) #Keep Variables that explain at least a small amount [0.1%] of the model. This is a low threshold for inclusion ,but you can adjust this
variables_to_keep_2021b = t(variables_to_keep_2021)
variables_to_keep_2022 = colnames(row_to_names(variables_to_keep_2021b,row_number = 1))
tr_treated_2022 = tr_treated_2021 %>% select(Worth,one_of(variables_to_keep_2022),starts_with("Team_lev_x_")) #keep modeled important variables along with team indicator variables
te_treated_2022 = te_treated_2021 %>% select(Worth,one_of(variables_to_keep_2022),starts_with("Team_lev_x_"))
input_x_2022 = as.matrix(select(tr_treated_2022, -Worth))
input_y_2022 = tr_treated_2022$WorthStep 2: Re-fit model with reduced variable scope
Note from the best tune below the nrounds - is the max I
set above and eta is at the lowest possible value. This
could cause potential overfitting issues, but from our Actual
vs. Predicted Graph we know this not to be the case.
(final_grid_2021 <- expand.grid(
nrounds = xgb_tune5_2021$bestTune$nrounds,
eta = xgb_tune5_2021$bestTune$eta,
max_depth = xgb_tune5_2021$bestTune$max_depth,
gamma = xgb_tune5_2021$bestTune$gamma,
colsample_bytree = xgb_tune5_2021$bestTune$colsample_bytree,
min_child_weight = xgb_tune5_2021$bestTune$min_child_weight,
subsample = xgb_tune5_2021$bestTune$subsample
))
(xgb_model_2022 <- caret::train(
x = input_x_2022,
y = input_y_2022,
trControl = train_control,
tuneGrid = final_grid_2021,
method = "xgbTree",
verbose = TRUE
))eXtreme Gradient Boosting
3196 samples
125 predictor
No pre-processing
Resampling: None
vip(xgb_model_2022, num_features = 30)
unscalevi24 = vi(xgb_model_2022, method="model")
unscalevi24$Importance_perc = with(unscalevi24,Importance/sum(Importance))
unscalevi24
# Save work for later prediction
save(xgb_model_2022,file = '2022_Pitching5x5_Model.Rdata')
pitching5x5 = xgb_model_2022
pitchinginput = input_x_2022First let’s prepare a file for predicting based on our model object
variableslag5x= row_to_names(as.data.frame(t(variables_to_keep_2022)),row_number = 1) %>% select (starts_with("lag"))
variables_nolag5x = (owmr::remove_prefix(variableslag5x,"lag" , sep = "_"))
Data_Predict_2022a5x = total_treated_2021_pitching %>% select (one_of(colnames(variables_nolag5x)),Season,playerid)
colnames(Data_Predict_2022a5x) <- paste0("lag_", colnames(Data_Predict_2022a5x))
Data_Predict_2022b5x = total_treated_2021_pitching %>% select (one_of(colnames(variables_nolag5x)))
colnames(Data_Predict_2022b5x) = colnames(variableslag5x)
variables_to_keep_2022_nolag5x = total_treated_2021_pitching %>% select(one_of(variables_to_keep_2022),Season,playerid,starts_with("Team_lev_x_"))%>% select(-one_of(colnames(Data_Predict_2022b5x)))
Data_predict_20225x = sqldf(
"
select a.*,b.* from
Data_Predict_2022a5x a,
variables_to_keep_2022_nolag5x b
on b.playerid = a.lag_playerid
and b.Season = a.lag_Season
"
) %>% select(-lag_playerid,lag_Season) %>%
filter(Season == 2021) %>%
select(one_of(variables_to_keep_2022),starts_with("Team_lev_x_"))This is the raw prediction score per IP for each pitcher
pitching_predictions5x = as.data.frame(predict(xgb_model_2022,Data_predict_20225x))
names(pitching_predictions5x) = c("Predict_Score")
Data_predict_2022_w_Pitching_Predictions5x = cbind(Data_predict_2022,pitching_predictions5x) %>% select(playerid,Predict_Score)
head(Data_predict_2022_w_Pitching_Predictions5x)NADownloaded from FanGraphs here.
Latest_2022_pitchingdata_FP = read_csv("FanGraph_Fantasy_Baseball_Pitching.csv")Rows: 817 Columns: 27
-- Column specification -------------------------------------------------------------------------------------------
Delimiter: ","
chr (3): Name, Team, playerid
dbl (24): GS, G, IP, W, L, QS, SV, HLD, H, ER, HR, SO, BB, WHIP, K/9, BB/9, ERA, FIP, WAR, RA9-WAR, ADP, InterS...
i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.
Latest_2022_pitchingdata_FPNAAs you can see from the chart below there aren’t many elite pitchers in the 87+ Predict score range.
Pitching_Data_NonAdj_Projections5x = sqldf(
"
select a.*,b.Predict_Score
from Latest_2022_pitchingdata_FP a
left join
Data_predict_2022_w_Pitching_Predictions5x b
on a.playerid = b.playerid
"
) %>% filter(ADP<370 | is.na(Predict_Score)==F)
Pitching_Data_Adj_Projections5x =
Pitching_Data_NonAdj_Projections5x %>%
mutate(
Avg_IP = 60,
AdjPredict_Score_raw = ifelse(is.na(Predict_Score),NA,Predict_Score*(IP/Avg_IP)),
max_predscore= max(AdjPredict_Score_raw,na.rm = T),
AdjPredict_Score = ifelse (is.na(AdjPredict_Score_raw),NA,AdjPredict_Score_raw *100/max_predscore),
WAR_rank = order(order(rank(WAR,ties.method = 'average'),decreasing = TRUE)),
AdjPredict_Score_Rank = order(order(rank(AdjPredict_Score,ties.method = 'average'),decreasing = TRUE))-sum(is.na(AdjPredict_Score)),
Ranks_Above_ADP = ADP - AdjPredict_Score_Rank
) %>% select (Name,ADP,WAR, WAR_rank,AdjPredict_Score ,AdjPredict_Score_Rank,Ranks_Above_ADP)
ggplot2::qplot(Pitching_Data_Adj_Projections5x$AdjPredict_Score, main="Predictions") + geom_histogram(colour="black", fill="grey") + theme_bw()`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
AdjPredict_Score are normalized to 100
tableexport =
Pitching_Data_Adj_Projections5x %>%
arrange (ADP,WAR) %>%
kbl() %>%
kable_material(c("striped", "hover","condensed","responsive"),full_width = F,fixed_thead = T)
save_kable(tableexport,file = "Pitching5x5.html")
#tableexportThis is a better formatted Table
ft_dt <- Pitching_Data_Adj_Projections5x[1:nrow(Pitching_Data_Adj_Projections5x), 1:ncol(Pitching_Data_Adj_Projections5x)] %>%
filter(AdjPredict_Score_Rank>0)%>% arrange((AdjPredict_Score_Rank))
ft_dt$ADP <- color_tile("white", "red")(ft_dt$ADP)
ft_dt$WAR <- color_bar("lightblue")(ft_dt$WAR)
ft_dt$AdjPredict_Score<- color_bar("lightblue")(ft_dt$AdjPredict_Score)
ft_dt$WAR_Rank <- color_tile("green","orange")(ft_dt$WAR_rank)
ft_dt$Predict_Rank <- color_tile("green","orange")(ft_dt$AdjPredict_Score_Rank)
ft_dt$Ranks_Above_ADP <-
ifelse(
ft_dt$Ranks_Above_ADP < 0,
cell_spec(round(ft_dt$Ranks_Above_ADP,2), color = "red", italic = T),
cell_spec(round(ft_dt$Ranks_Above_ADP,2), color = "green", italic = T)
)
ft_dt2 <- ft_dt[c("Name", "ADP", "WAR", "AdjPredict_Score", "WAR_Rank","Predict_Rank","Ranks_Above_ADP")]
table_export =
kbl(ft_dt2, escape = F) %>%
kable_material(c("striped", "hover","condensed","responsive"),full_width = F,fixed_thead = T) %>% column_spec(6, width = "3cm") %>%
add_header_above(c(" ", "Scores" = 3, "Ranks" = 2," "))
save_kable(table_export,file = "Pitching5x5_updated.html")
table_export Scores |
Ranks |
|||||
|---|---|---|---|---|---|---|
| Name | ADP | WAR | AdjPredict_Score | WAR_Rank | Predict_Rank | Ranks_Above_ADP |
| Dylan Cease | 81.2 | 3.3 | 100.0 | 20 | 1 | 80.2 |
| Eduardo Rodriguez | 150.1 | 3.5 | 99.8 | 15 | 2 | 148.1 |
| Shane Bieber | 31.2 | 4.4 | 93.2 | 4 | 3 | 28.2 |
| Aaron Nola | 41.0 | 4.2 | 89.8 | 7 | 4 | 37 |
| Nick Pivetta | 350.8 | 1.7 | 88.7 | 86 | 5 | 345.8 |
| Tarik Skubal | 197.6 | 2.0 | 88.0 | 61 | 6 | 191.6 |
| Germán Márquez | 261.7 | 2.9 | 88.0 | 29 | 7 | 254.7 |
| Tyler Mahle | 132.2 | 2.6 | 87.8 | 39 | 8 | 124.2 |
| Brady Singer | 492.4 | 2.2 | 87.8 | 53 | 9 | 483.4 |
| Jon Gray | 236.9 | 2.3 | 87.5 | 51 | 10 | 226.9 |
| Shane McClanahan | 109.6 | 2.6 | 86.3 | 40 | 11 | 98.6 |
| Brad Keller | 582.1 | 1.6 | 85.3 | 91 | 12 | 570.1 |
| Luis Castillo | 103.2 | 3.3 | 84.9 | 21 | 13 | 90.2 |
| Zac Gallen | 154.3 | 2.3 | 84.8 | 50 | 14 | 140.3 |
| Josiah Gray | 290.7 | 1.6 | 84.3 | 89 | 15 | 275.7 |
| José Quintana | 575.5 | 1.0 | 84.2 | 143 | 16 | 559.5 |
| Patrick Corbin | 456.0 | 1.6 | 82.6 | 95 | 17 | 439 |
| Alek Manoah | 93.0 | 2.8 | 82.0 | 33 | 18 | 75 |
| Gerrit Cole | 7.1 | 5.0 | 81.9 | 3 | 19 | -11.9 |
| Yusei Kikuchi | 301.3 | 1.7 | 81.1 | 83 | 20 | 281.3 |
| Sean Manaea | 141.4 | 2.8 | 80.6 | 31 | 21 | 120.4 |
| Logan Gilbert | 161.8 | 2.2 | 80.0 | 52 | 22 | 139.8 |
| Lucas Giolito | 44.2 | 3.7 | 79.5 | 10 | 23 | 21.2 |
| Logan Webb | 69.9 | 3.7 | 79.1 | 11 | 24 | 45.9 |
| Andrew Heaney | 299.4 | 1.4 | 78.9 | 102 | 25 | 274.4 |
| Blake Snell | 113.4 | 2.4 | 78.6 | 49 | 26 | 87.4 |
| Frankie Montas | 92.8 | 3.3 | 78.6 | 19 | 27 | 65.8 |
| Corbin Burnes | 10.8 | 5.3 | 77.9 | 1 | 28 | -17.2 |
| Freddy Peralta | 54.0 | 3.6 | 77.9 | 12 | 29 | 25 |
| Kyle Hendricks | 279.8 | 1.7 | 77.6 | 79 | 30 | 249.8 |
| Jesús Luzardo | 289.5 | 0.8 | 76.2 | 158 | 31 | 258.5 |
| Walker Buehler | 18.6 | 3.8 | 75.6 | 8 | 32 | -13.4 |
| Framber Valdez | 149.6 | 3.1 | 75.0 | 26 | 33 | 116.6 |
| Aaron Civale | 269.5 | 1.5 | 74.8 | 96 | 34 | 235.5 |
| José Berríos | 77.5 | 3.2 | 74.6 | 23 | 35 | 42.5 |
| Steven Matz | 249.2 | 1.9 | 74.5 | 67 | 36 | 213.2 |
| Luis Garcia | 144.7 | 2.0 | 74.4 | 64 | 37 | 107.7 |
| Robbie Ray | 49.9 | 3.0 | 74.2 | 27 | 38 | 11.9 |
| Kevin Gausman | 71.4 | 3.2 | 73.8 | 24 | 39 | 32.4 |
| Ian Anderson | 151.1 | 2.1 | 73.5 | 56 | 40 | 111.1 |
| Sonny Gray | 168.2 | 2.7 | 73.3 | 38 | 41 | 127.2 |
| Yu Darvish | 100.3 | 2.7 | 72.2 | 34 | 42 | 58.3 |
| Kyle Gibson | 393.9 | 1.9 | 72.1 | 71 | 43 | 350.9 |
| Cole Irvin | 532.2 | 1.5 | 71.8 | 97 | 44 | 488.2 |
| Nathan Eovaldi | 135.1 | 3.6 | 71.6 | 13 | 45 | 90.1 |
| Trevor Rogers | 92.9 | 3.1 | 71.2 | 25 | 46 | 46.9 |
| Mitch Keller | 531.5 | 1.2 | 71.0 | 123 | 47 | 484.5 |
| Triston McKenzie | 229.0 | 1.6 | 70.7 | 93 | 48 | 181 |
| Julio Urías | 34.7 | 3.5 | 70.5 | 14 | 49 | -14.3 |
| Jordan Montgomery | 208.7 | 2.7 | 70.4 | 35 | 50 | 158.7 |
| Patrick Sandoval | 196.0 | 2.4 | 69.9 | 46 | 51 | 145 |
| Zac Lowther | 999.0 | 0.8 | 69.2 | 159 | 52 | 947 |
| Antonio Senzatela | 585.0 | 2.0 | 68.9 | 62 | 53 | 532 |
| JT Brubaker | 537.7 | 1.4 | 68.3 | 105 | 54 | 483.7 |
| Kris Bubic | 536.0 | 1.1 | 68.1 | 125 | 55 | 481 |
| Bruce Zimmermann | 600.8 | 1.3 | 67.9 | 114 | 56 | 544.8 |
| Shohei Ohtani | 9.2 | 2.8 | 67.9 | 32 | 57 | -47.8 |
| Dylan Bundy | 498.5 | 1.3 | 67.7 | 109 | 58 | 440.5 |
| Luis Patiño | 290.7 | 1.5 | 67.5 | 99 | 59 | 231.7 |
| Mike Minor | 532.2 | 2.0 | 67.5 | 63 | 60 | 472.2 |
| Kyle Freeland | 574.4 | 1.5 | 67.4 | 98 | 61 | 513.4 |
| Max Fried | 69.8 | 3.5 | 67.3 | 16 | 62 | 7.8 |
| Zach Plesac | 346.6 | 1.5 | 66.8 | 100 | 63 | 283.6 |
| Dane Dunning | 484.6 | 1.9 | 66.7 | 72 | 64 | 420.6 |
| Hyun Jin Ryu | 207.3 | 2.4 | 66.6 | 45 | 65 | 142.3 |
| Alex Cobb | 242.9 | 1.9 | 66.4 | 70 | 66 | 176.9 |
| Brandon Woodruff | 21.2 | 4.2 | 66.3 | 6 | 67 | -45.8 |
| Tanner Houck | 206.4 | 2.4 | 66.2 | 47 | 68 | 138.4 |
| Joe Ryan | 221.5 | 2.2 | 66.1 | 54 | 69 | 152.5 |
| Jordan Lyles | 578.9 | 0.6 | 65.8 | 196 | 70 | 508.9 |
| Joe Musgrove | 70.6 | 3.2 | 65.4 | 22 | 71 | -0.4 |
| Max Scherzer | 19.7 | 3.8 | 65.1 | 9 | 72 | -52.3 |
| Dallas Keuchel | 567.3 | 1.2 | 64.9 | 119 | 73 | 494.3 |
| Alex Wood | 219.9 | 1.7 | 64.8 | 78 | 74 | 145.9 |
| Carlos Hernández | 412.5 | 1.2 | 64.4 | 121 | 75 | 337.5 |
| Bailey Ober | 261.3 | 2.2 | 64.4 | 55 | 76 | 185.3 |
| Chris Flexen | 413.6 | 1.9 | 64.0 | 73 | 77 | 336.6 |
| Michael Kopech | 161.1 | 2.5 | 63.5 | 43 | 78 | 83.1 |
| Matt Manning | 543.4 | 1.1 | 63.1 | 124 | 79 | 464.4 |
| Sandy Alcantara | 43.4 | 3.3 | 63.0 | 18 | 80 | -36.6 |
| Huascar Ynoa | 245.8 | 1.8 | 63.0 | 76 | 81 | 164.8 |
| Zack Wheeler | 31.0 | 4.4 | 62.9 | 5 | 82 | -51 |
| Cristian Javier | 284.0 | 1.2 | 62.8 | 118 | 83 | 201 |
| Charlie Morton | 94.3 | 2.9 | 62.5 | 30 | 84 | 10.3 |
| Austin Gomber | 551.9 | 1.2 | 62.4 | 116 | 85 | 466.9 |
| Reid Detmers | 427.4 | 1.1 | 62.4 | 134 | 86 | 341.4 |
| Lance Lynn | 70.0 | 3.4 | 62.0 | 17 | 87 | -17 |
| Keegan Akin | 600.0 | 1.0 | 61.8 | 139 | 88 | 512 |
| Chris Bassitt | 128.5 | 2.4 | 61.7 | 48 | 89 | 39.5 |
| Ranger Suárez | 179.3 | 2.4 | 61.2 | 44 | 90 | 89.3 |
| Carlos Rodón | 110.2 | 2.9 | 61.2 | 28 | 91 | 19.2 |
| Merrill Kelly | 526.0 | 1.7 | 60.8 | 81 | 92 | 434 |
| Spencer Howard | 575.8 | 0.9 | 60.6 | 150 | 93 | 482.8 |
| Vladimir Gutierrez | 585.4 | 0.2 | 59.4 | 322 | 94 | 491.4 |
| Corey Kluber | 324.4 | 1.1 | 58.9 | 130 | 95 | 229.4 |
| Cal Quantrill | 267.3 | 1.7 | 58.9 | 82 | 96 | 171.3 |
| Carlos Carrasco | 284.6 | 1.3 | 58.6 | 110 | 97 | 187.6 |
| Casey Mize | 279.4 | 1.7 | 58.5 | 84 | 98 | 181.4 |
| Zach Davies | 595.9 | 0.2 | 58.4 | 343 | 99 | 496.9 |
| James Kaprielian | 448.3 | 1.1 | 58.4 | 126 | 100 | 348.3 |
| Aaron Ashby | 264.1 | 1.5 | 58.2 | 101 | 101 | 163.1 |
| Ryan Yarbrough | 541.5 | 1.4 | 58.0 | 106 | 102 | 439.5 |
| Marco Gonzales | 319.6 | 1.7 | 57.9 | 80 | 103 | 216.6 |
| Erick Fedde | 589.7 | 0.6 | 57.8 | 188 | 104 | 485.7 |
| Pablo López | 140.6 | 2.7 | 57.6 | 36 | 105 | 35.6 |
| Tony Gonsolin | 300.6 | 0.9 | 57.4 | 148 | 106 | 194.6 |
| J.A. Happ | 597.0 | 0.6 | 57.4 | 199 | 107 | 490 |
| Elieser Hernandez | 339.6 | 1.0 | 57.3 | 145 | 108 | 231.6 |
| Eric Lauer | 306.3 | 1.4 | 57.3 | 103 | 109 | 197.3 |
| José Suarez | 535.1 | 1.3 | 57.2 | 113 | 110 | 425.1 |
| Taylor Hearn | 591.2 | 0.7 | 56.8 | 183 | 111 | 480.2 |
| Wade Miley | 502.0 | 1.2 | 56.5 | 120 | 112 | 390 |
| Marcus Stroman | 187.5 | 2.6 | 56.5 | 41 | 113 | 74.5 |
| Alec Mills | 594.1 | 0.6 | 55.6 | 211 | 114 | 480.1 |
| Chris Sale | 85.9 | 1.9 | 55.4 | 66 | 115 | -29.1 |
| Anthony DeSclafani | 212.6 | 1.8 | 54.9 | 77 | 116 | 96.6 |
| Jack Flaherty | 115.4 | 1.6 | 54.8 | 87 | 117 | -1.6 |
| Jameson Taillon | 299.6 | 2.1 | 54.7 | 59 | 118 | 181.6 |
| Clayton Kershaw | 149.4 | 2.5 | 54.2 | 42 | 119 | 30.4 |
| José Urquidy | 212.1 | 2.0 | 54.1 | 65 | 120 | 92.1 |
| Taijuan Walker | 414.5 | 0.6 | 53.7 | 185 | 121 | 293.5 |
| Madison Bumgarner | 507.1 | 1.1 | 53.7 | 131 | 122 | 385.1 |
| Michael Lorenzen | 543.3 | 0.6 | 53.4 | 203 | 123 | 420.3 |
| Zach Eflin | 485.0 | 1.9 | 53.3 | 69 | 124 | 361 |
| Jacob deGrom | 19.7 | 5.1 | 53.0 | 2 | 125 | -105.3 |
| Tylor Megill | 334.2 | 1.1 | 52.7 | 129 | 126 | 208.2 |
| Bryse Wilson | 600.5 | 0.5 | 52.3 | 223 | 127 | 473.5 |
| Chris Paddack | 436.8 | 1.2 | 52.2 | 117 | 128 | 308.8 |
| Wil Crowe | 600.8 | 0.1 | 51.9 | 379 | 129 | 471.8 |
| Drew Rasmussen | 282.4 | 1.6 | 51.5 | 88 | 130 | 152.4 |
| Stephen Strasburg | 294.0 | 1.3 | 51.5 | 111 | 131 | 163 |
| Drew Smyly | 577.4 | 0.6 | 51.3 | 208 | 132 | 445.4 |
| Daniel Bard | 579.3 | 0.4 | 51.1 | 266 | 133 | 446.3 |
| Adrian Houser | 465.0 | 1.4 | 50.5 | 104 | 134 | 331 |
| Luke Weaver | 514.4 | 1.1 | 50.3 | 128 | 135 | 379.4 |
| Martín Pérez | 598.8 | 0.8 | 49.4 | 168 | 136 | 462.8 |
| John Means | 217.1 | 2.7 | 49.2 | 37 | 137 | 80.1 |
| Daniel Lynch | 559.5 | 0.7 | 48.9 | 174 | 138 | 421.5 |
| Michael Wacha | 549.7 | 0.9 | 48.9 | 149 | 139 | 410.7 |
| A.J. Alexy | 596.2 | 0.6 | 48.5 | 186 | 140 | 456.2 |
| Edward Cabrera | 523.6 | 0.6 | 47.9 | 202 | 141 | 382.6 |
| Glenn Otto | 582.9 | 0.9 | 47.2 | 152 | 142 | 440.9 |
| José Alvarado | 587.6 | 0.5 | 47.2 | 237 | 143 | 444.6 |
| Adam Wainwright | 194.7 | 2.1 | 46.9 | 58 | 144 | 50.7 |
| Zack Greinke | 312.8 | 1.6 | 46.7 | 90 | 145 | 167.8 |
| Adam Ottavino | 598.1 | 0.2 | 46.3 | 328 | 146 | 452.1 |
| James Karinchak | 468.6 | 0.7 | 45.3 | 171 | 147 | 321.6 |
| Nestor Cortes | 333.1 | 1.4 | 45.3 | 107 | 148 | 185.1 |
| Randy Dobnak | 600.4 | 0.8 | 45.3 | 162 | 149 | 451.4 |
| Aroldis Chapman | 83.3 | 0.9 | 45.2 | 151 | 150 | -66.7 |
| Rich Hill | 477.4 | 0.7 | 44.9 | 173 | 151 | 326.4 |
| Garrett Whitlock | 239.2 | 1.2 | 44.7 | 122 | 152 | 87.2 |
| Johnny Cueto | 585.8 | 0.3 | 44.7 | 283 | 153 | 432.8 |
| Justus Sheffield | 599.5 | 0.4 | 44.0 | 269 | 154 | 445.5 |
| Michael Pineda | 465.2 | 1.4 | 43.9 | 108 | 155 | 310.2 |
| Josh Fleming | 600.8 | 0.5 | 43.8 | 222 | 156 | 444.8 |
| Trevor May | 565.3 | 0.4 | 43.6 | 239 | 157 | 408.3 |
| Kyle Finnegan | 351.5 | 0.2 | 43.4 | 336 | 158 | 193.5 |
| Matt Barnes | 245.2 | 0.8 | 43.2 | 164 | 159 | 86.2 |
| Tyler Anderson | 564.5 | 0.7 | 43.0 | 172 | 160 | 404.5 |
| Paul Sewald | 289.0 | 0.5 | 43.0 | 234 | 161 | 128 |
| Jake Diekman | 585.9 | 0.5 | 42.9 | 232 | 162 | 423.9 |
| Héctor Neris | 493.9 | 0.4 | 42.9 | 242 | 163 | 330.9 |
| Tanner Rainey | 371.9 | 0.2 | 42.8 | 325 | 164 | 207.9 |
| Gregory Soto | 194.3 | 0.5 | 42.6 | 228 | 165 | 29.3 |
| Miles Mikolas | 512.8 | 1.6 | 42.5 | 94 | 166 | 346.8 |
| Amir Garrett | 586.6 | 0.2 | 42.5 | 333 | 167 | 419.6 |
| Phil Maton | 600.4 | 0.4 | 42.2 | 248 | 168 | 432.4 |
| Dillon Peters | 600.7 | 0.0 | 42.1 | 421 | 169 | 431.7 |
| Scott Barlow | 159.3 | 0.9 | 41.8 | 155 | 170 | -10.7 |
| Carlos Estévez | 528.6 | 0.2 | 41.7 | 337 | 171 | 357.6 |
| David Peterson | 595.2 | 0.7 | 41.6 | 175 | 172 | 423.2 |
| Lance McCullers Jr. | 260.0 | 1.7 | 41.4 | 85 | 173 | 87 |
| Pete Fairbanks | 517.3 | 0.6 | 40.9 | 194 | 174 | 343.3 |
| Lucas Sims | 243.6 | 0.7 | 40.9 | 180 | 175 | 68.6 |
| Justin Steele | 598.7 | 0.4 | 40.8 | 245 | 176 | 422.7 |
| Paolo Espino | 599.2 | 0.2 | 40.7 | 327 | 177 | 422.2 |
| Edwin Díaz | 63.5 | 1.1 | 40.6 | 135 | 178 | -114.5 |
| Aaron Bummer | 595.4 | 1.0 | 40.6 | 137 | 179 | 416.4 |
| Chad Green | 403.2 | 1.0 | 40.4 | 147 | 180 | 223.2 |
| Seth Lugo | 598.1 | 0.5 | 40.3 | 236 | 181 | 417.1 |
| Chris Stratton | 519.4 | 0.3 | 40.2 | 284 | 182 | 337.4 |
| Tanner Scott | 599.0 | 0.7 | 40.0 | 182 | 183 | 416 |
| David Price | 581.5 | 0.5 | 39.4 | 216 | 184 | 397.5 |
| Zach Thompson | 546.8 | 1.0 | 39.3 | 146 | 185 | 361.8 |
| Mike Mayers | 594.7 | 0.5 | 39.2 | 221 | 186 | 408.7 |
| Griffin Canning | 594.2 | 0.5 | 39.1 | 213 | 187 | 407.2 |
| Sam Hentges | 999.0 | 0.3 | 38.9 | 306 | 188 | 811 |
| Devin Williams | 287.1 | 1.1 | 38.9 | 133 | 189 | 98.1 |
| Heath Hembree | 600.8 | 0.2 | 38.7 | 338 | 190 | 410.8 |
| Mychal Givens | 549.7 | 0.1 | 38.5 | 419 | 191 | 358.7 |
| Alex Colomé | 378.0 | 0.1 | 38.5 | 387 | 192 | 186 |
| Domingo Germán | 494.6 | 1.1 | 38.4 | 132 | 193 | 301.6 |
| Raisel Iglesias | 47.6 | 1.0 | 38.1 | 138 | 194 | -146.4 |
| Brad Boxberger | 588.3 | 0.2 | 38.1 | 331 | 195 | 393.3 |
| Brent Suter | 587.6 | 0.4 | 38.1 | 268 | 196 | 391.6 |
| Giovanny Gallegos | 113.5 | 1.0 | 38.1 | 140 | 197 | -83.5 |
| Daniel Norris | 999.0 | 0.1 | 38.0 | 398 | 198 | 801 |
| Hansel Robles | 596.0 | -0.1 | 37.9 | 505 | 199 | 397 |
| Austin Voth | 600.4 | 0.0 | 37.8 | 420 | 200 | 400.4 |
| Jaime Barría | 999.0 | 0.5 | 37.7 | 238 | 201 | 798 |
| Jordan Romano | 88.3 | 0.8 | 37.7 | 160 | 202 | -113.7 |
| Kyle Funkhouser | 999.0 | 0.1 | 37.6 | 386 | 203 | 796 |
| Will Smith | 120.8 | 0.5 | 37.4 | 227 | 204 | -83.2 |
| Taylor Rogers | 172.5 | 1.1 | 37.3 | 127 | 205 | -32.5 |
| Yimi García | 581.1 | 0.2 | 37.3 | 352 | 206 | 375.1 |
| Tyler Wells | 454.6 | 0.9 | 37.2 | 153 | 207 | 247.6 |
| Camilo Doval | 159.8 | 0.5 | 37.1 | 229 | 208 | -48.2 |
| Justin Dunn | 577.6 | 0.6 | 37.1 | 198 | 209 | 368.6 |
| Joe Jiménez | 999.0 | 0.2 | 36.9 | 329 | 210 | 789 |
| Garrett Richards | 598.7 | 0.4 | 36.8 | 272 | 211 | 387.7 |
| Hirokazu Sawamura | 589.6 | 0.0 | 36.6 | 443 | 212 | 377.6 |
| Jake Odorizzi | 533.5 | 0.8 | 36.6 | 163 | 213 | 320.5 |
| Ryne Stanek | 598.8 | 0.2 | 36.6 | 330 | 214 | 384.8 |
| J.C. Mejía | 999.0 | 0.1 | 36.4 | 389 | 215 | 784 |
| Bryan Shaw | 597.8 | -0.1 | 36.3 | 507 | 216 | 381.8 |
| Luis Gil | 525.9 | 0.8 | 36.2 | 165 | 217 | 308.9 |
| Rowan Wick | 302.0 | 0.4 | 36.2 | 249 | 218 | 84 |
| Rafael Dolis | 999.0 | 0.0 | 36.1 | 434 | 219 | 780 |
| Tyler Alexander | 586.8 | 0.8 | 36.1 | 167 | 220 | 366.8 |
| Liam Hendriks | 32.0 | 1.6 | 36.1 | 92 | 221 | -189 |
| J.B. Wendelken | 600.1 | 0.2 | 36.0 | 341 | 222 | 378.1 |
| Brad Hand | 520.0 | 0.2 | 35.9 | 351 | 223 | 297 |
| Jeurys Familia | 597.5 | 0.3 | 35.8 | 305 | 224 | 373.5 |
| Trevor Stephan | 999.0 | 0.1 | 35.8 | 368 | 225 | 774 |
| Michael Fulmer | 347.0 | 0.6 | 35.8 | 192 | 226 | 121 |
| Griffin Jax | 600.8 | 0.1 | 35.7 | 363 | 227 | 373.8 |
| Ryan Pressly | 64.3 | 1.0 | 35.5 | 136 | 228 | -163.7 |
| Jackson Kowar | 594.6 | 0.5 | 35.5 | 233 | 229 | 365.6 |
| José Ureña | 999.0 | 0.0 | 35.3 | 467 | 230 | 769 |
| Blake Treinen | 147.6 | 0.8 | 35.3 | 166 | 231 | -83.4 |
| José Cisnero | 999.0 | 0.3 | 35.2 | 281 | 232 | 767 |
| Cole Sulser | 427.2 | 0.7 | 35.1 | 177 | 233 | 194.2 |
| Joely Rodríguez | 600.9 | 0.5 | 35.1 | 214 | 234 | 366.9 |
| Lou Trivino | 237.2 | 0.2 | 35.1 | 316 | 235 | 2.2 |
| Jeff Hoffman | 999.0 | 0.0 | 35.0 | 426 | 236 | 763 |
| Craig Kimbrel | 164.4 | 0.7 | 34.9 | 176 | 237 | -72.6 |
| Jake Brentz | 600.8 | 0.3 | 34.8 | 301 | 238 | 362.8 |
| Diego Castillo | 409.2 | 0.6 | 34.7 | 191 | 239 | 170.2 |
| Caleb Smith | 592.3 | 0.0 | 34.6 | 441 | 240 | 352.3 |
| Josh Sborz | 600.9 | 0.3 | 34.5 | 290 | 241 | 359.9 |
| Chad Kuhl | 600.7 | 0.2 | 34.4 | 349 | 242 | 358.7 |
| Dean Kremer | 999.0 | 0.6 | 34.4 | 210 | 243 | 756 |
| Keegan Thompson | 999.0 | 0.0 | 34.4 | 429 | 244 | 755 |
| Matt Wisler | 599.6 | 0.4 | 34.3 | 270 | 245 | 354.6 |
| Ryan Helsley | 600.2 | 0.1 | 34.3 | 380 | 246 | 354.2 |
| Anthony Banda | 999.0 | -0.1 | 34.2 | 514 | 247 | 752 |
| Andrew Kittredge | 257.6 | 0.8 | 34.1 | 169 | 248 | 9.6 |
| Garrett Crochet | 549.8 | 0.9 | 34.0 | 154 | 249 | 300.8 |
| Tony Santillan | 600.3 | 0.4 | 34.0 | 263 | 250 | 350.3 |
| Eli Morgan | 600.5 | 0.2 | 33.9 | 346 | 251 | 349.5 |
| Kenley Jansen | 90.0 | 0.6 | 33.7 | 187 | 252 | -162 |
| Andrew Wantz | 999.0 | 0.4 | 33.7 | 262 | 253 | 746 |
| Spencer Patton | 599.0 | 0.1 | 33.4 | 410 | 254 | 345 |
| Josh Hader | 30.5 | 1.3 | 33.3 | 115 | 255 | -224.5 |
| Ross Stripling | 583.7 | 0.4 | 33.2 | 256 | 256 | 327.7 |
| Trevor Richards | 600.5 | 0.4 | 33.1 | 261 | 257 | 343.5 |
| Génesis Cabrera | 596.6 | 0.6 | 32.9 | 205 | 258 | 338.6 |
| Robert Stephenson | 590.5 | 0.0 | 32.8 | 466 | 259 | 331.5 |
| Vince Velasquez | 599.4 | 0.3 | 32.8 | 308 | 260 | 339.4 |
| Matthew Boyd | 591.0 | 1.0 | 32.4 | 144 | 261 | 330 |
| Josh Staumont | 505.8 | 0.4 | 32.4 | 244 | 262 | 243.8 |
| Adbert Alzolay | 444.0 | 0.7 | 32.4 | 179 | 263 | 181 |
| Corey Knebel | 151.3 | 0.6 | 32.4 | 204 | 264 | -112.7 |
| Anthony Misiewicz | 999.0 | 0.3 | 32.3 | 295 | 265 | 734 |
| Dylan Floro | 221.3 | 0.4 | 32.3 | 251 | 266 | -44.7 |
| A.J. Minter | 596.5 | 0.5 | 32.3 | 215 | 267 | 329.5 |
| Dinelson Lamet | 400.5 | 0.9 | 32.2 | 157 | 268 | 132.5 |
| Brusdar Graterol | 567.4 | 0.7 | 32.0 | 178 | 269 | 298.4 |
| Sam Howard | 999.0 | 0.3 | 32.0 | 310 | 270 | 729 |
| Sam Coonrod | 600.9 | 0.4 | 31.8 | 271 | 271 | 329.9 |
| Jonathan Loáisiga | 393.7 | 1.0 | 31.8 | 142 | 272 | 121.7 |
| Clay Holmes | 600.3 | 0.6 | 31.7 | 201 | 273 | 327.3 |
| Mark Melancon | 131.8 | 0.4 | 31.6 | 247 | 274 | -142.2 |
| Jakob Junis | 600.1 | 0.2 | 31.6 | 332 | 275 | 325.1 |
| Archie Bradley | 600.7 | 0.2 | 31.5 | 348 | 276 | 324.7 |
| Tyler Duffey | 577.8 | 0.4 | 31.3 | 260 | 277 | 300.8 |
| Brett Anderson | 600.9 | 0.6 | 31.3 | 190 | 278 | 322.9 |
| Anthony Bender | 412.8 | 0.7 | 31.1 | 184 | 279 | 133.8 |
| Tim Mayza | 599.3 | 0.6 | 31.1 | 197 | 280 | 319.3 |
| Wandy Peralta | 999.0 | 0.2 | 30.9 | 340 | 281 | 718 |
| Ian Kennedy | 409.8 | -0.1 | 30.9 | 512 | 282 | 127.8 |
| David Bednar | 190.8 | 0.9 | 30.8 | 156 | 283 | -92.2 |
| Caleb Thielbar | 999.0 | 0.6 | 30.8 | 195 | 284 | 715 |
| Emmanuel Clase | 57.5 | 1.3 | 30.7 | 112 | 285 | -227.5 |
| Jorge Alcala | 459.0 | 0.6 | 30.7 | 209 | 286 | 173 |
| Collin McHugh | 545.5 | 0.6 | 30.4 | 189 | 287 | 258.5 |
| Trevor Williams | 601.0 | 0.2 | 30.4 | 326 | 288 | 313 |
| Emilio Pagán | 551.7 | 0.1 | 30.4 | 364 | 289 | 262.7 |
| Patrick Murphy | 999.0 | 0.4 | 30.2 | 254 | 290 | 709 |
| J.P. Feyereisen | 597.6 | 0.0 | 30.1 | 487 | 291 | 306.6 |
| Kendall Graveman | 531.0 | 0.4 | 30.1 | 240 | 292 | 239 |
| Jorge López | 599.9 | 0.4 | 30.0 | 265 | 293 | 306.9 |
| Tyler Rogers | 534.3 | 0.5 | 30.0 | 230 | 294 | 240.3 |
| Kyle Zimmer | 999.0 | -0.2 | 29.7 | 528 | 295 | 704 |
| Max Kranick | 999.0 | 0.3 | 29.6 | 297 | 296 | 703 |
| Joe Ross | 599.6 | 0.7 | 29.4 | 170 | 297 | 302.6 |
| Sergio Romo | 600.5 | 0.0 | 29.4 | 450 | 298 | 302.5 |
| Brooks Raley | 600.9 | 0.3 | 29.3 | 296 | 299 | 301.9 |
| Daniel Hudson | 550.2 | 0.5 | 29.1 | 225 | 300 | 250.2 |
| Pierce Johnson | 433.3 | 0.6 | 29.1 | 200 | 301 | 132.3 |
| Ryan Hendrix | 999.0 | 0.0 | 29.1 | 468 | 302 | 697 |
| Tim Hill | 999.0 | 0.3 | 29.0 | 293 | 303 | 696 |
| Nick Wittgren | 999.0 | 0.1 | 29.0 | 374 | 304 | 695 |
| Mike Foltynewicz | 600.4 | -0.1 | 28.9 | 511 | 305 | 295.4 |
| Cionel Pérez | 999.0 | 0.5 | 28.8 | 235 | 306 | 693 |
| Lucas Gilbreath | 596.4 | 0.0 | 28.7 | 471 | 307 | 289.4 |
| Kwang Hyun Kim | 593.0 | 0.3 | 28.6 | 303 | 308 | 285 |
| Taylor Widener | 600.5 | 0.0 | 28.4 | 424 | 309 | 291.5 |
| Alex Vesia | 600.0 | 0.3 | 28.4 | 289 | 310 | 290 |
| Sean Doolittle | 598.5 | 0.1 | 28.3 | 402 | 311 | 287.5 |
| Rafael Montero | 999.0 | 0.1 | 28.0 | 394 | 312 | 687 |
| Connor Brogdon | 600.4 | 0.4 | 27.9 | 252 | 313 | 287.4 |
| Jake Woodford | 599.5 | 0.0 | 27.9 | 460 | 314 | 285.5 |
| John King | 999.0 | 0.5 | 27.7 | 219 | 315 | 684 |
| Josh Taylor | 999.0 | 0.5 | 27.7 | 217 | 316 | 683 |
| José Quijada | 999.0 | 0.5 | 27.6 | 212 | 317 | 682 |
| Josh Tomlin | 999.0 | -0.3 | 27.4 | 532 | 318 | 681 |
| Sean Newcomb | 600.1 | 0.1 | 27.3 | 400 | 319 | 281.1 |
| Tyler Kinley | 999.0 | 0.0 | 27.3 | 433 | 320 | 679 |
| Justin Wilson | 999.0 | 0.1 | 27.0 | 377 | 321 | 678 |
| Luke Jackson | 598.1 | 0.4 | 27.0 | 267 | 322 | 276.1 |
| Bailey Falter | 599.8 | 0.5 | 27.0 | 218 | 323 | 276.8 |
| Jake McGee | 251.1 | 0.4 | 27.0 | 241 | 324 | -72.9 |
| Darwinzon Hernandez | 999.0 | 0.2 | 27.0 | 324 | 325 | 674 |
| Tucker Davidson | 584.9 | 0.6 | 26.9 | 193 | 326 | 258.9 |
| Duane Underwood Jr. | 999.0 | 0.1 | 26.8 | 417 | 327 | 672 |
| Ryan Tepera | 581.3 | 0.4 | 26.7 | 246 | 328 | 253.3 |
| Matt Harvey | 999.0 | 0.3 | 26.7 | 286 | 329 | 670 |
| Miguel Castro | 600.9 | 0.1 | 26.6 | 372 | 330 | 270.9 |
| Johan Oviedo | 999.0 | 0.2 | 26.3 | 356 | 331 | 668 |
| Austin Warren | 999.0 | 0.5 | 26.2 | 231 | 332 | 667 |
| Joe Smith | 999.0 | 0.1 | 26.1 | 361 | 333 | 666 |
| Bryan Garcia | 999.0 | -0.2 | 26.1 | 530 | 334 | 665 |
| Nick Sandlin | 599.8 | 0.6 | 26.1 | 207 | 335 | 264.8 |
| Drew Steckenrider | 401.5 | 0.3 | 26.1 | 294 | 336 | 65.5 |
| Alexander Wells | 999.0 | 0.4 | 25.9 | 276 | 337 | 662 |
| Alex Lange | 999.0 | 0.1 | 25.8 | 366 | 338 | 661 |
| Sean Reid-Foley | 999.0 | 0.0 | 25.7 | 438 | 339 | 660 |
| Jake Cousins | 600.1 | 0.5 | 25.6 | 226 | 340 | 260.1 |
| Zach Pop | 999.0 | 0.2 | 25.5 | 345 | 341 | 658 |
| Logan Allen | 600.8 | 0.1 | 25.3 | 418 | 342 | 258.8 |
| Sammy Long | 596.1 | 0.3 | 25.2 | 307 | 343 | 253.1 |
| Demarcus Evans | 600.6 | 0.0 | 25.2 | 472 | 344 | 256.6 |
| Zack Littell | 599.3 | 0.1 | 25.2 | 367 | 345 | 254.3 |
| JT Chargois | 600.3 | 0.3 | 25.2 | 279 | 346 | 254.3 |
| Reynaldo López | 557.4 | 0.4 | 25.2 | 257 | 347 | 210.4 |
| Jarlín García | 600.8 | 0.1 | 25.1 | 411 | 348 | 252.8 |
| Dillon Tate | 600.9 | 0.4 | 25.0 | 264 | 349 | 251.9 |
| Matt Shoemaker | 999.0 | 0.0 | 25.0 | 457 | 350 | 649 |
| Steven Brault | 600.9 | 0.0 | 24.8 | 432 | 351 | 249.9 |
| Yency Almonte | 999.0 | -0.2 | 24.8 | 524 | 352 | 647 |
| Brett Martin | 999.0 | 0.4 | 24.7 | 275 | 353 | 646 |
| Craig Stammen | 599.4 | 0.4 | 24.7 | 259 | 354 | 245.4 |
| Anthony Bass | 600.9 | 0.1 | 24.6 | 412 | 355 | 245.9 |
| Albert Abreu | 999.0 | -0.1 | 24.6 | 496 | 356 | 643 |
| Jeffrey Springs | 999.0 | 0.3 | 24.5 | 287 | 357 | 642 |
| Austin Davis | 999.0 | 0.2 | 24.5 | 320 | 358 | 641 |
| Phil Bickford | 600.9 | 0.4 | 24.5 | 258 | 359 | 241.9 |
| Tyler Clippard | 600.8 | -0.2 | 24.5 | 529 | 360 | 240.8 |
| Blake Taylor | 999.0 | 0.2 | 24.4 | 353 | 361 | 638 |
| Greg Holland | 600.2 | 0.0 | 24.4 | 473 | 362 | 238.2 |
| Dominic Leone | 600.8 | 0.1 | 24.2 | 371 | 363 | 237.8 |
| Alex Reyes | 395.7 | 0.2 | 24.0 | 335 | 364 | 31.7 |
| Ryan Weathers | 600.8 | 0.3 | 24.0 | 300 | 365 | 235.8 |
| Austin Adams | 600.0 | 0.4 | 24.0 | 273 | 366 | 234 |
| Josh Rogers | 600.8 | -0.1 | 23.8 | 506 | 367 | 233.8 |
| Art Warren | 482.3 | 0.8 | 23.7 | 161 | 368 | 114.3 |
| Yusmeiro Petit | 601.0 | -0.3 | 23.6 | 531 | 369 | 232 |
| Sean Poppen | 999.0 | 0.2 | 23.6 | 319 | 370 | 629 |
| Tyler Gilbert | 599.9 | 0.4 | 23.5 | 255 | 371 | 228.9 |
| Adam Morgan | 999.0 | 0.1 | 23.4 | 391 | 372 | 627 |
| Michael King | 600.5 | 0.5 | 23.4 | 220 | 373 | 227.5 |
| Chris Martin | 600.3 | 0.4 | 23.4 | 277 | 374 | 226.3 |
| Dennis Santana | 999.0 | 0.0 | 23.1 | 442 | 375 | 624 |
| Jacob Webb | 999.0 | 0.0 | 23.1 | 447 | 376 | 623 |
| Joe Kelly | 600.4 | 0.4 | 23.1 | 243 | 377 | 223.4 |
| Wander Suero | 999.0 | 0.2 | 23.0 | 317 | 378 | 621 |
| Chi Chi González | 999.0 | 0.1 | 22.9 | 396 | 379 | 620 |
| Brad Brach | 999.0 | 0.0 | 22.8 | 436 | 380 | 619 |
| Kolby Allard | 601.0 | 0.2 | 22.7 | 312 | 381 | 220 |
| Brandon Kintzler | 999.0 | 0.0 | 22.6 | 464 | 382 | 617 |
| Trevor Bauer | 208.4 | 1.0 | 22.5 | 141 | 383 | -174.6 |
| Richard Rodríguez | 556.5 | 0.1 | 22.5 | 414 | 384 | 172.5 |
| Touki Toussaint | 598.6 | 0.1 | 22.3 | 369 | 385 | 213.6 |
| Ralph Garza Jr. | 999.0 | -0.1 | 22.1 | 501 | 386 | 613 |
| Jhoulys Chacín | 999.0 | -0.4 | 22.0 | 533 | 387 | 612 |
| Tyler Matzek | 573.2 | 0.7 | 22.0 | 181 | 388 | 185.2 |
| Ryan Thompson | 999.0 | 0.3 | 22.0 | 278 | 389 | 610 |
| Luis Cessa | 588.8 | 0.3 | 21.8 | 299 | 390 | 198.8 |
| Andrew Miller | 999.0 | -0.1 | 21.8 | 510 | 391 | 608 |
| Ross Detwiler | 999.0 | 0.0 | 21.7 | 474 | 392 | 607 |
| Taylor Clarke | 999.0 | 0.0 | 21.5 | 437 | 393 | 606 |
| Deolis Guerra | 599.9 | 0.2 | 21.3 | 342 | 394 | 205.9 |
| Ben Bowden | 999.0 | 0.1 | 21.3 | 395 | 395 | 604 |
| Joe Barlow | 216.1 | 0.2 | 21.1 | 344 | 396 | -179.9 |
| Andrew Chafin | 587.9 | 0.6 | 20.8 | 206 | 397 | 190.9 |
| Nick Mears | 999.0 | 0.0 | 20.8 | 480 | 398 | 601 |
| Kyle Muller | 590.8 | 0.3 | 20.8 | 302 | 399 | 191.8 |
| Packy Naughton | 999.0 | 0.3 | 20.7 | 292 | 400 | 599 |
| Drew Smith | 999.0 | 0.1 | 20.7 | 360 | 401 | 598 |
| Michael Rucker | 999.0 | 0.0 | 20.4 | 427 | 402 | 597 |
| Derek Holland | 999.0 | -0.1 | 20.3 | 498 | 403 | 596 |
| Andres Machado | 999.0 | -0.1 | 20.2 | 513 | 404 | 595 |
| Trevor Cahill | 999.0 | 0.2 | 20.2 | 313 | 405 | 594 |
| Cody Ponce | 999.0 | 0.2 | 20.2 | 354 | 406 | 593 |
| Aaron Loup | 594.4 | 0.4 | 20.0 | 274 | 407 | 187.4 |
| Carlos Martínez | 596.0 | 0.0 | 19.9 | 477 | 408 | 188 |
| Jake Arrieta | 999.0 | 0.0 | 19.8 | 481 | 409 | 590 |
| Junior Guerra | 999.0 | 0.0 | 19.8 | 490 | 410 | 589 |
| Bryan Abreu | 999.0 | 0.1 | 19.7 | 392 | 411 | 588 |
| Charlie Barnes | 999.0 | 0.2 | 19.7 | 339 | 412 | 587 |
| Jay Jackson | 999.0 | 0.2 | 19.7 | 347 | 413 | 586 |
| Brandon Bielak | 999.0 | 0.1 | 19.6 | 362 | 414 | 585 |
| Nabil Crismatt | 999.0 | 0.1 | 19.5 | 403 | 415 | 584 |
| Wily Peralta | 600.8 | 0.1 | 19.3 | 388 | 416 | 184.8 |
| Tony Watson | 999.0 | 0.0 | 19.2 | 479 | 417 | 582 |
| Steve Cishek | 595.3 | 0.0 | 19.1 | 439 | 418 | 177.3 |
| Jesse Chavez | 999.0 | 0.1 | 18.6 | 408 | 419 | 580 |
| Mitch White | 562.7 | 0.3 | 18.6 | 288 | 420 | 142.7 |
| Steven Okert | 999.0 | 0.3 | 18.6 | 304 | 421 | 578 |
| Adam Cimber | 600.6 | 0.3 | 18.6 | 282 | 422 | 178.6 |
| Alex Claudio | 999.0 | 0.0 | 18.5 | 463 | 423 | 576 |
| Joe Mantiply | 999.0 | 0.2 | 18.3 | 334 | 424 | 575 |
| Trent Thornton | 999.0 | 0.1 | 18.2 | 401 | 425 | 574 |
| Buck Farmer | 999.0 | -0.2 | 18.2 | 519 | 426 | 573 |
| Matt Foster | 999.0 | 0.1 | 18.1 | 373 | 427 | 572 |
| Domingo Tapia | 999.0 | 0.0 | 18.0 | 492 | 428 | 571 |
| Jharel Cotton | 600.8 | 0.1 | 18.0 | 376 | 429 | 171.8 |
| Ryan Borucki | 999.0 | 0.2 | 17.8 | 355 | 430 | 569 |
| Hunter Strickland | 599.9 | 0.0 | 17.6 | 423 | 431 | 168.9 |
| Paul Blackburn | 999.0 | 0.2 | 17.4 | 315 | 432 | 567 |
| Erik Swanson | 600.1 | 0.3 | 17.1 | 291 | 433 | 167.1 |
| Jordan Holloway | 999.0 | -0.1 | 17.1 | 502 | 434 | 565 |
| Ervin Santana | 999.0 | -0.5 | 17.0 | 534 | 435 | 564 |
| Blake Parker | 999.0 | 0.1 | 17.0 | 393 | 436 | 563 |
| Jacob Barnes | 999.0 | 0.1 | 16.8 | 381 | 437 | 562 |
| Brandon Workman | 999.0 | -0.2 | 16.7 | 521 | 438 | 561 |
| Mason Thompson | 999.0 | 0.0 | 16.7 | 448 | 439 | 560 |
| Paul Campbell | 999.0 | -0.1 | 16.6 | 495 | 440 | 559 |
| Michael Feliz | 999.0 | 0.0 | 16.4 | 425 | 441 | 558 |
| Tyler Zuber | 999.0 | 0.0 | 16.3 | 462 | 442 | 557 |
| Noé Ramirez | 600.9 | 0.1 | 16.2 | 390 | 443 | 157.9 |
| Justin Garza | 999.0 | -0.2 | 16.2 | 523 | 444 | 555 |
| Yohan Ramirez | 599.8 | 0.1 | 16.1 | 413 | 445 | 154.8 |
| Miguel Sánchez | 999.0 | 0.0 | 16.0 | 484 | 446 | 553 |
| Kyle McGowin | 999.0 | 0.2 | 15.9 | 321 | 447 | 552 |
| Héctor Santiago | 999.0 | -0.2 | 15.6 | 525 | 448 | 551 |
| Ryan Burr | 999.0 | 0.0 | 15.5 | 440 | 449 | 550 |
| Humberto Castellanos | 999.0 | 0.1 | 15.5 | 370 | 450 | 549 |
| JD Hammer | 999.0 | 0.0 | 15.4 | 483 | 451 | 548 |
| Phillips Valdez | 999.0 | -0.1 | 15.1 | 493 | 452 | 547 |
| Kodi Whitley | 999.0 | 0.2 | 14.4 | 350 | 453 | 546 |
| Junior Fernández | 999.0 | 0.1 | 14.3 | 415 | 454 | 545 |
| Danny Coulombe | 999.0 | 0.1 | 14.3 | 382 | 455 | 544 |
| Drew Pomeranz | 582.3 | 0.1 | 14.2 | 383 | 456 | 126.3 |
| Rex Brothers | 999.0 | 0.0 | 14.1 | 461 | 457 | 542 |
| Juan Minaya | 999.0 | 0.1 | 14.0 | 397 | 458 | 541 |
| Anthony Castro | 999.0 | 0.1 | 13.6 | 375 | 459 | 540 |
| Tommy Nance | 999.0 | 0.1 | 13.5 | 385 | 460 | 539 |
| T.J. McFarland | 999.0 | 0.1 | 13.3 | 378 | 461 | 538 |
| Joel Payamps | 999.0 | 0.3 | 13.2 | 285 | 462 | 537 |
| Richard Bleier | 600.7 | 0.5 | 13.1 | 224 | 463 | 137.7 |
| Sean Guenther | 999.0 | 0.2 | 13.0 | 323 | 464 | 535 |
| Nick Neidert | 999.0 | -0.2 | 13.0 | 517 | 465 | 534 |
| Ryne Harper | 999.0 | 0.1 | 13.0 | 416 | 466 | 533 |
| Anthony Kay | 999.0 | 0.0 | 12.8 | 422 | 467 | 532 |
| Hoby Milner | 999.0 | 0.1 | 12.8 | 405 | 468 | 531 |
| Matt Moore | 999.0 | -0.1 | 12.8 | 503 | 469 | 530 |
| Dan Winkler | 999.0 | -0.1 | 12.7 | 515 | 470 | 529 |
| Humberto Mejía | 999.0 | 0.0 | 12.7 | 453 | 471 | 528 |
| Matt Peacock | 999.0 | 0.0 | 12.5 | 488 | 472 | 527 |
| Brett de Geus | 999.0 | 0.0 | 12.4 | 475 | 473 | 526 |
| Aaron Sanchez | 600.6 | 0.0 | 12.4 | 451 | 474 | 126.6 |
| José Ruiz | 999.0 | 0.2 | 12.4 | 357 | 475 | 524 |
| Wade LeBlanc | 999.0 | 0.2 | 12.4 | 358 | 476 | 523 |
| Chris Mazza | 999.0 | 0.3 | 12.3 | 309 | 477 | 522 |
| Shane Greene | 999.0 | -0.1 | 12.1 | 499 | 478 | 521 |
| Keynan Middleton | 999.0 | -0.1 | 12.1 | 500 | 479 | 520 |
| Danny Duffy | 587.9 | 0.2 | 12.0 | 318 | 480 | 107.9 |
| Cody Poteet | 999.0 | 0.1 | 11.7 | 407 | 481 | 518 |
| José Álvarez | 600.2 | 0.2 | 11.7 | 359 | 482 | 118.2 |
| Chase Anderson | 999.0 | 0.0 | 11.6 | 491 | 483 | 516 |
| Louis Head | 600.5 | 0.3 | 11.6 | 311 | 484 | 116.5 |
| Kohei Arihara | 999.0 | 0.0 | 11.5 | 449 | 485 | 514 |
| Jordan Sheffield | 999.0 | -0.2 | 11.4 | 520 | 486 | 513 |
| Ashton Goudeau | 999.0 | -0.2 | 10.6 | 527 | 487 | 512 |
| Victor González | 999.0 | 0.1 | 10.6 | 365 | 488 | 511 |
| Yennsy Díaz | 999.0 | -0.1 | 10.4 | 516 | 489 | 510 |
| Braxton Garrett | 999.0 | 0.0 | 10.4 | 428 | 490 | 509 |
| Chasen Shreve | 999.0 | -0.1 | 10.1 | 509 | 491 | 508 |
| César Valdez | 600.3 | 0.0 | 10.0 | 431 | 492 | 108.3 |
| Sam Selman | 999.0 | 0.0 | 9.5 | 444 | 493 | 506 |
| Cam Bedrosian | 999.0 | 0.0 | 9.1 | 458 | 494 | 505 |
| Dustin May | 559.5 | 0.3 | 9.1 | 298 | 495 | 64.5 |
| Sam Clay | 999.0 | 0.0 | 8.7 | 430 | 496 | 503 |
| Caleb Baragar | 999.0 | -0.2 | 8.7 | 522 | 497 | 502 |
| Dillon Maples | 999.0 | 0.1 | 8.6 | 409 | 498 | 501 |
| Daniel Ponce de Leon | 600.7 | 0.0 | 8.5 | 456 | 499 | 101.7 |
| Erasmo Ramírez | 999.0 | -0.2 | 7.9 | 526 | 500 | 499 |
| Edwin Uceta | 999.0 | 0.0 | 7.9 | 455 | 501 | 498 |
| Miguel Diaz | 999.0 | -0.1 | 7.8 | 494 | 502 | 497 |
| Marcos Diplán | 999.0 | 0.0 | 7.4 | 459 | 503 | 496 |
| Enyel De Los Santos | 999.0 | 0.1 | 7.3 | 404 | 504 | 495 |
| Tayler Saucedo | 999.0 | 0.1 | 7.1 | 399 | 505 | 494 |
| Robert Gsellman | 999.0 | -0.2 | 7.1 | 518 | 506 | 493 |
| Wes Benjamin | 999.0 | -0.1 | 7.0 | 504 | 507 | 492 |
| Thomas Eshelman | 999.0 | 0.0 | 6.8 | 476 | 508 | 491 |
| Reiss Knehr | 999.0 | 0.0 | 6.7 | 446 | 509 | 490 |
| Jake Faria | 999.0 | 0.0 | 6.7 | 469 | 510 | 489 |
| Luke Farrell | 999.0 | 0.0 | 6.3 | 489 | 511 | 488 |
| Robert Dugger | 999.0 | 0.0 | 6.2 | 478 | 512 | 487 |
| Casey Sadler | 600.5 | 0.3 | 5.8 | 280 | 513 | 87.5 |
| Shaun Anderson | 999.0 | 0.0 | 5.7 | 445 | 514 | 485 |
| Drew Hutchison | 999.0 | -0.1 | 5.5 | 497 | 515 | 484 |
| Sean Nolin | 999.0 | 0.0 | 5.3 | 485 | 516 | 483 |
| Adrian Sampson | 999.0 | -0.1 | 4.9 | 508 | 517 | 482 |
| Tyler Glasnow | 598.8 | 0.2 | 4.9 | 314 | 518 | 80.8 |
| Shawn Armstrong | 999.0 | 0.0 | 4.6 | 454 | 519 | 480 |
| Kevin Ginkel | 999.0 | 0.0 | 4.3 | 470 | 520 | 479 |
| Carson Fulmer | 999.0 | 0.0 | 4.1 | 482 | 521 | 478 |
| Edgar Santana | 999.0 | 0.0 | 4.1 | 452 | 522 | 477 |
| Kyle Crick | 999.0 | 0.0 | 4.0 | 465 | 523 | 476 |
| Daniel Castano | 999.0 | 0.0 | 3.7 | 486 | 524 | 475 |
| Kenta Maeda | 600.7 | 0.1 | 3.3 | 384 | 525 | 75.7 |
| John Gant | 600.8 | 0.0 | 3.3 | 435 | 526 | 74.8 |
| Chris Ellis | 600.8 | 0.1 | 2.5 | 406 | 527 | 73.8 |
NA
NA
NA
NA
NA
NA