In part 1 I looked at building a neural network model of a batter strike zone in R. In part 2 I talked about using that model to estimate the top and bottom of that batter’s individual strike zone. At long last, this post will use that information to model an umpire strike zone, which was the whole point all along!
As in part 1, let’s assume we have a data frame (dfu) preloaded, with horizontal pitch location in column px, vertical pitch location in column pz, and the pitch outcome (in typical Gameday notation) in column des. But this time, rather than being a data frame of all the pitches for a particular hitter (Mike Trout) in 2014, they will be all the pitches for a particular umpire. Let’s also assume we applied the method in part 2 to derive a personalized strike zone top and bottom for each hitter in MLB, and that these values are also in dfu (columns sz_top and sz_bot).
Like before, we only care about called pitches, so we will subset dfu accordingly:
> dfu_c <- dfu[dfu$des=='Called Strike' | dfu$des=='Ball' | dfu$des=='Ball in Dirt',]
And once again, we’ll convert the ball/strike call into binary data for training the neural network:
> dfu_c$call <- ifelse(dfu_c$des=='Called Strike',1,0)
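To make the filtering and encoding steps concrete, here is a minimal sketch on a toy data frame. The rows and values are made up for illustration, and `toy`, `toy_c` and `called` are hypothetical names, not part of the real data. It uses `%in%` as a compact alternative to chaining comparisons with the element-wise `|` operator.

```r
# Toy stand-in for dfu; every value here is invented for illustration.
toy <- data.frame(
  px  = c(0.1, -0.4, 1.2, 0.0, 0.6),
  pz  = c(2.5, 1.4, 3.9, 2.2, 0.3),
  des = c("Called Strike", "Ball", "Swinging Strike",
          "In play, out(s)", "Ball in Dirt"),
  stringsAsFactors = FALSE
)

# Keep only pitches the umpire had to call (no swings, no balls in play).
called <- c("Called Strike", "Ball", "Ball in Dirt")
toy_c <- toy[toy$des %in% called, ]

# Encode the call as 1 (strike) or 0 (ball) for training.
toy_c$call <- ifelse(toy_c$des == "Called Strike", 1, 0)

print(toy_c[, c("des", "call")])
```

Note that the row filter has to be element-wise: `|` and `%in%` return one TRUE/FALSE per row, whereas `||` is meant for a single condition and will not select rows correctly.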
We need to do one additional thing to normalize the strike zone dimensions across batters:
> dfu_c$sz_mid <- (dfu_c$sz_bot + dfu_c$sz_top)/2
> dfu_c$pz.ratio <- (dfu_c$pz-dfu_c$sz_mid)/((dfu_c$sz_top-dfu_c$sz_bot)/2)
Now we have the vertical location of each pitch expressed relative to the batter’s individual strike zone: pz.ratio is 0 at the middle of the zone, 1 at the top and -1 at the bottom, no matter how tall the batter is.
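A quick worked example may help show why this normalization matters. The sketch below wraps the same formula in a hypothetical helper, `norm_pz` (not part of the original code), and applies it to two made-up batters with different zone heights.

```r
# Hypothetical helper: same arithmetic as the sz_mid / pz.ratio lines above.
norm_pz <- function(pz, sz_bot, sz_top) {
  sz_mid <- (sz_bot + sz_top) / 2       # middle of the batter's zone
  half_height <- (sz_top - sz_bot) / 2  # distance from middle to top edge
  (pz - sz_mid) / half_height
}

# Made-up zones: a shorter batter (1.5 to 3.5 ft) and a taller one (1.5 to 4.0 ft).
short_ratio <- norm_pz(c(1.5, 2.5, 3.5), sz_bot = 1.5, sz_top = 3.5)
tall_ratio  <- norm_pz(3.5, sz_bot = 1.5, sz_top = 4.0)

print(short_ratio)  # bottom, middle and top of the shorter batter's zone
print(tall_ratio)   # the same 3.5 ft pitch, measured against the taller zone
```

For the shorter batter the three pitches come out at -1, 0 and 1. The same 3.5 ft pitch that sits on the top edge of his zone works out to about 0.6 for the taller batter, comfortably inside the zone. That is the difference the umpire model would miss if it were trained on raw pz.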
Finally, we can train a neural network model of the umpire’s strike zone. Once again, the neuralnet package makes this step trivial:
> m_u <- neuralnet(call~px+pz.ratio,data=dfu_c,hidden=4,linear.output=FALSE)
That’s it! Now we have a model (m_u) of an umpire strike zone. Really easy, right?
Now that we’ve covered how to build a neural network model of an umpire strike zone in R, we can start to do some cool stuff… next time!