New Batted Ball Classes

I have been working on a 3D batted ball classification system, one that classifies batted balls by their exit velocity, vertical angle, and horizontal angle. I have identified a few patterns, and I am pretty excited about the progress I have made over the past week. However, I realized this project could take many weeks or months to complete, and I wanted to push an update sooner than that. So I have put the 2D solution onto xStats and pushed the update to the stats spreadsheet. This 2D solution is not groundbreaking in terms of Statcast research, but it is a significant improvement over the classification system previously published on xStats.

I can best sum up this classification system using an image. The specific definitions of these fields are too complicated to be of much interest, but the image should sum them up adequately. In essence, the groups are defined by the types of hits a ball in play is capable of generating (single, double, home run, out) and the average value of similar balls in play.

In the tables below you'll see the hit totals, sample sizes (n), batting average, and wOBA for these classes of batted ball. Notice how each class has very different ratios of singles, doubles, home runs, and outs.

 
Batted Ball Hit Totals
BB Type      1B     2B     3B      HR   AVG   wOBA        n  % of BIP
DB       12,417    826     19       2  .124   .112  106,678     26.3%
GB       27,015  2,483    166       2  .369   .337   80,492     19.8%
LD       41,191  9,440    831      10  .759   .727   67,804     16.7%
HD        1,124  9,039  1,126  15,689  .683  1.156   39,523      9.7%
FB        3,962  3,544    470   1,669  .235   .292   41,131     10.1%
PU          589    742     99      71  .021   .025   70,110     17.3%
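
As a sanity check, the AVG and % of BIP columns can be roughly reproduced from the hit totals. The sketch below assumes AVG is simply hits divided by n, which lands within a point of the published column (the published figures may use a slightly different denominator), and that % of BIP is each class's n over the sum of all six. The names `HIT_TOTALS` and `batting_average` are made up for this example and are not part of xStats.

```python
# Hit totals copied from the table above: (1B, 2B, 3B, HR, n)
HIT_TOTALS = {
    "DB": (12417, 826, 19, 2, 106678),
    "GB": (27015, 2483, 166, 2, 80492),
    "LD": (41191, 9440, 831, 10, 67804),
    "HD": (1124, 9039, 1126, 15689, 39523),
    "FB": (3962, 3544, 470, 1669, 41131),
    "PU": (589, 742, 99, 71, 70110),
}


def batting_average(singles, doubles, triples, homers, n):
    """Hits divided by balls in play (assumed denominator: n)."""
    return (singles + doubles + triples + homers) / n


# Total balls in play across all six classes
total_bip = sum(row[4] for row in HIT_TOTALS.values())

for name, row in HIT_TOTALS.items():
    avg = batting_average(*row)
    pct = 100 * row[4] / total_bip
    print(f"{name}: AVG {avg:.3f}, {pct:.1f}% of BIP")
```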
 

The four home runs in the DB and GB classes are Statcast measurement errors. There are likely more errors in the dataset, but not enough to really worry about. The HD class holds the majority of extra base hits (about 56%), while the LD class holds the largest share of singles (about 48%). The remainder of the hits are distributed among the GB and FB classes, while DB and PU are generally outs.
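
To see how the hits are spread across the classes, the shares of singles and extra base hits can be computed directly from the hit totals table. This is only a sketch built on the published totals; the names `HITS` and `class_shares` are invented for the example.

```python
# (1B, 2B, 3B, HR) per class, copied from the hit totals table
HITS = {
    "DB": (12417, 826, 19, 2),
    "GB": (27015, 2483, 166, 2),
    "LD": (41191, 9440, 831, 10),
    "HD": (1124, 9039, 1126, 15689),
    "FB": (3962, 3544, 470, 1669),
    "PU": (589, 742, 99, 71),
}


def class_shares(hits):
    """Return {class: (share of all singles, share of all extra base hits)}."""
    total_singles = sum(row[0] for row in hits.values())
    total_xbh = sum(sum(row[1:]) for row in hits.values())
    return {
        name: (row[0] / total_singles, sum(row[1:]) / total_xbh)
        for name, row in hits.items()
    }


shares = class_shares(HITS)
for name, (singles_share, xbh_share) in shares.items():
    print(f"{name}: {singles_share:.1%} of singles, {xbh_share:.1%} of XBH")
```

The HD class comes out a little above half of all extra base hits, and the LD class a little under half of all singles.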

 
Year to Year Correlations
Years        DB    GB    LD    HD    FB    PU
2015-2016  .714  .365  .122  .709  .350  .610
2015-2017  .729  .369  .064  .748  .241  .649
2016-2017  .729  .381  .192  .708  .358  .679
2015-2018  .480  .027  .138  .378  .146  .384
2016-2018  .460  .153  .042  .407  .144  .372
2017-2018  .469  .186  .069  .524  .113  .416
 
 
BIP Result Correlations
Result     DB     GB     LD     HD     FB     PU
Sac     -.221  -.069   .035   .180   .144   .159
Hit     -.516   .104   .544   .502   .027  -.052
1B      -.119   .330   .505  -.091  -.063  -.291
2B      -.425  -.053   .252   .429   .106   .141
3B      -.026   .003   .043  -.007   .058  -.015
HR      -.492  -.265   .079   .803   .044   .237

More Result Correlations
Result     DB     GB     LD     HD     FB     PU
AVG     -.488   .217   .562   .292   .081  -.044
OBP     -.581   .161   .519   .421   .124   .048
SLG     -.632  -.039   .391   .680   .109   .154
BABIP   -.385   .219   .584   .253   .009  -.160
BACON   -.517   .104   .544   .502   .027  -.052
wOBA    -.671   .064   .479   .604   .136  -.133

The DB, HD, and PU classes of batted balls have strong year to year correlations, while the LD class does not appear to be stable. This is important to keep in mind, since the LD class contains so many singles. This is your "DIPS" theory for BABIP showing its ugly face. But notice that the HD class is indeed very stable, and as a result so are the stats associated with it, such as slugging percentage and wOBA.
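
For anyone who wants to run a similar stability check on their own data, here is a minimal sketch of a year to year correlation. It assumes a Pearson correlation of each player's class rate (for example, HD per ball in play) across two seasons, using only players who appear in both. This post does not spell out the exact method or sample cutoffs used for the tables above, so treat those choices as assumptions. The player names and rates are made up purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def year_to_year(rates_a, rates_b):
    """Correlate per-player rates over players present in both seasons."""
    shared = sorted(rates_a.keys() & rates_b.keys())
    return pearson([rates_a[p] for p in shared], [rates_b[p] for p in shared])


# Hypothetical HD rates per ball in play; not real players or real data
hd_year_one = {"Player A": 0.14, "Player B": 0.08, "Player C": 0.11, "Player D": 0.05}
hd_year_two = {"Player A": 0.13, "Player B": 0.09, "Player C": 0.12, "Player E": 0.07}

print(f"year to year r = {year_to_year(hd_year_one, hd_year_two):.3f}")
```

A class rate that is mostly skill should give a high value here, while one dominated by noise should sit closer to zero.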

In theory, this HD class and the Barrels stat are very similar, especially when reduced to this 2D projection. They are somewhat different, though. They are defined in different ways; for example, the HD class is a bit more picky about which batted balls are included and excluded. The differences will become even more apparent when this is extruded into the full 3D mapping, which I am working on completing. Hopefully that will come soon, but I figured this update would be nice in the meantime.