For the link to my code, my GitHub page is here.
The launch angle revolution has changed the way baseball teams develop hitters. Many hitters have raised their launch angles and had success, such as J.D. Martinez, Max Muncy, and, just recently, Vladimir Guerrero Jr. However, with strikeouts rising and batting averages falling throughout the launch angle revolution, many people have questioned this philosophy. It goes against the traditional approach of the “slap” hitter, a guy who tries to hit line drives over infielders’ heads.
However, evidence has shown that fly balls create more value, on average, than any other type of batted ball. It is especially important to note that I said “on average” in that last sentence. This means that the typical hitter will have more success when he hits more fly balls. However, not all hitters are average, and this is one problem that I have wondered about with this revolution in hitting.
What do I mean by this? Well, the average I am talking about is in terms of exit velocity. There are guys like Aaron Judge, Shohei Ohtani, and Nelson Cruz who crush the ball when they hit it, but there are also guys like Whit Merrifield and Luis Arraez who have been able to play well by hitting liners over fielders’ heads. These two types of hitters are very different and lead me to the question: Should hitters with different average exit velocities try to hit the ball at the same launch angle or at different ones?
A Tale of Two Hitters
To show an example of what I mean, I wanted to compare two hitters: Aaron Judge and Tim Locastro. These two may seem like completely different hitters, and they are. However, from 2019 through the early part of 2021, they both had median launch angles of around 12 degrees. I used the median as the measure of center because the mean launch angle can be skewed easily by outliers.
The big difference between these two is their median exit velocities: Judge had a median exit velocity of 98.2 MPH, whereas Locastro was at 84.6 MPH. This is a huge difference that raises the question: what is the optimal launch angle at which each will perform at his best?
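As a quick illustration of why the median is the safer measure of center here, the sketch below compares the mean and median of a small set of launch angles. The numbers are made up for the example (they are not Statcast data for either hitter): most batted balls sit near 12 degrees, with two extreme pop-ups.

```python
from statistics import mean, median

# Hypothetical launch angles in degrees (not real data for any hitter).
# Eight batted balls cluster near 12 degrees; two are extreme pop-ups.
launch_angles = [8, 10, 11, 12, 12, 13, 14, 15, 70, 75]

# The two pop-ups drag the mean far above the typical batted ball,
# while the median stays close to where most of the data sits.
print(f"mean:   {mean(launch_angles):.1f}")    # mean:   24.0
print(f"median: {median(launch_angles):.1f}")  # median: 12.5
```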
My idea of the optimal launch angle is the angle at which a player gets his best performance. The measure of performance I wanted to use here is wOBAcon, as it gives me the best measure of player performance when the player makes contact. I could have used xwOBAcon; however, since it tries to predict wOBAcon, I wanted to use the real thing instead. So, to find the optimal launch angle for each player, I went to Baseball Savant to find the angle at which each player performs best, given that he is hitting the ball at his median exit velocity.
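The lookup described above was done through Baseball Savant, and the exact query is not shown. One way to approximate the same idea in Python is sketched below: keep only batted balls hit near a target exit velocity, group them into launch angle bins, and pick the bin with the highest average wOBA value. The function name, the 2 MPH window, the 4-degree bin width, the minimum bin size, and the sample rows are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

def optimal_launch_angle(batted_balls, target_ev, ev_window=2.0,
                         bin_width=4, min_balls=3):
    """Return (bin_start_degrees, mean_woba) for the best launch angle bin.

    batted_balls: iterable of (exit_velo_mph, launch_angle_deg, woba_value).
    Only balls within ev_window MPH of target_ev are considered.
    Returns None if no bin has at least min_balls batted balls.
    """
    bins = defaultdict(list)
    for ev, la, woba in batted_balls:
        if abs(ev - target_ev) <= ev_window:
            bin_start = (la // bin_width) * bin_width  # floor to the bin edge
            bins[bin_start].append(woba)
    # Skip sparse bins so a single lucky hit cannot decide the answer.
    eligible = {b: mean(v) for b, v in bins.items() if len(v) >= min_balls}
    if not eligible:
        return None
    best = max(eligible, key=eligible.get)
    return best, eligible[best]

# Hypothetical batted balls; the wOBA values are illustrative weights only.
sample = [
    (84.0, 14, 0.9), (85.0, 15, 0.9), (84.5, 13, 0.0),  # line drives
    (84.2, 30, 0.0), (85.1, 29, 0.0), (84.8, 31, 0.9),  # fly balls
    (84.6, 2, 0.9), (84.9, 1, 0.0), (83.9, 3, 0.0),     # grounders
    (98.0, 30, 2.0),                                    # outside the EV window
]
print(optimal_launch_angle(sample, target_ev=84.6))
```

With real data, the window and bin width trade off precision against sample size: narrower bins pinpoint the angle better but leave fewer batted balls in each one.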
Here are the results:
*the peak wOBA for each player is highlighted
It appears that Judge will typically get his best results when he hits the ball at a 29-degree launch angle, and Locastro will typically get his at a launch angle of 16 degrees. Notice that even though Judge hits the ball a lot harder than Locastro, their peak wOBAs are very similar: .905 compared to .892. Another thing to note is that Judge still performs very well at Locastro’s optimal launch angle of 16 degrees with a wOBA of .738, whereas Locastro performs miserably at Judge’s optimal launch angle with a wOBA of .085. This difference may be the reason why their overall wOBAcons differ greatly, with Judge at .532 and Locastro at .355.
This shows what I was hinting at earlier: guys who hit the ball very hard most of the time should hit the ball at a higher launch angle to take advantage of their exit velocity and get more extra-base hits. However, guys who do not hit the ball as hard, like Locastro, may struggle at higher launch angles, which lead to lazy fly balls. According to this, Locastro should be trying to hit line drives that drop in front of outfielders or head into the corners.
To see how important it is for these hitters to hit at their optimal launch angles, I will show you two videos of each player: one near his own optimal launch angle and one near the other player’s optimal launch angle. I will also add the xwOBA and xBA numbers next to them to show how valuable each hit was. Let’s start with Judge.
* Exit Velo: 98.2, Launch Angle: 15, xwOBA: .682, xBA: .693, Distance: 269 feet
* Exit Velo: 98.8, Launch Angle: 31, xwOBA: .711, xBA: .394, Distance: 377 feet
The first hit definitely looked better, right? Well, overall, the second one, the fly ball, had a higher xwOBA (.711) than the first one (.682), meaning that, over time, Judge will create more value out of it. Notice that the xBA is actually higher for the line drive at .693; this is because this hit will most likely land in front of the outfielders. Judge’s fly ball going 377 feet, though, will leave the ballpark more often, and home runs generate more runs than singles.
Now for Locastro:
* Exit Velo: 84.4, Launch Angle: 15, xwOBA: .873, xBA: .953, Distance: 225 feet
* Exit Velo: 84.1, Launch Angle: 30, xwOBA: .034, xBA: .031, Distance: 316 feet
Similar to Judge’s hits, Locastro’s line drive looked much better than the flyball and this time, in fact, it was. By a mile. The flyball had a minuscule xwOBA and xBA of .034 and .031, respectively. Meanwhile, the line drive was great; it had an xwOBA of .873 and an insane xBA of .953.
Again, the exit velocities I used for each of these hitters in these videos were their median exit velocities, meaning that they will typically hit around that number. These batted balls can give us a bit of an idea of how high they should be hitting the ball to be the best hitters they can be.
Now, did these two hitters hit the ball at their optimal launch angles often? Well, looking at their respective median launch angles compared to their optimal ones, it appears that Locastro may have, but Judge, not so much. The best way to see this would be to take a look at their launch angle distributions.
Here are each player’s launch angle distributions:
*the colored lines represent each player’s optimal launch angle
Taking a quick glance at this, Locastro most often hits the ball at around his optimal launch angle of 16 degrees, whereas Judge’s typical launch angle is a bit lower than his optimal of 29 degrees. So does this mean that Locastro hits his optimal launch angle more often than Judge does? Not quite. We can see that Judge has much less variance in his distribution than Locastro. This is important because Judge’s launch angles are concentrated around his typical launch angle, whereas Locastro hits the ball at the extremes more often.
If you look at this graph a bit closer, you can see that the frequency at which Judge hits the ball at his optimal launch angle is higher than Locastro’s, even though Locastro’s distribution is centered around his optimal launch angle. So we can say that Judge hits the ball at his optimal launch angle more often than Locastro does.
There is a big problem with trying to evaluate how high players should be hitting the ball with this method, though: sample size. We are only looking at two hitters, and they are two extremes; there are many different types of hitters out there. What we want to do is group similar types of hitters together and compare the groups to see what launch angles are most effective for each hitter type.
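The comparison above is read off the distribution plot. A simple way to put a number on it is to compute the share of a hitter’s batted balls that fall within a few degrees of his optimal launch angle. The sketch below does that; the 4-degree tolerance and both samples are hypothetical, chosen only to show how a narrow distribution centered below its optimum can still beat a wide one centered on its optimum.

```python
def share_near_optimal(launch_angles, optimal, tolerance=4):
    """Fraction of batted balls within `tolerance` degrees of `optimal`."""
    if not launch_angles:
        return 0.0
    near = sum(1 for la in launch_angles if abs(la - optimal) <= tolerance)
    return near / len(launch_angles)

# Hypothetical samples (not real data for Judge or Locastro).
narrow = [20, 22, 24, 25, 26, 27, 28, 30]  # tight spread, optimum assumed 29
wide = [-20, -5, 5, 14, 16, 18, 35, 50]    # wide spread, optimum assumed 16

print(share_near_optimal(narrow, optimal=29))  # 0.625
print(share_near_optimal(wide, optimal=16))    # 0.375
```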
Hitter Clustering and Analysis
To do this, I am going to use an unsupervised machine learning model known as k-means clustering. This model will create groups of similar hitters based on their median exit velocities and launch angles. The data I used is seasonal data for hitters who had at least 250 batted balls from the start of 2019 to April 20th, 2021 (when my dataset ends). After I cluster the hitters, I will analyze what makes the hitters within each cluster successful and unsuccessful.
To determine how many clusters (also known as k) I needed for this analysis, I used the elbow method. It runs k-means clustering several times with different values of k (I used 1 to 15 here) and gives us the variance within the clusters for each. We then create a plot to see which point looks like the “elbow” of the line graph (the bend closest to the origin). To measure the variance within the clusters, I used the Within Groups Sum of Squares: the sum of squared distances from each hitter to the center of his cluster. This tells us how tight the clusters are and can help us choose the number of clusters needed for this analysis.
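The data preparation step described above can be sketched roughly as follows: collapse individual batted balls into one row per hitter with a median exit velocity and a median launch angle, keeping only hitters with enough batted balls. This is an illustration, not the code used for this post. It assumes the 250 batted ball minimum applies to each player-season, which the text does not spell out, and the row layout and function name are hypothetical.

```python
from collections import defaultdict
from statistics import median

def season_medians(batted_balls, min_balls=250):
    """Collapse batted balls into one row per (player, season).

    batted_balls: iterable of (player, season, exit_velo_mph, launch_angle_deg).
    Returns {(player, season): (median_ev, median_la)} for every
    player-season with at least min_balls batted balls.
    """
    grouped = defaultdict(list)
    for player, season, ev, la in batted_balls:
        grouped[(player, season)].append((ev, la))
    result = {}
    for key, balls in grouped.items():
        if len(balls) >= min_balls:
            result[key] = (median(ev for ev, _ in balls),
                           median(la for _, la in balls))
    return result

# Tiny hypothetical input; min_balls is lowered so the example has output.
rows = [
    ("Hitter A", 2019, 90, 10),
    ("Hitter A", 2019, 92, 12),
    ("Hitter A", 2019, 94, 20),
    ("Hitter B", 2019, 85, 5),  # only one batted ball, so filtered out
]
print(season_medians(rows, min_balls=3))
```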
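To make the elbow method concrete, here is a minimal, standard-library Python sketch. It is not the code used for this post: the k-means routine is a plain Lloyd’s algorithm written for the example, the points are synthetic (median exit velocity, median launch angle) pairs, and the features are left unscaled because the text does not say whether they were standardized. It computes the within-cluster sum of squares for each k, which is the quantity plotted in an elbow chart.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster put
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels

def elbow_curve(points, max_k):
    """Within-cluster sum of squares for k = 1..max_k."""
    curve = {}
    for k in range(1, max_k + 1):
        centroids, labels = kmeans(points, k)
        curve[k] = sum(sq_dist(p, centroids[lab])
                       for p, lab in zip(points, labels))
    return curve

# Synthetic hitters: three groups around made-up (EV, launch angle) centers.
centers = [(86, 5), (90, 12), (94, 20)]
offsets = [(-1, -2), (-1, 2), (0, 0), (1, -2), (1, 2)]
points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

for k, wss in elbow_curve(points, 6).items():
    print(k, round(wss, 1))
```

Plotting these values against k gives the elbow plot. Because k-means depends on its random starting points, it is common to rerun each k from several seeds and keep the lowest value.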
Here is the elbow plot:
As you can see, the “elbow” of this arm appears to be at 3, with 4 coming close as well. I will use 3 clusters for the k-means clustering.
Now, let’s move on to the actual clustering! As I stated earlier, for this clustering method, I am just going to use median exit velocity and median launch angle for hitters to split them into clusters. So with that said, here is a graph of the 3 different hitter clusters:
The K-means clustering method appears to have split the three clusters by launch angle, with cluster 1 being low launch angle hitters, cluster 2 being high launch angle hitters, and cluster 3 being average. Where do our friends Locastro and Judge end up in these graphs?
They end up in the same cluster. Going back to what we said earlier, they are very similar in terms of their actual median launch angle. However, they really aren’t similar hitters: Locastro, as you can see in the graph, is light-hitting, whereas Judge smokes the ball. Because the hitters are so different, let’s analyze these clusters individually by evaluating their different exit velocities.
To do this, I wanted to evaluate each cluster by classifying each hitter as hitting the ball “soft”, “average”, or “hard” based on his median exit velocity. I decided to use the middle 50% of all median exit velocities for these hitters as “average”, the top 25% as “hard”, and the bottom 25% as “soft”. This requires the 25th and 75th percentile (also known as Q1 and Q3) median exit velocities from these hitters, which were 89.6 and 92.5 MPH, respectively. Here is how this looks visually:
So, as you can see, the middle 50% is not a huge range; however, this is the typical range of how hard an MLB player will hit the ball. Notice that the high launch angle hitters have the highest frequency of exit velocities in the middle 50%, whereas the low launch angle hitters have the lowest frequency.
With that in mind, let’s evaluate all of these hitters by their launch angle clusters and their exit velocity levels. The next visual is a table of all the levels I just mentioned and their performance by wOBA and xwOBA:
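The soft/average/hard split can be sketched in a few lines of Python. The 89.6 and 92.5 MPH cutoffs and the two median exit velocities are taken from the text. The function name, the choice to count values exactly on a cutoff as “average”, the sample list, and the `inclusive` quantile method are assumptions, since the post does not say how ties or percentiles were computed.

```python
from statistics import quantiles

def ev_level(median_ev, q1, q3):
    """Label a hitter by where his median exit velocity falls."""
    if median_ev < q1:
        return "soft"
    if median_ev > q3:
        return "hard"
    return "average"  # the middle 50%, cutoffs included

# Cutoffs reported in the text (MPH).
Q1, Q3 = 89.6, 92.5
print(ev_level(84.6, Q1, Q3))  # Locastro's median EV -> soft
print(ev_level(98.2, Q1, Q3))  # Judge's median EV -> hard

# Deriving cutoffs from data; these five values are hypothetical.
sample_evs = [86, 88, 90, 92, 94]
q1, _, q3 = quantiles(sample_evs, n=4, method="inclusive")
print(q1, q3)  # 88.0 92.0
```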
*The red, green, and orange represent the exit velocity levels
This is what you would typically expect when looking at exit velocities: the guys who hit the ball harder have more success. But that isn’t what we are looking for. We want to see what launch angles are more successful for certain exit velocities. By looking at the different exit velocity levels, we can see which launch angle cluster performed the best.
As you can see in the hard and soft exit velocity levels, hitting the ball higher will typically lead to more success. A big thing to notice here is that in the average EV level, the high and average launch angle hitters perform very similarly, with wOBAs of .409 and .408, respectively.
So what does this mean for Locastro and Judge? Well, as we saw earlier, Locastro’s median launch angle was 12 degrees and he hit the ball at around 84.6 MPH, meaning he was a “soft” EV hitter and in cluster 3 for launch angles. Looking at the table, it could benefit Locastro to hit the ball higher, as other players in the league who hit the ball higher at that exit velocity level have more success. For Judge, it is the same thing, as he is also in cluster 3 and had a high exit velocity at 98.2 MPH. Hitting the ball higher could lead to more extra-base hits and home runs for both of them.
How can this analysis affect how we develop hitters? This analysis, I feel, makes us realize that swings and attack angles should be built individually for each hitter, and that the right approach could change according to how hard he hits the ball. However, there is still good reason to hit the ball in the air more often than not.
Another benefit, and I think the most important one, is that we can help determine why a hitter may be struggling to get hits even though he smokes the ball. A great example of this is Vladimir Guerrero Jr.: in 2019 and 2020, he had average launch angles of 6.7 and 4.6 degrees, respectively. This year, he has hit the ball higher in the air at 8.5 degrees and has gotten great results, being in the running for MVP and having a chance to win the Triple Crown. Spotting problems like this in hitters may dramatically affect how they perform.
Following this analysis may also lead to more consistent and better overall performance if hitters can consistently hit at or around their optimal launch angles. Again, this follows along with the Guerrero Jr. example. Keeping that launch angle distribution right around a hitter’s optimal launch angle should increase their performance.
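A comparison table like the one discussed above can be built by grouping hitters by launch angle cluster and exit velocity level and averaging their wOBA within each cell. The sketch below shows one way to do that. The rows are hypothetical placeholders, not the values from the table, and it uses a simple unweighted mean across hitters; the post does not say whether the table averaged hitters or pooled batted balls.

```python
from collections import defaultdict
from statistics import mean

def woba_table(hitters):
    """Average wOBA for each (launch angle cluster, EV level) cell.

    hitters: iterable of (la_cluster, ev_level, woba).
    Returns {(la_cluster, ev_level): mean_woba}.
    """
    cells = defaultdict(list)
    for cluster, level, woba in hitters:
        cells[(cluster, level)].append(woba)
    return {key: mean(values) for key, values in cells.items()}

# Hypothetical hitter rows (cluster label, EV level, wOBA).
rows = [
    ("high", "hard", 0.450), ("high", "hard", 0.470),
    ("low", "hard", 0.400),
    ("high", "soft", 0.350),
    ("low", "soft", 0.310), ("low", "soft", 0.330),
]

table = woba_table(rows)
for (cluster, level), value in sorted(table.items()):
    print(f"{level:>7} EV, {cluster:>4} LA: {value:.3f}")
```

Reading across the launch angle clusters within a single EV level is what isolates the effect of launch angle from the effect of simply hitting the ball harder.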
So with all that said, there are definitely limitations with this analysis. The clustering and analysis do not take into account the effect that changing your swing to hit for a higher or lower launch angle has on a player’s strikeout rate. Changing a hitter’s attack angle could change how often they make contact with the baseball and lead to overall worse performance if they cannot make contact with the ball enough, even though they are hitting for a high wOBAcon.
Another limitation is that the clustering is not specific to each hitter; as we saw earlier, a hitter’s launch angle distribution could affect how often he is hitting his optimal launch angle. Using launch angle distributions for individual hitters is another great approach to evaluating which clusters hitters belong in and could change the whole analysis and results.
One last limitation is that this analysis compares hitters by their typical launch angle rather than by their best combination of launch angle and exit velocity. Shooting for a higher launch angle may create more efficient hitters when they can find that best combination, instead of trying to hit around a certain lower launch angle, as they will hit for more power and extra-base hits.
If I were to do this project again, I would want to do three things. First, I would take into account a hitter’s tendency to pull the ball, go to the opposite field, etc. This would give me a better idea of how shifts may affect the optimal launch angles of hitters. Heavy pull hitters like Joey Gallo may want to hit the ball higher for more home runs, as it is tough to find hits and gaps in the part of the field they hit to due to the amount of coverage there. Meanwhile, guys who spray the ball, like DJ LeMahieu, may want to hit with a lower launch angle to try to hit the ball over the infielders’ heads.
Second, it would be great to have actual swing data from technologies such as Diamond Kinetics and K-Vest to see how the overall swing affects hitters’ performance when they put the ball in play and whether certain changes lead to more swing and miss. This obviously could be tough data to get; however, it would, in my opinion, help this analysis out a ton.
Finally, I would also like to create a model that can actually predict the true optimal launch angle for hitters. This could put an exact number on how high or low hitters should be hitting the ball and help them develop toward hitting around that number consistently.
In general, hitters should be hitting the ball with a higher launch angle rather than a lower one. However, there are some hitters who may benefit from hitting more line drives instead of fly balls because of how hard they can hit the ball. If a hitter cannot reach a high exit velocity very often, then it may be in his best interest to hit more line drives, as his fly balls most often won’t leave the yard and will just be lazy fly balls. This should be evaluated for each hitter individually, and these clusters can help hitters see where they stand and how they should change.