xRV: The best pitches and pitchers of 2020

Earlier this summer, I informally introduced a pitch quality metric called Expected Run Value (xRV) in a piece for South Side Hit Pen, Sports Illustrated’s White Sox site. The article analyzed two unsung heroes of Chicago’s bullpen, Evan Marshall and Jimmy Cordero, as measured by various metrics including xRV. xRV started as a summer research project by Dr. Ryan Baranowski of Coe College, my classmate Jared White, and me.

Pitch quality metrics have been a hot topic over the last few months, so I wanted to formally outline my model and how it can be useful in analyzing MLB pitchers. The model can be applied in several ways that I want to address, and with the 2020 regular season wrapped up, I can now apply it to the entirety of the abbreviated season. I will cover the steps to build the model, an analysis of the model’s results, and finally how we can apply the results to pitcher evaluation and future decision making.

The process starts with Bill Petti’s convenient Savant scraper function within the baseballr package. This gives us the metrics on every pitch thrown from the 2020 regular season that were needed for the model. After getting rid of the random events that did not help us measure pitch quality, there were 244,681 remaining pitches.

Each pitch was evaluated based on its run value, which was calculated differently based on whether or not the ball was put in play. The run value of each strike is the difference between the run value of the resulting count and the run value of the resulting count had the pitch been called a ball, and vice versa for called balls. Back in 2014, this way of thinking by Dan Brooks and Harry Pavlidis spawned a run value by count matrix. From this, we can figure that in 2020, the most consequential strike for a pitcher was a strike on a full count (-.294 runs), and the least consequential was a strike on a 1-0 count (-.035 runs). Similarly for balls, the most consequential ball was also on a full count (.234 runs), and the least consequential was on an 0-2 count (.021 runs).

Why does this matter? Because we don’t want to penalize a pitcher for throwing a ball in an 0-2 count as much as we do when he throws that same ball in a 3-2 count. This context matters as it relates to run expectancy.
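To make the count context concrete, here is a minimal Python sketch that uses only the four 2020 run values quoted above. The dictionary and function names are my own illustrative choices, not part of the original model code.

```python
# Run values (in runs) quoted in the text for 2020.
# Negative values favor the pitcher, positive values favor the hitter.
QUOTED_RUN_VALUES = {
    ("3-2", "strike"): -0.294,  # most consequential strike
    ("1-0", "strike"): -0.035,  # least consequential strike
    ("3-2", "ball"): 0.234,     # most consequential ball
    ("0-2", "ball"): 0.021,     # least consequential ball
}


def pitch_run_value(count, outcome):
    """Look up the run value of a ball or strike thrown in a given count."""
    return QUOTED_RUN_VALUES[(count, outcome)]


# How much more costly is a ball on a full count than the same ball on 0-2?
full_count_ball = pitch_run_value("3-2", "ball")
ahead_ball = pitch_run_value("0-2", "ball")
ball_cost_ratio = full_count_ball / ahead_ball
print(f"A 3-2 ball costs about {ball_cost_ratio:.1f}x as much as an 0-2 ball")
```

By these numbers, the same missed location costs roughly 11 times as much on a full count as it does at 0-2, which is the context the model's target variable is meant to capture.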

Furthermore, after a ball is contacted, we should not care about the resulting outcome, whether a ball in play was a double in the gap or an out against an optimal defensive alignment. The same is true whether a batted ball ended up as a fly out to the warning track in a big park or a home run in a smaller one. We therefore convert the batted ball’s xwOBA to a run value to measure quality of contact. This is done by subtracting the average wOBA of each count from the xwOBA value and then scaling by the 2020 wOBA scale.
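The conversion described above can be sketched in a few lines of Python. This assumes "scaling" means dividing by the wOBA scale, which is the usual wOBA-to-runs convention; the function name and the example inputs are placeholders of mine, not the 2020 constants.

```python
def contact_run_value(xwoba, count_avg_woba, woba_scale):
    """Convert a batted ball's xwOBA into runs relative to the count's average.

    Assumption: "scaling" means dividing by the wOBA scale, the usual
    wOBA-to-runs convention.
    """
    if woba_scale <= 0:
        raise ValueError("woba_scale must be positive")
    return (xwoba - count_avg_woba) / woba_scale


# Placeholder inputs for illustration only (not the actual 2020 values).
example = contact_run_value(xwoba=0.500, count_avg_woba=0.320, woba_scale=1.2)
print(round(example, 3))
```

A batted ball with an xwOBA above the count's average comes out as a positive run value (bad for the pitcher), and weak contact comes out negative, regardless of where the ball actually landed.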

To account for other factors that influence the effectiveness of a pitch, dummy variables were added for right-handed batters, right-handed pitchers, and fastballs. For handedness, the dummy variables account for the platoon advantage while also accounting for the measurements of pitches breaking in vs. breaking away from the hitter. The fastball dummy variable is also important due to the amount of heterogeneity in fastballs. We don’t want the model to project Kyle Hendricks’ fastball as a good changeup based on his “stuff” metrics, for example. Pitchers generally base their arsenals on the quality and characteristics of their fastball, so knowing that base is important. The fastball dummy variable therefore also accounts for timing. Hitters can be late against Hendricks’ fastball not because they can’t handle 87 mph, but because of his secondary offerings. We don’t want to trick the algorithm.

Finally, to keep the model agnostic outside of each pitch’s physical characteristics, we did not want to label the pitches, either in buckets or individually.

The machine learning algorithm used was a random forest that was trained on a 50,000-observation sample, roughly 25% of the data set after removing NAs. The attributes that the model was trained on were a combination of “stuff” metrics (velocity, raw spin rate, vertical movement, horizontal movement) and command metrics (vertical location and horizontal location), combined with the aforementioned dummy variables.
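As a rough illustration of the inputs listed above, here is a Python sketch of how one pitch could be turned into a feature row. The field names, the example pitch, and especially the set of pitch codes treated as fastballs are my assumptions for the sketch; the text does not say which pitch types the fastball dummy covered.

```python
from dataclasses import dataclass

# Assumption: which Statcast pitch codes the fastball dummy covers.
FASTBALL_CODES = {"FF", "SI", "FC"}


@dataclass
class Pitch:
    velocity: float      # mph
    spin_rate: float     # raw RPM
    vert_move: float     # vertical movement
    horiz_move: float    # horizontal movement
    loc_z: float         # vertical location
    loc_x: float         # horizontal location
    pitcher_hand: str    # "R" or "L"
    batter_hand: str     # "R" or "L"
    pitch_code: str      # used only to derive the fastball flag


def feature_row(p):
    """Return the nine model inputs; the pitch label itself is not a feature."""
    return [
        p.velocity, p.spin_rate, p.vert_move, p.horiz_move,  # "stuff"
        p.loc_z, p.loc_x,                                    # command
        int(p.batter_hand == "R"),                           # batter dummy
        int(p.pitcher_hand == "R"),                          # pitcher dummy
        int(p.pitch_code in FASTBALL_CODES),                 # fastball dummy
    ]


# Hypothetical right-handed pitcher throwing a four-seamer to a lefty.
row = feature_row(Pitch(95.0, 2300.0, 16.0, -7.0, 2.5, 0.3, "R", "L", "FF"))
print(row)
```

Note that the pitch label only feeds the fastball flag, which keeps the sketch in line with the goal of not bucketing pitches by type.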

After confirming with a feature selection algorithm that all of the “stuff” and command attributes were significant at the 0.01 level, the model was ready to be run.

The top pitches of 2020 are listed below (min. 50 pitches of type):

Rank | Player | Team | Pitch Type | Pitches | xRV
1 | Garrett Crochet | White Sox | 4-Seam Fastball | 70 | -.044
2 | Joely Rodriguez | Rangers | Changeup | 60 | -.042
3 | Pierce Johnson | Padres | Curveball | 157 | -.039
4 | Tyler Alexander | Tigers | Cutter | 80 | -.039
5 | Framber Valdez | Astros | Curveball | 318 | -.038
6 | Chaz Roe | Rays | Slider | 85 | -.037
7 | Aaron Loup | Rays | Sinker | 154 | -.037
8 | Devin Williams | Brewers | “Changeup” | 219 | -.036
9 | Joe Kelly | Dodgers | Knuckle Curve | 98 | -.036
10 | Daniel Bard | Rockies | Slider | 113 | -.035

While a couple of these names may catch you by surprise, it should be known by now how nasty Garrett Crochet’s fastball is. The pitch averaged 100.1 mph with a spin rate of 2501 RPMs and had the 8th highest rise of any 4-seamer in baseball. The stuff metrics were so good that the fact that he tended to throw the pitch right down the middle simply did not matter. 

When he did happen to locate it well… good luck.

Valdez, Roe, Williams and Kelly’s pitches on this leaderboard likely aren’t a surprise to most, but how about 2-3-4? 

Joely Rodriguez spent much of 2018 and all of 2019 playing in Japan after control issues in 2017 with multiple clubs. It appears as though he may have learned a really good changeup overseas. He consistently kept it below the knees of hitters, and he paired it with a fastball that was a top-50 pitch according to xRV.

The 88 4-seamers that he threw ranked 41st (97th percentile) among all pitches. The pitch’s velocity was in the 56th percentile, but its spin rate was only in the 10th percentile. He did, however, have impactful horizontal movement that the algorithm seemingly liked, plus plenty of pitches in the shadow and chase zones. He also threw plenty of center-cut fastballs, but this is where his changeup could be helping his fastball, a phenomenon that future iterations of this model could try to account for.

Pierce Johnson is another reliever who returned from a stint in Japan with a new pitch. He changed the profile of his curve, making it slurvier with a velocity jump that followed his fastball. His curveball Whiff% was top-20 in baseball and the Padres have him throwing it more often than not.

After Johnson, it is very interesting that Tyler Alexander’s cutter is fourth on this list. It was his fourth most frequent pitch and his third most frequent fastball variation. His 4-seamer was good, but his sinker was not. Unlike the three pitchers ahead of him on this leaderboard, Alexander did not have the same kind of overall success (lots of blue ink on his Savant page), except for those 3 perfect innings. But he only threw 7 cutters out of 39 pitches through those consecutive strikeouts.

He did throw Moustakas a nasty one to get his first strikeout of the streak.

He typically attacked lefties with a slider/cutter combo, and it worked. A .225/.244/.275 slash will play. Alexander has always been a command over “stuff” pitcher (more on this later), and he looks to have run into some fly ball bad luck this year, meaning he could be an arm to watch in 2021. His cutter was usually down and away from the middle of the zone with very infrequent meatballs. Even as a “command guy”, he was in the 74th percentile of horizontal break with his cutter, so not bad. The last thing on Alexander is that his cutter is yet another new pitch added this season.

Devin Williams’ “changeup” really isn’t a changeup. I don’t believe a 2850 RPM changeup is allowed, so it should probably be labeled as a screwball moving forward, as many have suggested. Nonetheless, its nastiness has been well documented this year, and it has quite possibly made him the best reliever in baseball.  

You may think 50 pitches is too low of a minimum to confidently judge a pitch’s quality, so let’s bump it up by 100, to 150, and include more names.

Rank | Player | Team | Pitch Type | Pitches | xRV
1 | Pierce Johnson | Padres | Curveball | 157 | -.039
2 | Framber Valdez | Astros | Curveball | 318 | -.038
3 | Aaron Loup | Rays | Sinker | 154 | -.037
4 | Devin Williams | Brewers | Changeup | 219 | -.036
5 | Aaron Nola | Phillies | Knuckle Curve | 278 | -.035
6 | Julio Urias | Dodgers | Curveball | 173 | -.034
7 | Caleb Ferguson | Dodgers | 4-Seam Fastball | 216 | -.034
8 | Ryan Yarbrough | Rays | Cutter | 277 | -.034
9 | Carlos Carrasco | Indians | Changeup | 289 | -.034
10 | Kevin Gausman | Giants | 4-Seam Fastball | 434 | -.033
11 | Yu Darvish | Cubs | Slider | 163 | -.033
12 | Caleb Baragar | Giants | 4-Seam Fastball | 239 | -.033
13 | Nik Turley | Pirates | 4-Seam Fastball | 190 | -.032
14 | Trevor Bauer | Reds | 4-Seam Fastball | 437 | -.032
15 | Jake McGee | Dodgers | 4-Seam Fastball | 280 | -.032
16 | Liam Hendriks | Athletics | 4-Seam Fastball | 242 | -.032
17 | Jacob deGrom | Mets | 4-Seam Fastball | 469 | -.032
18 | Gerrit Cole | Yankees | 4-Seam Fastball | 539 | -.031
19 | Clayton Kershaw | Dodgers | Slider | 286 | -.031
20 | Lance McCullers Jr. | Astros | Knuckle Curve | 312 | -.030
21 | Brandon Woodruff | Brewers | 4-Seam Fastball | 362 | -.030
22 | Julio Urias | Dodgers | 4-Seam Fastball | 458 | -.030
23 | Jaime Barria | Angels | Slider | 211 | -.030
24 | Pete Fairbanks | Rays | Slider | 185 | -.030
25 | Max Fried | Braves | Slider | 162 | -.030

This leaderboard can also be subsetted by pitch type.

4-Seam Fastballs

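Rank | Player | Team | Pitches | xRV
1 | Caleb Ferguson | Dodgers | 216 | -.034
2 | Kevin Gausman | Giants | 434 | -.033
3 | Caleb Baragar | Giants | 239 | -.033
4 | Nik Turley | Pirates | 190 | -.032
5 | Trevor Bauer | Reds | 437 | -.032
6 | Jake McGee | Dodgers | 280 | -.032
7 | Liam Hendriks | Athletics | 242 | -.032
8 | Jacob deGrom | Mets | 469 | -.032
9 | Gerrit Cole | Yankees | 539 | -.031
10 | Brandon Woodruff | Brewers | 362 | -.030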


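Sinkers

Rank | Player | Team | Pitches | xRV
1 | Aaron Loup | Rays | 154 | -.037
2 | Jesus Luzardo | Athletics | 184 | -.029
3 | Trevor Rogers | Twins | 1685 | -.028
4 | Framber Valdez | Astros | 550 | -.028
5 | Ryan Thompson | Rays | 212 | -.027
6 | Chris Bassitt | Athletics | 338 | -.026
7 | Eric Yardley | Brewers | 237 | -.026
8 | Dustin May | Dodgers | 421 | -.026
9 | Kyle Hendricks | Cubs | 380 | -.026
10 | German Marquez | Rockies | 173 | -.026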


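Cutters

Rank | Player | Team | Pitches | xRV
1 | Ryan Yarbrough | Rays | 277 | -.034
2 | Josh Tomlin | Braves | 271 | -.029
3 | Lance Lynn | Rangers | 283 | -.029
4 | Yu Darvish | Cubs | 474 | -.029
5 | Colten Brewer | Red Sox | 196 | -.028
6 | Dustin May | Dodgers | 192 | -.027
7 | Nathan Eovaldi | Red Sox | 218 | -.027
8 | Shane Bieber | Indians | 192 | -.026
9 | Madison Bumgarner | Diamondbacks | 226 | -.026
10 | Dallas Keuchel | White Sox | 280 | -.024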


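Changeups

Rank | Player | Team | Pitches | xRV
1 | Devin Williams | Brewers | 219 | -.036
2 | Carlos Carrasco | Indians | 289 | -.034
3 | Zach Davies | Padres | 409 | -.027
4 | Zac Gallen | Diamondbacks | 203 | -.027
5 | Sean Manaea | Athletics | 217 | -.026
6 | Matthew Boyd | Tigers | 179 | -.025
7 | Matt Andriese | Angels | 175 | -.025
8 | Hyun Jin Ryu | Blue Jays | 292 | -.024
9 | Kyle Freeland | Rockies | 273 | -.024
10 | Jesus Luzardo | Athletics | 218 | -.022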


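Sliders

Rank | Player | Team | Pitches | xRV
1 | Yu Darvish | Cubs | 163 | -.033
2 | Clayton Kershaw | Dodgers | 286 | -.031
3 | Jaime Barria | Angels | 211 | -.030
4 | Pete Fairbanks | Rays | 185 | -.030
5 | Max Fried | Braves | 162 | -.030
6 | Brad Keller | Royals | 301 | -.029
7 | Zach Eflin | Phillies | 173 | -.028
8 | Jacob deGrom | Mets | 368 | -.027
9 | Justus Sheffield | Mariners | 263 | -.027
10 | Kenta Maeda | Twins | 349 | -.027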

Curveballs/Knuckle Curves

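Rank | Player | Team | Pitches | xRV
1 | Pierce Johnson | Padres | 157 | -.039
2 | Framber Valdez | Astros | 318 | -.038
3 | Aaron Nola | Phillies | 278 | -.035
4 | Julio Urias | Dodgers | 173 | -.034
5 | Lance McCullers Jr. | Astros | 312 | -.030
6 | Drew Smyly | Giants | 164 | -.029
7 | Tyler Glasnow | Rays | 310 | -.026
8 | Sonny Gray | Reds | 259 | -.026
9 | Gerrit Cole | Yankees | 175 | -.025
10 | Jordan Montgomery | Yankees | 153 | -.025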

Another implication of this model is that we can look at a pitcher’s arsenal while weighting their xRV results based on pitch type usage. This should tell us more about who was the most effective pitcher of 2020 by optimally utilizing their best pitches. This could also serve to credit their teams for maximizing their potential output. For this leaderboard, I arbitrarily restricted total pitches to a minimum of 400. Only pitches that were thrown a minimum of 50 times were considered part of the pitcher’s arsenal.
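A usage-weighted arsenal score along these lines could be sketched as follows. This is a simplified Python illustration, not the code behind the leaderboard: the function name is mine, the example pitcher is made up, and I am assuming the 400-pitch minimum applies to all pitches thrown rather than only to qualifying pitch types.

```python
def arsenal_xrv(pitch_types, min_type_pitches=50, min_total_pitches=400):
    """Usage-weighted mean xRV over a pitcher's qualifying pitch types.

    pitch_types maps a pitch label to (pitch_count, mean_xrv).
    Returns None when the pitcher does not qualify.
    Assumption: the total-pitch minimum is applied to all pitches thrown.
    """
    total_thrown = sum(count for count, _ in pitch_types.values())
    if total_thrown < min_total_pitches:
        return None
    # Keep only pitch types thrown often enough to count as part of the arsenal.
    arsenal = [(c, x) for c, x in pitch_types.values() if c >= min_type_pitches]
    arsenal_pitches = sum(c for c, _ in arsenal)
    if arsenal_pitches == 0:
        return None
    return sum(c * x for c, x in arsenal) / arsenal_pitches


# Hypothetical pitcher; counts and xRV values are made up for illustration.
example_pitcher = {
    "4-Seam Fastball": (300, -0.020),
    "Slider": (150, -0.030),
    "Changeup": (30, 0.010),  # under 50 pitches, so excluded from the arsenal
}
print(arsenal_xrv(example_pitcher))
```

Weighting by pitch count means a pitcher is rewarded for leaning on his best pitch, not just for owning one.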

Top 2020 Arsenals

Rank | Player | Team | Pitches | xRV
1 | Framber Valdez | Astros | 966 | -.030
2 | Tyler Glasnow | Rays | 843 | -.028
3 | Julio Urias | Dodgers | 741 | -.028
4 | Clayton Kershaw | Dodgers | 733 | -.027
5 | Dustin May | Dodgers | 718 | -.027
6 | Jacob deGrom | Mets | 1,009 | -.027
7 | Gerrit Cole | Yankees | 1,011 | -.027
8 | Josh Staumont | Royals | 416 | -.026
9 | Yu Darvish | Cubs | 1,023 | -.026
10 | Pete Fairbanks | Rays | 444 | -.025
11 | Walker Buehler | Dodgers | 465 | -.025
12 | Drew Smyly | Giants | 445 | -.025
13 | Trevor Bauer | Reds | 1,073 | -.025
14 | Nathan Eovaldi | Red Sox | 685 | -.025
15 | Ryan Yarbrough | Rays | 773 | -.024
16 | Kevin Gausman | Giants | 880 | -.024
17 | Garrett Richards | Padres | 743 | -.024
18 | Carlos Carrasco | Indians | 982 | -.024
19 | Josh Tomlin | Braves | 510 | -.023
20 | Carlos Estevez | Rockies | 405 | -.023
21 | Charlie Morton | Rays | 577 | -.023
22 | Trevor Rogers | Marlins | 492 | -.023
23 | Tarik Skubal | Tigers | 485 | -.022
24 | Jesus Luzardo | Athletics | 864 | -.022
25 | James Karinchak | Indians | 440 | -.021

Takeaways? A third of these pitchers are either Dodgers or Rays, the best teams in their respective leagues through the 2020 regular season.

Framber Valdez had the best arsenal in baseball when considering his curveball (100th percentile), his sinker (93rd percentile), and his changeup (50th percentile). His curve was a massive whiff generator while hitters pounded the latter two pitches into the ground. His curveball had the 23rd highest Whiff% (min. 100 pitches) while his sinker had the 13th highest GB% (min. 50 BBEs), and his changeup had the 15th highest GB% (min. 15 BBEs).

The rest of the leaderboard consists of some NL Cy Young hopefuls, a couple of power relievers, and intriguingly, 5 pitchers who could see the open market this winter in Smyly, Bauer, Gausman, Richards and Morton. 

The model as presented combines both “stuff” and command pitch attributes, but what if we want to try to isolate effective command? By giving every pitcher average stuff, we can see how well their command would still allow them to minimize xRV. While this method is probably cruder than other attempts to quantify command, holding “stuff” constant can still provide added context to some of the results seen in the above leaderboards.

For example, in an at-bat where a right-handed pitcher is throwing to a right-handed batter, each pitcher would be equipped with a 93.9 mph 4-seam fastball with a 2317 RPM spin rate. We can then analyze which pitchers would do the best with “stuff” metrics at the 50th percentile (the median).
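One way to set this up, sketched in Python below, is to overwrite each pitch's “stuff” metrics with the median for its pitch type and handedness matchup while leaving location alone, then re-score the adjusted pitches with the trained model. The dictionary keys, the grouping, and the sample pitches are assumptions for the sketch rather than the exact procedure I used.

```python
from collections import defaultdict
from statistics import median

STUFF_KEYS = ("velocity", "spin_rate", "vert_move", "horiz_move")


def hold_stuff_constant(pitches):
    """Replace each pitch's stuff metrics with its group's median.

    Groups are (pitch_type, pitcher_hand, batter_hand); locations are untouched.
    """
    groups = defaultdict(list)
    for p in pitches:
        groups[(p["pitch_type"], p["pitcher_hand"], p["batter_hand"])].append(p)

    # Median of every stuff metric within each group.
    medians = {
        key: {k: median(p[k] for p in members) for k in STUFF_KEYS}
        for key, members in groups.items()
    }

    adjusted = []
    for p in pitches:
        key = (p["pitch_type"], p["pitcher_hand"], p["batter_hand"])
        adjusted.append({**p, **medians[key]})  # copy, then overwrite stuff
    return adjusted


# Three hypothetical right-on-right four-seamers (values are made up).
sample = [
    {"pitch_type": "FF", "pitcher_hand": "R", "batter_hand": "R",
     "velocity": 92.0, "spin_rate": 2200.0, "vert_move": 14.0,
     "horiz_move": -6.0, "loc_x": 0.1, "loc_z": 2.4},
    {"pitch_type": "FF", "pitcher_hand": "R", "batter_hand": "R",
     "velocity": 94.0, "spin_rate": 2300.0, "vert_move": 16.0,
     "horiz_move": -8.0, "loc_x": -0.7, "loc_z": 3.3},
    {"pitch_type": "FF", "pitcher_hand": "R", "batter_hand": "R",
     "velocity": 97.0, "spin_rate": 2500.0, "vert_move": 18.0,
     "horiz_move": -7.0, "loc_x": 0.6, "loc_z": 1.6},
]
leveled = hold_stuff_constant(sample)
print([(p["velocity"], p["loc_x"], p["loc_z"]) for p in leveled])
```

After this step every pitch in a group shares the same velocity, spin and movement, so any remaining difference in predicted xRV comes from where the pitch was located.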

Remember how Tyler Alexander’s unassuming cutter was the 4th best overall pitch? This leaderboard will help explain why:

Holding Stuff Metrics Constant – Effect of Command on xRV (By Pitch Type)

Rank | Player | Team | Pitch Type | Pitches | xRV
1 | Tyler Alexander | Tigers | Cutter | 80 | -.036
2 | Michael Lorenzen | Reds | Cutter | 106 | -.033
3 | Charlie Morton | Rays | Cutter | 57 | -.032
4 | Ross Stripling | Dodgers/Blue Jays | Slider | 110 | -.032
5 | Wade LeBlanc | Orioles | Cutter | 94 | -.032
6 | Brett Anderson | Brewers | Knuckle Curve | 75 | -.031
7 | Tyler Clippard | Twins | Changeup | 138 | -.031
8 | Mike Kickham | Red Sox | Slider | 121 | -.031
9 | Jeffrey Springs | Red Sox | Slider | 107 | -.031
10 | Zack Greinke | Astros | Curveball | 108 | -.030

By xRV, Alexander’s cutter was the best commanded pitch in baseball. With two-thirds of his cutters thrown against right-handed hitters, this is where he put them.

If we give Alexander an average cutter for a left-handed pitcher throwing to a right-handed hitter, it would be an 87 mph pitch with a 2259 RPM spin rate, which is not too dissimilar from his. However, it wouldn’t have 74th percentile horizontal movement, as his does. Even if all of its “stuff” metrics were average, it would still be a top-10 pitch in baseball.

In terms of pitcher arsenals, we can do the same exercise to isolate each pitcher’s command as an effect on xRV. The same qualifications apply when looking at arsenals — a pitcher must have thrown at least 400 pitches and each pitch type thrown at least 50 times is considered.

Holding Stuff Metrics Constant – Effect of Command on xRV (By Pitcher Arsenals)

Rank | Player | Team | Pitches | xRV
1 | Josh Tomlin | Braves | 510 | -.023
2 | Clayton Kershaw | Dodgers | 733 | -.023
3 | Nathan Eovaldi | Red Sox | 685 | -.022
4 | Yu Darvish | Cubs | 1,023 | -.022
5 | Carlos Estevez | Rockies | 405 | -.021
6 | Charlie Morton | Rays | 577 | -.021
7 | Ross Stripling | Dodgers/Blue Jays | 754 | -.021
8 | Zach Plesac | Indians | 715 | -.020
9 | Drew Smyly | Giants | 445 | -.020
10 | Gerrit Cole | Yankees | 1,011 | -.020
11 | Deivi Garcia | Yankees | 437 | -.020
12 | Jordan Lyles | Rangers | 902 | -.020
13 | Dylan Bundy | Angels | 956 | -.020
14 | Thomas Eshelman | Orioles | 466 | -.020
15 | Chris Bassitt | Athletics | 856 | -.020
16 | Aaron Civale | Indians | 1,108 | -.020
17 | Mike Fiers | Athletics | 861 | -.019
18 | Kevin Gausman | Giants | 880 | -.019
19 | Kyle Hendricks | Cubs | 1,079 | -.019
20 | Michael Wacha | Mets | 548 | -.019
21 | German Marquez | Rockies | 1,149 | -.019
22 | Zach Wheeler | Phillies | 983 | -.019
23 | Travis Lakins Sr. | Orioles | 427 | -.019
24 | Dustin May | Dodgers | 718 | -.019
25 | Matt Wisler | Twins | 411 | -.019

Here’s where Tomlin located his top-three pitches.

It would be a challenge to locate 95% of your pitches any better. The model showed Tomlin as having just about average “stuff” anyway, so this exercise did not have much effect on his arsenal’s overall xRV. He had a -.029 xRV on his cutter that ranked 76th among all pitches (95th percentile), but his curveball and 4-seamer did not fare nearly as well. This underscores the typical importance of above average stuff even with elite command.

The results of the different aspects of the model pass the eye test, but I wanted to do my due diligence in testing its performance. To do so, I tested the model’s Root Mean Squared Error (RMSE), which was 0.063, a value that affirmed my confidence in the model. The RMSE is on the same scale as the dependent variable, which in this case is run value.
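For readers less familiar with the metric, RMSE is the square root of the average squared gap between actual and predicted values. A minimal Python version is below; the function name and the sample numbers are illustrative and are not the model's actual predictions.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical run values and predictions, for illustration only.
actual_rv = [0.10, -0.05, 0.00]
predicted_rv = [0.04, -0.02, -0.03]
print(rmse(actual_rv, predicted_rv))
```

Because the errors are squared before averaging, a handful of badly missed pitches moves RMSE more than many small misses do.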

With a model that I’m confident in, I still realize that it’s not perfect.

A pitcher’s fastball can play up due to the quality of his secondary offerings. Moving forward, more specific feature selection for each pitch type could further capture the expected effectiveness of each pitch. Other aspects like sequencing, release point and pitch tunneling undoubtedly play a role in a pitch’s quality, so an attempt to control for these factors in future model iterations is something I will need to look into. The key is to account for as many factors that make sense as possible given what we are trying to measure. Otherwise, overfitting is the issue. We don’t want a context-neutral model, but we also do not want to include every factor surrounding the pitch even though we can measure it.

As for the model’s application, I think it could help a team optimize their pitchers’ pitch usage based on xRV. A potentially more accurate measurement of expected performance could also change the roles in which certain pitchers are used, which in turn could affect a pitcher’s value. For each pitcher, there are more specific improvements within pitch design that teams will look to create, but this model should provide a more detailed overview of performance.

In the near future, I plan to apply a version of this model to NCAA D1 pitches through my continued work with BaseballCloud. This will be after controlling for the significant differences between the two levels.
