Posts

Evaporating Errors, Part 1

MLB fielders have gotten A LOT better at their job in the past five years, at least according to one key metric. In 2019, there were 2,900 errors across baseball, working out to an average of 1.19 per game. In 2024, there were 2,600 (1.07 per game), a decline of more than 10% in five years. In 2023, at the recent nadir of this trend, there were just 2,500 (1.04 per game), a 13% decline from 2019. [Charts: Total Errors Per Year and Errors Per Game, 2015 - 2024, AKA The Statcast Era] Note: From here on in this analysis, "Per Game" can more intuitively be read as "Per Play Ball!". Think of this figure not as the number of errors a given team will make in a game, but as the number of errors you can expect to see when you show up to the park (in other words, all the errors committed over the course of the whole game by both teams). Indeed, there was plenty of chatter in 2023 about what exactly was going on with official mistakes in MLB. Kyle Glaser ove...
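As a quick check on the percentages quoted above, here is a minimal sketch of the decline arithmetic. It uses only the rounded figures from the excerpt, so the results are approximate; `pct_decline` is a hypothetical helper name, not something from the post.

```python
def pct_decline(old, new):
    """Percent drop from `old` to `new` (hypothetical helper for illustration)."""
    return (old - new) / old * 100

# Rounded per-game error rates quoted in the excerpt
decline_2024 = pct_decline(1.19, 1.07)  # 2019 -> 2024
decline_2023 = pct_decline(1.19, 1.04)  # 2019 -> 2023

print(f"2019 -> 2024: {decline_2024:.1f}% fewer errors per game")
print(f"2019 -> 2023: {decline_2023:.1f}% fewer errors per game")
```

With these inputs the declines come out to roughly 10% and 13%, in line with the figures in the text.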

Diamond Plots and Diamonds In The Rough, Part 5

Let's put a bow on our holiday season examination of prospect values (see what I did there?) by retreating ever so slightly to the comforts of a simple scatter plot. In the third part of our series, we built out the list of z score values for the hitting and pitching prospect groups of all 30 MLB teams from 2019 - 2024 (data courtesy of Fangraphs). That full list of 180 z scores (30 teams, 6 seasons) can be found HERE on my GitHub. We then pivoted last week in Part 4 to roll up those z scores by team and look at cumulative success across the timeframe. Let's undo that quickly and go back to looking at all 180 team-season combos in graph form. This way, we can really see how hitter and pitcher z scores interact with one another, further informing our understanding of how teams build their farm systems. To start, let's simply put all 180 team-seasons on a graph, like so: I've added in lines that cross at (0, 0) so we can see the...

Diamond Plots and Diamonds In The Rough, Part 4

Happy holidays to all! Let's celebrate with some data visualizations. Last week we took a step further into the world of z scores and prospect values. Examining the 2019 - 2024 timeframe with data from Fangraphs, we looked at the relative rank of each organization's class of prospects by year via z scores. Normalizing prospect values this way then helped us make comparisons across years, adding a new angle of analysis and helping us celebrate the very best of the best (and, at times, the worst of the worst...). That full table can be found HERE on my GitHub. Let's take that CSV into the shop for an upgrade today. First, we'll sum up each of the z scores by organization, to provide a sort of "total prospect value" across the timeframe. These "Total Prospect Z Scores" give us a window into which teams were the most successful at developing prospects from 2019 - 2024. The higher an organization's cumulative z score, the more they outperformed t...

Diamond Plots and Diamonds In The Rough, Part 3

Building off of last week's post - where we moved the lens towards MLB farm system value, not just prospect count - let's take a look at z scores by organization and by year. Here, we'll run the same exercise as we did last week, creating a z score value (the number of standard deviations above or below the mean) but for each year, getting us 30 teams x 6 years = 180 observations of z scores. Critically, each z score is anchored to the season in question. That is to say, the value of a farm system's hitting, pitching and total prospect count is benchmarked to the league-wide values from that year alone. I think measuring z scores within years like this helps us more accurately track population-level changes in prospecting. Normalizing within each year ultimately helps us make comparisons across years that reflect newfound understanding of valuation at a macro level. For instance, comparing the nominal values of SFG's 2020 hitting prospects ($191.5M) and Milwaukee's 202...
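The within-year normalization described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code behind the post: the team labels and dollar values are made up, `zscores_within_year` is a hypothetical helper, and the population standard deviation is assumed because the excerpt does not say which version was used.

```python
from collections import defaultdict
from statistics import mean, pstdev

def zscores_within_year(rows):
    """Map (team, year) -> z score, benchmarking each value against
    the league-wide mean and standard deviation of its own season.

    `rows` is a list of (team, year, value) tuples. The population
    standard deviation (pstdev) is an assumption of this sketch.
    """
    values_by_year = defaultdict(list)
    for _team, year, value in rows:
        values_by_year[year].append(value)

    # One (mean, sd) pair per season
    stats = {yr: (mean(v), pstdev(v)) for yr, v in values_by_year.items()}

    result = {}
    for team, year, value in rows:
        mu, sigma = stats[year]
        # Guard against a season where every value is identical
        result[(team, year)] = (value - mu) / sigma if sigma else 0.0
    return result

# Made-up values: the second season is on a 10x smaller scale
rows = [
    ("AAA", 2020, 100.0), ("BBB", 2020, 200.0), ("CCC", 2020, 300.0),
    ("AAA", 2021, 10.0), ("BBB", 2021, 20.0), ("CCC", 2021, 30.0),
]
z = zscores_within_year(rows)
```

In this toy data the nominal values differ by a factor of ten between seasons, yet each team gets the same z score in both years, which is the property that makes cross-year comparisons possible.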

Diamond Plots and Diamonds In The Rough, Part 2

Last week, we introduced diamond plots and very quickly looked at applying the visual to MLB farm system composition. From that post: "Let's use diamond plots to get a sense of the varying ways MLB farm systems are built, namely, the number of (ranked) hitters vs. (ranked) pitchers in a given team's system. This will be a quantity exercise to start - the value of those players will be the focus of next week's post. We'll use the Farm System Rankings from Fangraphs 2024 Preseason Prospect Report to get the breakdown of each system. Remember, these are essentially the prospects of note in a system (ranked with a 35+ FV or higher), not a comprehensive look at how many players each organization has rostered." Here's what we got: Hard to quibble with it from a visual perspective - nice even spread, some clear winners and losers, and even some high-level trends to parse out. Be sure to check out my post from last week for some more thoughts there. Which brin...

Diamond Plots and Diamonds in the Rough

Well, blog, it's been a while! Here's a quick one to get back on the board. My hope is to make posts like these much more of a regular occurrence, though, so stay tuned! Today we'll be working with diamond plots in R. This post is much more proof of concept than anything earth-shattering, but the reps are needed, so away we go. First, thanks to Owen Phillips of the F5 for his "How To: Diamond Plots in R" post, which I followed to a significant degree to build this. Traditional X-Y plots are not necessarily the most intuitive way to compare data points within a graph. By rotating the plot 45 degrees, we better align the graph's visuals with our own human nature - namely, making quality judgements top to bottom in a descending order. Here, let's use diamond plots to get a sense of the varying ways MLB farm systems are built, namely, the number of (ranked) hitters vs. (ranked) pitchers in a given team's system. This will be a quantity exercise to start - the ...
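To see why a 45 degree rotation lines up with top-to-bottom reading, here is a minimal sketch of the geometry alone. It is written in Python rather than R purely for illustration, `rotate_45` is a hypothetical helper, and it is not the method from the referenced how-to post. The counterclockwise direction and the choice of hitters on x and pitchers on y are assumptions of this sketch.

```python
import math

def rotate_45(x, y):
    """Rotate the point (x, y) by 45 degrees counterclockwise about the origin."""
    c = math.cos(math.pi / 4)
    s = math.sin(math.pi / 4)
    return (x * c - y * s, x * s + y * c)

# Made-up example: a system with 20 ranked hitters and 10 ranked pitchers
new_x, new_y = rotate_45(20, 10)

# After rotation:
#   new_y is (x + y) / sqrt(2), so height tracks the combined total
#   new_x is (x - y) / sqrt(2), so left/right tracks the hitter/pitcher balance
print(round(new_x, 3), round(new_y, 3))
```

Under these assumptions, a team that is strong on both axes lands near the top of the diamond, and a perfectly balanced team sits on the vertical center line.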