November 19, 2008
A quick look at baserunning
You're probably looking around going, "Where's my roundtable?" And you will have a roundtable. Probably Friday. In the meantime, I'm laying out some finger sandwiches and lemonade - a light afternoon snack, if you'd like. Partake if you wish.
So I have a baserunning evaluation metric, measured in runs above/below average. Nothing fancy or special, really. Dan Fox has covered this ground a lot better than I have. (And that's just the tip of the iceberg.) So here's how I do it:
- Start with Retrosheet play-by-play data.
- Calculate run expectancy separately for each base, like this, for each season.
- Looking only at the lead baserunner, calculate the average destination run expectancy for each event. Everything was broken down by the following categories:
  - Number of outs remaining,
  - Event code (single, double, out, wild pitch, etc.),
  - Batted ball type,
  - Whether the batter was bunting,
  - Whether the ball was hit to the battery (pitcher/catcher), an infielder or an outfielder,
  - Whether the ball was hit to the left or right side of the field.
- Compare what a player did to the average.
Let's say you have a runner on first, no outs. Most of the time a runner ends up on second, some of the time on third, when a ground-ball single is hit into left field. If a runner ends up on second, he gets a (very slight) debit. If he ends up on third, he gets a credit. All of these changes are tracked and totaled up.
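For the code-minded, here's a rough sketch of that bookkeeping. It is not my actual code, just a minimal illustration: the function name, the dictionary keys and the run expectancy values (0.5 for ending up on second, 1.0 for third) are all made up for the example, and a real version would pull the per-base run expectancy from the Retrosheet-derived tables for each season.

```python
from collections import defaultdict

def baserunning_plus_minus(plays):
    """Credit each lead runner with (destination RE - average destination RE).

    `plays` is a list of dicts with hypothetical keys:
      'runner'    - player ID of the lead baserunner
      'situation' - hashable tuple of the breakdown categories
      'dest_re'   - run expectancy of the base the runner ended up on
    """
    # Pass 1: average destination run expectancy for each situation.
    sums = defaultdict(float)
    counts = defaultdict(int)
    for play in plays:
        sums[play["situation"]] += play["dest_re"]
        counts[play["situation"]] += 1
    average = {s: sums[s] / counts[s] for s in sums}

    # Pass 2: total each runner's difference from that average.
    plus_minus = defaultdict(float)
    for play in plays:
        plus_minus[play["runner"]] += play["dest_re"] - average[play["situation"]]
    return dict(plus_minus)

# Runner on first, no outs, ground-ball single to left field.
# Tuple order: outs, event, batted ball, bunt, fielder group, side.
situation = (0, "single", "ground", False, "outfield", "left")

# Placeholder run expectancy values, NOT real numbers.
RE_SECOND, RE_THIRD = 0.5, 1.0

plays = [
    {"runner": "runner_a", "situation": situation, "dest_re": RE_SECOND},
    {"runner": "runner_b", "situation": situation, "dest_re": RE_SECOND},
    {"runner": "runner_c", "situation": situation, "dest_re": RE_SECOND},
    {"runner": "runner_d", "situation": situation, "dest_re": RE_THIRD},
]

result = baserunning_plus_minus(plays)
print(result["runner_a"])  # -0.125 (small debit for stopping at second)
print(result["runner_d"])  # 0.375 (credit for taking third)
```

With three runners stopping at second and one taking third, the average destination is worth 0.625 runs in this toy setup, so the debit for the common outcome is small and the credit for the extra base is larger. The credits and debits within a situation always sum to zero.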
Simple and easy, right? Here are the top ten baserunning +/- seasons, 1953-2007:
| YEAR_ID | PLAYER_ID | Name | TEAM_ID | PLUS_MINUS |
|---|---|---|---|---|
| 1965 | flooc101 | Curt Flood | SLN | 12 |
| 1976 | patef101 | Freddie Patek | KCA | 12 |
| 2004 | erstd001 | Darin Erstad | ANA | 11 |
| 1991 | molip001 | Paul Molitor | MIL | 10 |
| 1978 | puhlt001 | Terry Puhl | HOU | 10 |
| 2000 | goodt001 | Tom Goodwin | COL | 10 |
| 1987 | browj001 | Jerry Browne | TEX | 10 |
| 1974 | bochb001 | Bruce Bochte | CAL | 10 |
| 1957 | blasd101 | Don Blasingame | SLN | 10 |
| 1976 | leflr101 | Ron LeFlore | DET | 10 |
You'll note that the best baserunning season of the Retroera was only worth 12 runs above average. Obviously you'd prefer a good baserunner to a bad baserunner, all else being equal, but baserunning definitely takes a back seat to hitting and defense.
Ten worst seasons?
| YEAR_ID | PLAYER_ID | Name | TEAM_ID | PLUS_MINUS |
|---|---|---|---|---|
| 2007 | lodup001 | Paul Lo Duca | NYN | -9 |
| 1959 | thomf103 | Frank Thomas | CIN | -9 |
| 1980 | cruzj001 | Jose Cruz | HOU | -9 |
| 1965 | johnd103 | Deron Johnson | CIN | -9 |
| 1962 | brutb101 | Bill Bruton | DET | -9 |
| 1976 | sizet101 | Ted Sizemore | LAN | -10 |
| 1974 | darwb101 | Bobby Darwin | MIN | -10 |
| 1999 | stanm002 | Mike Stanley | BOS | -10 |
| 1965 | fairr101 | Ron Fairly | LAN | -10 |
| 1964 | bertd101 | Dick Bertell | CHN | -13 |
UPDATE: This is too large for an EditGrid, so here's a full spreadsheet, including career totals. Requires something that can read Excel files. Best I can do for y'all right now.
Discussion
5 Comments on "A quick look at baserunning"
#2
Posted by Pizza Cutter, November 19, 2008 3:15 PM
But he must be a good and valuable player... he's fast!
#3
Posted by Colin Wyers, November 19, 2008 3:22 PM
I'm reluctant to add any more adjustments the way I currently do it, Brian, because then you start to really shave down the sample sizes on the state-to-state transitions. Especially since I'm doing it season-by-season, it starts to get dangerous if you drill down any more. I'm sure there's a way to handle it better, but a lot more work would have to go into it.
And sportswriters say stuff like that, PC, but when it comes down to brass tacks, like the MVP award, they vote a Big Damn Slugger with no other positives second. It's all lip service.
#4
Posted by dan, November 20, 2008 12:37 AM
"they vote for a Big Damn Slugger"
Like Dustin Pedroia?
I agree with you for the most part btw, I'm just giving you a hard time.
#5
Posted by Brian Cartwright, November 21, 2008 12:26 PM
This is an idea that I've actually had for close to 25 years, since I kept statistics and had all the play by play for a college summer league, but it's still on my to-do list.
The way I have it conceptualized, if there's a rare grouping of events, the expected value will not be as accurate because it's based on a much smaller sample size - but, if the player's samples are weighted by how often that player is in each situation, then the effect in the final rating of a larger variance in the expected value of any subgroup will be minimized by the weighting.
So, of any groupings that you have, calculate their expected rate (the league mean over x number of seasons). Find the number of times that a player was in each situation, and calculate the player's weighted overall expected rate, then compare to the player's observed rate, and convert to runs.












