ElViento: No sport lends itself to statistical analysis quite like baseball. In baseball, individual performance can be quantified independent of the performance of one’s teammates much more easily than in other sports.* A quarterback depends on his offensive line and wide receivers (and offensive coordinator, etc.) to put up big passing numbers, whereas a .300 hitter can hit .300 even if placed in a lineup with eight total scrubs.
*Which makes it so wacky that the stats that are teammate-dependent (Runs, Runs Batted In, Wins for pitchers) are among the most-cited stats when determining things like MVP and Cy Young awards. But I digress.
So-called sabermetrics haven’t gotten much of a foothold in the college game yet, which is probably why you see things like rampant sacrifice bunting in a league in which you are allowed to use a metal bat and a designated hitter. Sacrifice bunting is usually a dumb idea even in lower-scoring, wood-bat, pitcher-hitting leagues, and if you don’t believe that, you’ve clearly never seen a run expectancy chart.
Anyway, I’m off-topic again. So before I start talking about something completely random, like how good Zombieland was (fucking amazing), let’s get down to brass tacks.
College baseball is harder to analyze statistically than the professional game, because a single season is such a small sample (50-some games, as opposed to 162), and because careers are so much shorter (four years max, as opposed to 10+). Still, I think there are a few sabermetric concepts which might be interesting to apply to the Cougars and Conference USA in order to determine what we can expect from the upcoming season. And some of the following won’t be sabermetric at all, but will simply be another way of looking at numbers. Feeling nerdy? Let’s dive right in.
Idea #1 – BABIP
Yup, we’re getting math-y right away, with a statistic called BABIP, or Batting Average on Balls In Play. The concept, possibly first devised by a dude named Voros McCracken (which practically has to be a pseudonym), is that pitchers can only control certain things, like the frequency with which they strike out opposing hitters, walk them, and give up home runs. Statistical analysis suggests that the numbers a pitcher puts up in these areas stay relatively steady from year to year, and true, lasting improvement has to come from improving upon these numbers. Conversely, if the opposing hitter puts a ball in play (doesn’t walk, doesn’t strike out, doesn’t hit it out of the park), a pitcher has very little control over what happens. Oftentimes an MLB pitcher will allow one of the highest BABIPs in the league one year and one of the lowest the next, even though his controllable, peripheral statistics remain constant, perhaps just due to sheer, dumb luck.
So if we can expect pitching staffs to allow a BABIP that is near the league mean, we can expect teams that allowed fluky low BABIPs to struggle a little more this year, and teams that allowed unluckily high BABIPs to benefit from a regression to the mean. The same concept applies to hitting. (For the following analysis, I will use the formula BABIP = (H-HR)/(AB-HR-K), where H is hits, HR is home runs, AB is at-bats, and K is strikeouts.)
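For the code-inclined, here is a minimal Python sketch of that BABIP formula. The function name and the stat line fed into it are made up purely for illustration; they are not any real team’s numbers.

```python
def babip(hits, home_runs, at_bats, strikeouts):
    """Batting average on balls in play: (H - HR) / (AB - HR - K)."""
    balls_in_play = at_bats - home_runs - strikeouts
    if balls_in_play <= 0:
        raise ValueError("no balls in play to measure")
    return (hits - home_runs) / balls_in_play


# Hypothetical season totals, invented for this example only.
example = babip(hits=600, home_runs=60, at_bats=2000, strikeouts=400)
print(f"{example:.3f}")  # 540 / 1540 -> 0.351
```

The same function works for a pitching staff: just feed it the hits, home runs, at-bats, and strikeouts the staff allowed.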
I accounted for all statistics accumulated by all C-USA schools, both hitting-wise and pitching-wise. For hitters, East Carolina was the “luckiest” team with a .385 BABIP, and Central Florida was the unluckiest at .325. Eliminating the two outliers, C-USA hitters put together a .344 BABIP. (For reference purposes, all of MLB put up a .303 BABIP in 2009.)
For pitchers, Tulane had the luckiest staff (.303 BABIP) and UCF was again the unluckiest (.390). If we eliminate the outliers, we get a C-USA .334 BABIP.
So aside from the extreme examples, C-USA hit and pitched BABIPs that are just .010 different, which is a difference of one hit every 100 balls in play. That difference is probably due to C-USA having slightly better defense, on average, than its opponents. (Defense is the X-factor here. Teams with better defenses will allow lower BABIPs, because their defensive players will get to more balls. Hence, a team like Rice will continue to have a slightly “lucky”-looking BABIP every year, because their defense is just awesome.)
What it means: The teams that seemed to be significantly lucky or unlucky in terms of BABIP in 2009 are as follows: ECU (lucky hitters), Rice (lucky hitters [.366 BABIP] and pitchers [.310 BABIP]), UCF (unlucky hitters and pitchers), Houston (unlucky pitchers [.359 BABIP]). So, one might expect the unlucky teams to fare better in 2010, and the lucky teams to do a bit worse.
Idea #2 – Pythagorean W/L
A slightly simpler concept than BABIP, Pythagorean W/L is predicated on the radical idea that teams which outscore their opponents over the course of the year will generally win more often than not. By way of example, if a team scores the same number of runs that it allows, but wins 60% of its games, chances are that said team got pretty lucky.
The simplest formula for determining a team’s Pythagorean (or expected) winning percentage is (RS^2)/(RS^2+RA^2), where RS is runs scored and RA is runs allowed. Taking the league as a whole, C-USA played to a 286-245 overall record, outscoring opponents 3,672-3,361. That gives us an expected record of 289-242. Not bad. Surprisingly, no team in C-USA differed from its Pythagorean W-L by more than three games. Two teams (Houston and UCF) over-performed by three games, and nobody under-performed by more than two. So we’ll give UH and UCF lucky check marks, nobody an unlucky check mark, and move on.
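If you want to check the league-wide math yourself, here is a short Python sketch of the Pythagorean formula, run on the C-USA totals quoted above. The function and variable names are just illustrative choices.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2):
    """Expected winning percentage: RS^2 / (RS^2 + RA^2)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# League-wide 2009 totals from the article.
games = 286 + 245
pct = pythagorean_pct(3672, 3361)
expected_wins = round(pct * games)
expected_losses = games - expected_wins

print(f"{pct:.3f} -> {expected_wins}-{expected_losses}")  # 0.544 -> 289-242
```

Subtracting a team’s expected wins from its actual wins gives the over- or under-performance in games.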
Idea #3 – Experience Matters
Let’s move away from sabermetrics for a second now. While impact newcomers show up every year without fail, experienced players are still generally better than inexperienced ones. So let’s take a look at which teams return the most in terms of players from a year ago. We’ll take a look at the offense and the pitching staff. Offensively, we’ll look at what percentage of a team’s 2009 at-bats return, and for pitching we’ll use innings pitched. This is a pretty crude method: it won’t take into account things like talented players who were injured coming back (Rob Segedin of Tulane) or injured players who are on their team’s roster but will miss at least part of the season (Jared Ray of Houston, Mike Ojala of Rice). Still, it should give us at least a basic idea of who has experienced players heading into the season.
Using this metric, Conference USA as a whole returns 63.3% of both its hitting and pitching from a year ago. No joke. So taking that as the baseline, let’s look at which teams return the most and least from last year. (Taking returning hitting and pitching and averaging the two.)
- Rice: 80%
- Houston: 77%
- Memphis: 77%
- UAB: 69%
- Marshall: 67%
- ECU: 65%
- Southern Miss: 54%
- Central Florida: 40%
- Tulane: 40%
Looking only at the teams that are significantly away from the mean, Rice, UH and Memphis have noticeable advantages in terms of returning experience, and Southern Miss, UCF and Tulane have noticeable disadvantages.
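That sorting can be sketched in a few lines of Python. The percentages and the 63.3% baseline come straight from the list above, but the 8-point cutoff is my own assumption, picked only because it reproduces the same grouping; no exact threshold was used in the write-up.

```python
# Average of returning at-bats and innings pitched, in percent (from the list above).
RETURNING_PCT = {
    "Rice": 80, "Houston": 77, "Memphis": 77, "UAB": 69, "Marshall": 67,
    "ECU": 65, "Southern Miss": 54, "Central Florida": 40, "Tulane": 40,
}
LEAGUE_BASELINE = 63.3  # league-wide returning percentage from the article
CUTOFF = 8.0            # assumption: points away from baseline that count as "significant"


def split_by_experience(returning, baseline, cutoff):
    """Return (advantaged, disadvantaged) teams relative to the baseline."""
    advantaged = [t for t, p in returning.items() if p - baseline >= cutoff]
    disadvantaged = [t for t, p in returning.items() if baseline - p >= cutoff]
    return advantaged, disadvantaged


more, less = split_by_experience(RETURNING_PCT, LEAGUE_BASELINE, CUTOFF)
print("More experienced:", ", ".join(more))
print("Less experienced:", ", ".join(less))
```

A different cutoff would move the borderline teams around (Southern Miss sits only about nine points under the baseline), so treat the grouping as a rough guide.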
So with these factors in mind, let’s take a look at last year’s team records (sorted by overall winning percentage), with the factors we’ve just looked at noted parenthetically:
Rice: 43-18, 70% (Good: Most experience. Bad: Possibly lucky at both hitting and pitching.)
ECU: 46-20, 70% (Bad: Possibly lucky at hitting)
Southern Miss: 40-26, 61% (Bad: Not much experience.)
Tulane: 34-25, 58% (Bad: Tied for least experience.)
UAB: 31-26, 54% (None)
Houston: 27-31, 47% (Good: Experienced, unlucky pitching. Bad: Slightly out-performed pythag.)
Marshall: 22-32, 41% (None)
Memphis: 21-32, 40% (Good: Experienced)
Central Florida: 22-35, 39% (Good: Unlucky hitting and pitching. Bad: Slightly out-performed pythag., tied for least experience.)
Looking at this, what would you think by way of a power ranking for C-USA for the upcoming season? Probably keep Rice-ECU at 1-2, given how much better they were than everybody else. With Rice having the experience edge, you’d keep them at #1, even though ECU won the C-USA regular season title by a game last year. Southern Miss is inexperienced, but not fatally so. So with a 6-win advantage over anybody else, you probably keep them where they are. Tulane, UAB and Houston are all pretty close, and the Cougars have the biggest pluses, so you move them to #4. You probably have UAB leap-frog Tulane, given the Green Wave’s lack of experience. Memphis takes the small step over Marshall due to experience, and UCF stays in last with a lack of experience, despite the possibility of Lady Luck turning their way in 2010. That would give you a 1-9 power ranking that looks like this:
- Rice
- ECU
- Southern Miss
- Houston
- UAB
- Tulane
- Memphis
- Marshall
- UCF
Switch UAB for Tulane, and Memphis for Marshall, and you have the exact order I listed the teams in for my pre-season power rankings. (Before I had looked up or taken any of this into account.) I’ll justify Tulane staying at the #5 spot due to the fact that Segedin is back (even though I didn’t mention that in my write-up…d’oh!) and did put up a .322/.414/.485 line as a freshman in 2008. I wouldn’t argue with anybody moving the Blazers up to the 5-spot, however. I’m keeping Marshall at #7 over the Tigers, just because I like what returning talent they do have better than I like that of Memphis. So there.
The End.
If you actually read this entire convoluted mess of an article, show up to a Cougar baseball game, find me, and I will give you $20.
(Not really)