Wednesday, April 3, 2013

Using my March Madness Formula to Predict the 2013 MLB Season: Part 1, An Introduction to the System




 by Cody Kay

Busy living the American Dream of searching for a job in this awesome economy, I have neglected my baseball blogging, but my main man Bryce Harper and his two Opening Day home runs have inspired me to get back to work. It makes sense to start with season predictions, so here we go. I am going to split this up into three parts: an introduction to how I am making my predictions this year, my AL predictions, and then my NL predictions. So if Bryce has hit 5 home runs by the time the NL predictions come out and the Nationals are undefeated, I promise that, regardless of when the posts are published, the predictions were not affected by the early season results.

Before the start of March Madness, I mused that I have consistently placed in the top 5-10 in bracket pools (11 out of 13 years) and, lo and behold, the tournament is down to the Final Four and I am again leading the pool I am in with my bracket titled “The Fightin’ Blue Waffles” (99th percentile). I have a basic formula that usually works and can be broken into 4 steps.


1)      I identify the 2-3 strongest #1 seeds and put them into the Final Four and likely title game
·         These were Louisville and Indiana for me this year
2)      I identify at least one team seeded #2-6 that, if everything went right, could beat any team in the tournament, and I put that team in the Final Four
·         Michigan on multiple occasions looked like the best offensive team I watched all year; Ohio State had a bruising defense and a sure offensive scorer
·         These two categories make up my Final Four picks. I typically choose a national champion from the first category, but every now and then a team from the second category is worth choosing. My most successful use of this was the Carmelo Anthony-led Syracuse team, and the Kemba Walker-led Connecticut team is a great recent example.
3)      When a matchup isn’t expected to be a near coin flip, I always pick the favorite unless there is at least one clear, strong, rational reason to pick the upset
·         The two best examples from this year…
          i.    UCLA v Minnesota – I picked Minnesota because they were known to be strong rebounders, while UCLA was a poor rebounding team that also had terrible chemistry issues
          ii.   UNLV v Cal – the game was taking place near Cal’s campus, and home court is a big advantage in college basketball
4)      Finally, I always allow myself to use some quality unabashed bias as long as I can make a logical justification for it
·         I hate Duke and never pick them to upset a better-ranked team. In the last 14 years, they have rarely pulled off upsets, so this really hasn’t hurt me.
·         Additionally, I am biased toward the teams that finished near the top of the Big Ten and whatever I consider to be the top conference that year. Again, this rarely hurts because these are going to be top seeds, and upsets of top seeds rarely happen
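
For the programmers in the audience, steps 3 and 4 can be roughly sketched in Python. This is only my loose translation of the rules above, not a real model: the function name, the 5-point coin-flip margin, and the win probabilities in the usage lines are all made-up assumptions for illustration.

```python
# A rough sketch of steps 3 and 4. Everything named here is hypothetical:
# the coin-flip margin and any win probabilities are illustrative guesses.
COIN_FLIP_MARGIN = 0.05  # assumed: within 5 points of 50/50 counts as a coin flip


def pick_winner(favorite, underdog, favorite_win_prob,
                upset_reasons=(), never_pick_as_upset=()):
    """Return the pick, or None when the game is left to a gut check."""
    # Near coin flips fall outside the formula entirely.
    if abs(favorite_win_prob - 0.5) <= COIN_FLIP_MARGIN:
        return None
    # Step 3: take the upset only with at least one clear, rational reason.
    # Step 4: allow a justified bias to veto certain underdogs.
    if upset_reasons and underdog not in never_pick_as_upset:
        return underdog
    return favorite


# Usage with this year's examples (the probabilities are invented):
print(pick_winner("UCLA", "Minnesota", 0.60, ["rebounding edge", "chemistry issues"]))
print(pick_winner("UNLV", "Cal", 0.62, ["game played near Cal's campus"]))
```

The point of returning `None` for near coin flips is that the formula deliberately does not decide those games; they are left for the gut check picks.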

At this point, I am left to pick between close matchups, where I use what I call “gut check” picks that are a matter of luck and skill. This formula isn’t overly clever, and the overriding theme in all four steps is to use logical, rational thinking. It is extremely easy to follow: the first category requires picking 2-3 of the four #1 seeds and putting them into the Final Four, which takes care of 50-75% of the Final Four picks, where the big points are scored. However, it does hinge greatly on being able to properly identify the teams in the first two categories. This year, my method got me two correct Final Four picks, which, while not as good as other years, still made me one of only two people in my 18-team bracket pool to do so.

Now, a lot more goes into picking a successful bracket than just this, but this blog is about baseball, and now is the time to use those same methods to make predictions for the MLB season. The plan is therefore to make sure at least 50-75% of my division winners are the favorites, similar to my first category of finding and picking the couple of most dominant teams. The rest of my playoff spots are filled with teams ranked near the top that flash enough upside to be worth the risk.

A quick preview of my future picks has me picking 4 out of the 6 favorites to win their division. The wildcard picks add one of the other favorites and three other teams that are preseason top 6 to win their respective pennant. Additionally, I am taking 50% of the favorites to win the MVP and Cy Young races but as I will discuss tomorrow in my AL preview, I can’t believe that one of my picks isn’t the favorite and I am fairly surprised that my other pick for AL award winner is the favorite.

For references to pre-season favorites, I am using:
·                     Bovada’s recently posted odds on likelihood to win the division 
·                     Fangraphs’ early season over/unders to discuss win expectations
·                     Oddsshark’s props for awards favorites 

Since the time of the Fangraphs article, some of the lines have changed due to things such as injuries (the under on 86.5 wins for the Yankees looks quite tasty in retrospect) but most have stayed more or less the same and should serve as a good baseline. For the very interested reader, you can scroll down that article and see that an insightful (and likely awesome/handsome) commenter named “cody k” made the comment “in order of confidence my 3 would be: Cubs over, Pirates under, D-backs over” so you can take that as a bonus prediction made back on February 15th.

To close my introduction, I will explain how my love of sabermetrics is where my third category preference for logical rationales overlaps with my fourth category allowance for bias. To put it simply, I am biased toward statistical analysis.

Like I said, I am a big believer in the sabermetric analysis of baseball, and whenever I am not going with a favorite, it will be because of a logical rationale that comes from sabermetric analysis. Too often the media tries to frame sabermetric analysis as contrarian, but oftentimes a statistical analysis just confirms the common thought. This year is no different, with Fangraphs’ preseason WAR projections telling us that Detroit is expected to run away with the AL Central while Marlins and Astros fans are in for a long season, just like the betting odds given by Bovada tell us.

Since the betting lines are fairly in line with most sabermetric projections and I am biased toward said projections, my baseball predictions for the 2013 season are going to look a lot like those of the guy who picks all “chalk” when filling out a bracket. Nevertheless, beyond simple WAR expectations, I will identify some sabermetric factors that I am willing to be more biased toward than most:
1)      Teams that have players that are consistent
2)      Teams that don’t have a high degree of injury concerns
3)      Teams that have a strong mix of talent across all positions with decent depth, and
4)      Teams that have a majority of players in peak or pre-peak aged seasons, especially the players that have the highest expectations (my strongest bias). A typical player’s talent grows steadily from 20-25, peaks in MLB at 26-28, holds on through a light decline for a year or two, and then drops pretty steeply. An excellent introduction to this concept was written by the Hardball Times.
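
To make that fourth bias a little more concrete, here is a minimal Python sketch of the aging rule of thumb. It is just my shorthand, not anything from the Hardball Times piece: the function names are made up, I am assuming the "year or two" of light decline means ages 29-30, and the roster numbers in the usage lines are invented.

```python
# A minimal sketch of the aging rule of thumb described above.
# Assumptions (mine, for illustration): light decline covers ages 29-30,
# and anyone younger than 20 is lumped in with the pre-peak group.
def age_phase(age):
    """Bucket a player's age into a rough career phase."""
    if age <= 25:
        return "pre-peak"
    if age <= 28:
        return "peak"
    if age <= 30:
        return "light decline"
    return "steep decline"


def peak_or_prepeak_share(roster):
    """Share of a roster's expectations carried by peak or pre-peak players.

    roster is a list of (age, expectation) pairs, where expectation is any
    non-negative weight such as a projected WAR (hypothetical input).
    """
    total = sum(weight for _, weight in roster)
    if total == 0:
        return 0.0
    young = sum(weight for age, weight in roster
                if age_phase(age) in ("pre-peak", "peak"))
    return young / total


# Usage with an invented three-player roster:
roster = [(20, 5.0), (27, 3.0), (33, 2.0)]
print(peak_or_prepeak_share(roster))
```

Weighting by expectation rather than counting heads reflects the "especially the players that have the highest expectations" part of the bias: an aging star matters more than an aging bench bat.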

So there you have it: my attempt to explain and justify, in a cool way, that basically all of my predictions are going to be for the boring favorites to succeed, with the exception of the bold prediction that a certain 20-year-old baseball player from the Nationals will win NL MVP (which, given the current betting odds, isn’t as bold as it should be).
