Ok, I compiled the data from this year on all dogs at +180 or higher. I made a few assumptions when compiling the data which I don't think are too far off. Since the site I used only listed the favorite line, I assumed a 10-cent line for all favs under 200, a 20-cent line for all favs in the 200-290 range, and a 30-cent line for all favs 300 or better. So, with that, here is what we got. My conclusions are at the bottom.
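If anyone wants to replicate that assumption, here's a quick Python sketch of how I'd turn a favorite's line into the dog's price. The function name and the code itself are mine, just encoding the 10/20/30-cent assumption above, not the book's actual lines:

```python
def dog_line(fav_line):
    """Estimate the dog's price from the favorite's line, assuming a
    10-cent spread under 200, 20 cents from 200 to 290, and 30 cents
    at 300 or better (my assumption, not the book's real lines)."""
    fav = abs(fav_line)          # e.g. -210 -> 210
    if fav < 200:
        spread = 10
    elif fav <= 290:
        spread = 20
    else:
        spread = 30
    return fav - spread          # e.g. 210 - 20 = +190 dog

print(dog_line(-210))  # 190, i.e. a +190 dog under the 20-cent assumption
```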
Overall, the big dogs were 64-127 (33.5%) with a net total of -1.05 units.
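Quick note on the unit math, since it drives all the totals below: a flat 1-unit bet on a dog pays line/100 units on a win and costs 1 unit on a loss. A minimal sketch (the sample results are made up, not from the dataset):

```python
# Each result is (dog_line, won?). Flat 1-unit bets on every dog.
results = [(190, True), (215, False), (250, False)]  # made-up sample

net = sum(line / 100 if won else -1 for line, won in results)
print(f"{net:+.2f} units")   # +1.90 - 1 - 1 = -0.10 units
```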
On days where only two big dogs were listed, I broke it down into two cases.
Case 1: Given that the first game WON, how often did the second game also WIN? This occurred 2 times out of 10 (20%), for a net total of -4.40 units.
Case 2: Given that the first game LOST, how often did the second game WIN? This occurred an outstanding 12 times out of 25 (48%), for a net total of +13.00 units!!
For the days where exactly 3 big dogs were listed, again two cases.
Case 1: Given that the first two games both LOST, how often did the 3rd game WIN? 1 in 4 (25%), for a net total of -2.20 units.
Case 2: Given that the first two games both WON, how often did the 3rd game WIN? 0 for 2 (0%), for a net of -2.00 units.
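If you want to check these conditional tallies, here's how I'd compute them from a day-by-day log (the data layout and sample days are my own invention for illustration; two-dog days shown, and the three-dog cases work the same way):

```python
# Each entry is one day with exactly two big dogs:
# ((line1, won1), (line2, won2)). Sample days are made up.
two_dog_days = [
    ((190, False), (210, True)),
    ((185, True),  (230, False)),
]

def case_record(days, first_won):
    """Record and net units on game 2, conditioned on game 1's result."""
    wins = losses = 0
    net = 0.0
    for (_, won1), (line2, won2) in days:
        if won1 != first_won:
            continue                 # day doesn't match this case
        if won2:
            wins += 1
            net += line2 / 100       # dog win pays line/100 units
        else:
            losses += 1
            net -= 1                 # flat 1-unit loss
    return wins, losses, net

print(case_record(two_dog_days, first_won=True))   # Case 1
print(case_record(two_dog_days, first_won=False))  # Case 2
```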
Now for the more statistical analysis. I broke up the big dogs into 4 categories based on odds. The categories (ranges) are 180-200, 205-220, 225-240, and 245 and up. I needed a minimum number of games in each range, which is why the last range encompasses so much; there just weren't that many big big dogs.
What I did with these categories was list the record of the dogs in each category and compare that to what you would "expect" to see from that category, where "expect" means the break-even win rate implied by the line. For example, on average, a dog in the 180-200 range is expected to win about 34.4% of the time. So if there were 100 games that fell in that category, you would expect to see 34 or 35 wins.
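To be clear on where that number comes from: a +190 dog risks 1 unit to win 1.90, so it breaks even winning 100/290 of the time. A sketch (breakeven_prob is just my name for it; the 34.4% figure is roughly this rate averaged over the lines that fall in the 180-200 range):

```python
def breakeven_prob(dog_line):
    """Win rate at which a flat bet on a +line dog breaks even."""
    return 100 / (dog_line + 100)

print(f"{breakeven_prob(190):.1%}")  # 34.5% for a +190 dog
```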
From these categories, I applied a "goodness-of-fit" test to see if the data matched what was expected to happen. Here is what you get....
Category 1: 180-200 range. There were 136 games. You expect to see 47 winners; there were actually 53.
Category 2: 205-220 range. There were 25 games. You expect to see 8 winners; there were only 2.
Category 3: 225-240 range. There were 16 games. You expect to see 5 winners; there were 5 winners.
Category 4: 245-up range. There were 14 games. You expect to see 4 winners; there were 4 winners.
Applying the "goodness-of-fit" statistic to this data with 4 categories, the resulting statistic showed that the data fits my initial assumption that the lines are accurate. The data in category 2 was a little abnormal, but not enough to throw off the entire set. That is to be expected; not all data falls in the center of the curve.
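For anyone who wants to check the test, it's a standard chi-square goodness-of-fit against the break-even rates. Here's a sketch using the category counts above (the expected winners are the rounded numbers I listed, so treat the output as approximate):

```python
from scipy.stats import chi2

games  = [136, 25, 16, 14]   # games per category
wins   = [53, 2, 5, 4]       # observed winners
expect = [47, 8, 5, 4]       # expected winners implied by the lines

chi_sq = 0.0
for n, w, e in zip(games, wins, expect):
    # Each category contributes a win cell and a loss cell.
    chi_sq += (w - e) ** 2 / e + ((n - w) - (n - e)) ** 2 / (n - e)

df = len(games)  # 4 binomial categories with probabilities fixed by the lines
print(f"chi-square = {chi_sq:.2f}, p = {chi2.sf(chi_sq, df):.3f}")
```

With these counts the statistic comes out around 7.8 on 4 degrees of freedom, under the ~9.5 critical value at the 5% level, which is why category 2's miss isn't enough to reject the fit.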
So, my conclusions are thus.....
1. Vegas knows what they are doing when they set lines. I limited my experiment to the big dogs because there were entirely too many games to include small and mid-range dogs. But I think you can fairly assume that if the big dog data fits the model, then the other games would fit as well, especially since there are more data points out there.
2. Given that Vegas sets pretty damn good lines in baseball, I think it makes it very difficult to find "value". It's out there, but it should be obvious from the above that it isn't easy to find when the whole data set fits what you expect to see.
3. Lastly, given all this, I think it makes the work of guys like Nick, Nolan, Raymond, and Fletcher even more impressive when you consider that Vegas is setting pretty good lines and these guys are still finding the outliers in the data - those points that stray from the norm - and finding them consistently, since they seem to win consistently. You aren't going to win money at baseball by guessing and not studying. I know there are others who win consistently, but these names just came off the top of my head. I don't want to offend anyone I may have excluded. You all know who the good cappers are.
I don't bet much baseball and now I know why. It's too hard.