Analysing and predicting Schlag den Raab

16 12 2012

I’m a big fan of the TV show Schlag den Raab. Last night we got a new record: a 3.5 million euro prize for the challenger. There have been 38 episodes so far, so I think it’s ripe for some statistics. Some data is available from Wikipedia as a wiki table, which I munged and cleaned with Google Refine and then exported to a CSV file. It’s interesting to look at statistics on the winners, e.g. professional group, gender and age, just from this raw data. For example, there has been no female winner since the beginning of the show, and 6 out of the 13 winners were 30 years or older.
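As a rough sketch of how such winner statistics could be pulled out of the CSV, here is a small Python example using only the standard library. The column names (`winner`, `candidate_age`, and so on) and the four sample rows are made up for illustration; they are not the real show data or the actual layout of my file.

```python
import csv
import io
from collections import Counter

# Made-up placeholder rows, NOT real show data; the column names are assumptions.
SAMPLE_CSV = """episode,winner,candidate_age,candidate_gender,candidate_occupation
1,raab,28,m,student
2,candidate,31,m,engineer
3,raab,25,f,teacher
4,candidate,29,m,student
"""


def winner_stats(csv_text):
    """Summarise the episodes that the candidate won."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    wins = [r for r in rows if r["winner"] == "candidate"]
    return {
        "episodes": len(rows),
        "candidate_wins": len(wins),
        # Number of winning candidates aged 30 or older.
        "wins_age_30_plus": sum(int(r["candidate_age"]) >= 30 for r in wins),
        "wins_by_gender": dict(Counter(r["candidate_gender"] for r in wins)),
        "wins_by_occupation": dict(Counter(r["candidate_occupation"] for r in wins)),
    }


stats = winner_stats(SAMPLE_CSV)
print(stats)
```

With the real file, the `io.StringIO` wrapper would simply be replaced by an opened file handle.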

I will keep collecting data as the show continues, perhaps devising some prediction methods/models along the way, probably good-old-but-reliable support vector machines or something of that sort. So as not to take the fun out of the show, it’s not going to be a full prediction with a “cutoff” before an episode. I’m thinking more of a prediction that is continuously updated during an episode as more and more information is revealed, gaining more and more certainty along the way.

For example, the candidate’s occupation could have some influence, but it only becomes known about 30 minutes into the episode. Some challenges are played in one form or another in every episode (e.g. “Blamieren oder Kassieren”, challenges involving car driving skills, certain types of sports, etc.). It would also be interesting to get some statistics on these, e.g. Raab almost always wins Blamieren oder Kassieren. Some data is available from this site, but it’s not complete: results from earlier episodes are missing.
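A per-challenge statistic like this could be computed roughly as follows. The challenge labels and results below are invented placeholders, and the add-one (Laplace) smoothing is my own assumption, added so that a challenge type seen only once or twice doesn’t end up with a win rate of exactly 0 or 1.

```python
from collections import defaultdict

# Hypothetical (challenge_type, winner) records; placeholder values, not real results.
RESULTS = [
    ("quiz", "raab"), ("quiz", "raab"), ("quiz", "candidate"),
    ("driving", "raab"),
    ("sport", "candidate"), ("sport", "candidate"),
]


def raab_win_rates(results, alpha=1.0):
    """Smoothed share of games won by Raab, per challenge type."""
    wins = defaultdict(int)
    total = defaultdict(int)
    for challenge, winner in results:
        total[challenge] += 1
        if winner == "raab":
            wins[challenge] += 1
    # Add `alpha` pseudo-wins for each of the two players (Laplace smoothing).
    return {c: (wins[c] + alpha) / (total[c] + 2 * alpha) for c in total}


rates = raab_win_rates(RESULTS)
for challenge, rate in sorted(rates.items()):
    print(f"{challenge}: {rate:.2f}")
```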

The challenges played in an episode can also be grouped into a set, which can be used as one “feature” of the prediction model. E.g. if an episode contains challenges at which Raab is extremely good, then it is very likely that he will win the episode. Again, full knowledge of this set is not available before the episode, so the prediction would have to be updated as the episode progresses. It might be possible to set a cutoff at some point in the episode, e.g. once the predicted probability that either Raab or the candidate will win exceeds some threshold.
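To make the idea of an updating prediction with a cutoff a bit more concrete, here is a minimal sketch. It is not a support vector machine; it uses a much simpler naive-Bayes-style log-odds update and assumes that each piece of evidence (occupation revealed, a challenge type announced, etc.) is independent and comes with a likelihood ratio. The function name, the prior, the ratios and the 0.9 threshold are all hypothetical values chosen for illustration.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def sigmoid(x):
    return 1 / (1 + math.exp(-x))


def running_prediction(prior_raab, likelihood_ratios, cutoff=0.9):
    """Update P(Raab wins) after each piece of evidence; stop at the cutoff.

    Each ratio is P(observation | Raab wins) / P(observation | candidate wins),
    so ratios above 1 favour Raab and ratios below 1 favour the candidate.
    Returns a list of (step, probability, decided) tuples.
    """
    log_odds = logit(prior_raab)
    history = []
    for step, ratio in enumerate(likelihood_ratios, start=1):
        log_odds += math.log(ratio)
        p = sigmoid(log_odds)
        # "Decided" once either player is predicted with enough certainty.
        decided = p >= cutoff or p <= 1 - cutoff
        history.append((step, p, decided))
        if decided:
            break
    return history


# Hypothetical evidence stream: every observation favours Raab.
history = running_prediction(0.5, [2.0, 3.0, 2.0, 4.0])
for step, p, decided in history:
    print(step, round(p, 3), decided)
```

In this toy run the odds go from 1:1 to 2:1, 6:1 and then 12:1, so the prediction crosses the 0.9 cutoff after the third observation and the fourth is never needed.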

Let’s see... 🙂




