Shortly before Brazil's crushing 7-1 defeat to Germany at the 2014 World Cup, Goldman Sachs' econometric model had the South American team as by far the strongest favorite to lift football's greatest trophy. With that in mind, Jan Hatzius and his hooligans have unleashed their predictions for the 2016 European Football Championship, which has just got under way. Using historical performance data for each team (most importantly the Elo rating system, originally devised to rank chess players), they estimate the probability that a particular team will reach a particular round, up to and including the championship, and conclude that France (the hosts) is the most likely to win. Bet accordingly.
- The model says that France has a 23% probability of winning the trophy, followed by Germany at 20%, Spain at 14%, and England at 11%. Although Germany has the highest Elo rating, France is slightly favored because of its home advantage. After each day of play, we will re-run the model using updated historical performance data in order to generate new probabilities and a new modal forecast.
- How much faith should we have in these predictions? On the plus side, our approach carefully considers the stochastic nature of the tournament using statistical methods; also, the predictions are not far from bookmakers’ odds. On the minus side, the environment is “stochastic” indeed, i.e., football is quite an unpredictable game!
- That charming unpredictability was on full display two years ago, when our model failed to anticipate the elimination of heavyweights Spain and Italy in the group stage and gave Brazil a 48% probability of winning the trophy. More encouragingly, it identified three of the four semifinalists before the start of the tournament, and the fully updated version predicted the winner of every match in the knockout stage except for the 7-1 semifinal between Germany and Brazil.
Today we introduce our statistical model for predicting the outcome of the 2016 European Football Championship in France from June 10 to July 10.
How the Model Works
At a high level, our approach is as follows. First, we estimate a regression model to predict the number of goals scored by a particular team (“team i”) against a particular opponent (“team j”) using the entire history of mandatory international matches since 1958, when the first European championship was played (a total of 4,719 matches). Following the literature on predicting football matches, we assume that the number of goals scored by team i is described by a so-called Poisson distribution and explained by the following statistical factors:
- The difference in team performance as reflected in Elo ratings prior to the match. The Elo system was originally devised to rank chess players. It is a composite measure of national football team success that evolves depending on a team's results and the strength of its opponents.
- The number of goals scored by team i in the last 10 competitive matches.
- The number of goals conceded by team j in the last 2 competitive matches.
- A home dummy.
- A Euro Cup dummy to capture whether a team does systematically better at Euro Cups than in other competitive matches.
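The factors above can be combined into an expected goal count through a log-linear link, which is the usual form of a Poisson regression. The sketch below is illustrative only: the coefficient values and the names `COEF`, `expected_goals`, and `poisson_pmf` are hypothetical, and the note does not publish its estimates or state which link function it uses.

```python
import math

# Hypothetical coefficients for illustration only; the note does not publish its estimates.
COEF = {
    "intercept": 0.10,
    "elo_diff": 0.002,         # per Elo point of (team i minus team j)
    "goals_scored_i": 0.01,    # team i's goals in its last 10 competitive matches
    "goals_conceded_j": 0.05,  # team j's goals conceded in its last 2 competitive matches
    "home": 0.25,              # 1 if team i plays at home, else 0
    "euro": 0.05,              # 1 if the match is at a European Championship, else 0
}

def expected_goals(elo_i, elo_j, goals_scored_i, goals_conceded_j, home, euro, coef=COEF):
    """Poisson mean for goals scored by team i against team j (log link, assumed)."""
    linear = (
        coef["intercept"]
        + coef["elo_diff"] * (elo_i - elo_j)
        + coef["goals_scored_i"] * goals_scored_i
        + coef["goals_conceded_j"] * goals_conceded_j
        + coef["home"] * home
        + coef["euro"] * euro
    )
    return math.exp(linear)

def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)
```

Calling `expected_goals` once for team i against team j and once with the roles swapped gives the two goal rates needed to describe a single match.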
Second, we use these regression estimates and our assumed Poisson distribution in a Monte Carlo simulation with 100,000 draws to generate a distribution of outcomes for each of the 51 matches, from the opener between France and Romania on June 10 to the final on July 10. We use the rounded prediction of the goals scored to determine the outcome of each match during the group stage and the unrounded prediction to pick the winner in the knockout stage.
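As a rough illustration of the Monte Carlo step, the sketch below draws goal counts for both teams from Poisson distributions and tallies the outcomes. It assumes the two teams' goal counts are independent, which the note does not state, and the function names and the goal rates in the usage example are hypothetical. It also does not reproduce the note's rounded versus unrounded rule for picking winners.

```python
import math
import random

def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k = 0
    product = rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

def simulate_match(lam_i, lam_j, draws=100_000, seed=0):
    """Estimate win/draw/loss shares for team i from independent Poisson goal draws."""
    rng = random.Random(seed)
    tally = {"win_i": 0, "draw": 0, "win_j": 0}
    for _ in range(draws):
        goals_i = sample_poisson(lam_i, rng)
        goals_j = sample_poisson(lam_j, rng)
        if goals_i > goals_j:
            tally["win_i"] += 1
        elif goals_i < goals_j:
            tally["win_j"] += 1
        else:
            tally["draw"] += 1
    return {outcome: count / draws for outcome, count in tally.items()}

if __name__ == "__main__":
    # Hypothetical goal rates, not taken from the note.
    print(simulate_match(1.8, 0.9, draws=20_000))
```

Fixing the seed makes a run reproducible, which is convenient when the simulation is re-run after each day of play with updated inputs.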
Third, we use the estimation results to generate both a set of probabilities that a particular team reaches a particular stage of the tournament, up to and including the championship, and a modal—that is, single most likely—forecast for the outcome of each match, which we then run forward through the tournament until the final.
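To make the two kinds of output concrete, here is a toy four-team knockout bracket that produces both championship probabilities (by counting simulated winners) and a modal forecast (by advancing the favorite of every match). Everything in it is an assumption for illustration: the teams, ratings, and bracket are made up, and match winners are drawn from the standard Elo expected-score formula rather than from the Poisson goal model the note uses.

```python
import random
from collections import Counter

# Hypothetical ratings and bracket; not the note's data.
RATINGS = {"A": 2000, "B": 1950, "C": 1900, "D": 1850}
SEMIFINALS = [("A", "D"), ("B", "C")]

def win_prob(rating_i, rating_j):
    """Standard Elo expected score of team i against team j."""
    return 1.0 / (1.0 + 10 ** ((rating_j - rating_i) / 400))

def play(team_1, team_2, rng):
    """Draw a random winner of a single knockout match."""
    p = win_prob(RATINGS[team_1], RATINGS[team_2])
    return team_1 if rng.random() < p else team_2

def champion_probabilities(n=20_000, seed=0):
    """Share of simulated tournaments won by each team."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n):
        finalist_1 = play(*SEMIFINALS[0], rng)
        finalist_2 = play(*SEMIFINALS[1], rng)
        counts[play(finalist_1, finalist_2, rng)] += 1
    return {team: counts[team] / n for team in RATINGS}

def modal_champion():
    """Advance the more likely winner of every match through the bracket."""
    def favorite(team_1, team_2):
        return team_1 if win_prob(RATINGS[team_1], RATINGS[team_2]) >= 0.5 else team_2
    return favorite(favorite(*SEMIFINALS[0]), favorite(*SEMIFINALS[1]))
```

The probabilities average over every path through the bracket, while the modal forecast follows only one path, so the two views need not rank teams in the same order.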
A Summary of the Predictions
Our probabilities are shown in Exhibit 1. The model says that France has a 23% probability of winning the trophy, followed by Germany at 20%, Spain at 14%, and England at 11%. Although Germany has the highest Elo rating, France is favored because of its home advantage.
Exhibits 2 and 3 provide a different perspective by showing the modal prediction for the entire tournament. There are some interesting contrasts with the probabilities in Exhibit 1. For example, Exhibit 1 says that Germany is more likely than Spain to win the tournament because it is more likely to succeed across the entire range of possible tournament configurations.
But Exhibit 3 says that in the single most likely case, Spain beats England in Semifinal 1, France beats Germany in Semifinal 2, and France then wins the final—i.e., Germany finishes behind Spain.
Which approach is better, the probabilistic one in Exhibit 1 or the modal one in Exhibits 2 and 3? A modal forecast does have the advantage of being more “crisp.” The sentence “Goldman Sachs says France will win” has a better ring to it than “Goldman Sachs says France has a 23% probability of winning, with Germany close behind.” Nevertheless, we think that a probabilistic approach is more useful—for predicting the outcome of football tournaments and, increasingly, for our day-to-day work on economic forecasting.
Exhibit 4 provides more insight into the results by breaking down the probabilities of winning for the top four teams in a “waterfall chart” format. It shows that the most important factor is the Elo score, followed by home advantage and the Euro Cup dummy. The chart illustrates that the front-runner position for France derives largely from its home advantage, as its Elo rating is well below Germany’s and also a bit below Spain’s. Meanwhile, Germany benefits from the Euro Cup dummy, which picks up its historically strong tournament performance.
How Confident Can We Be?
It is difficult to assess how much faith one should have in these predictions. On the plus side, our approach carefully considers the stochastic nature of the tournament using statistical methods, and we do think that the Elo rating, the most important input into our analysis, is a compelling summary of a team’s track record. On the minus side, we ignore a number of potentially important factors that are difficult to summarize statistically, including the quality of the individual players unless it is reflected in the team’s recent track record. And there is no room for human judgment (which may not be such a bad thing given that none of us are really football experts but some are enthusiastic Germany supporters).
One useful cross-check is to compare our results with bookmakers’ odds. Exhibit 5 plots our estimated championship probability against the average probability implied by the odds offered by five different bookmakers. The basic result is clear. Even though our model does not include bookmakers’ odds in any way, the probabilities are quite similar. A possible reason is that professional betting firms use many of the same inputs—such as Elo ratings—in their analysis and that they process the information in ways that are ultimately similar to ours.
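For readers who want to repeat this kind of cross-check, the sketch below converts decimal odds into implied probabilities and averages them across bookmakers. The function names are hypothetical, and rescaling each bookmaker's probabilities to sum to one (to strip out the bookmaker's margin) is an assumption; the note does not say how its implied probabilities were computed.

```python
def implied_probabilities(decimal_odds):
    """Convert one bookmaker's decimal odds into probabilities that sum to 1."""
    raw = {team: 1.0 / odds for team, odds in decimal_odds.items()}
    total = sum(raw.values())  # exceeds 1 when the bookmaker builds in a margin
    return {team: p / total for team, p in raw.items()}

def average_implied(books):
    """Average the normalized implied probabilities across several bookmakers."""
    per_book = [implied_probabilities(book) for book in books]
    teams = per_book[0].keys()
    return {team: sum(p[team] for p in per_book) / len(per_book) for team in teams}
```

The averaged values can then be plotted against the model's championship probabilities, team by team, in the manner of Exhibit 5.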
* * *
May the best team win, and let’s hope that watching Euro 2016 is as much fun as writing this article was!
Source: Goldman Sachs