How Twelve’s algorithm understood the World Cup


It has been an amazing World Cup.

And for us at Twelve, it was more than ‘just’ the football. It was a chance for us to test the algorithm we have been developing for the past year.

The World Cup as a test set

Football is a difficult game to quantify. Indeed, many TV pundits and journalists claim that numbers can’t do justice when assessing player performance. While we accept that it isn’t easy, we don’t see why it isn’t possible. We believe that statistics give an extra edge in understanding football and, when used properly, can give just as much insight as a human expert.

The key to achieving insight is using the right numbers. And that is what we aim to do with Twelve. As data scientists would describe it, the World Cup has been a ‘test’ data set for us. We fitted (or trained) our model on club football, and now we have tested how well the model fits data from the World Cup.

It was a tough test, because we did it live via our rankings pages and our match app. The aim was to see how well the model rankings captured the assessments of our users and those in the media.

Now that the tournament is over, I’ll explain where we have succeeded and where we have more work to do.

Why Luka Modric won the Golden Ball

First, let’s look at how the algorithm ranked the players in this World Cup. Below, I run the Twelve algorithm on every player in every game throughout the tournament, and the winner is clear: Luka Modric.

World Cup

Modric emerged at the top of our tournament leaderboard after the quarter-final match against Russia. Despite receiving -750 points for a saved penalty, he dominated the game defensively and in attack.

A point that needs to be emphasised here is that Modric scored only once at the World Cup during open play (against Argentina) and provided only one assist (a corner against Russia). But these two particular numbers say very little about why he is so good. Instead, it was his work in midfield that made him valuable. And it is this which the Twelve algorithm picks up on. The visualisation above shows how Modric won the midfield battle against Russia.

This is the key idea behind our algorithm. We have trained it on data from millions of passes, dribbles and defensive actions, and determined how well actions contribute to goal chances. So when it ‘sees’ Modric’s passes, the algorithm ‘knows’ that these are worth a lot of points. Similarly, the experts who judged the Golden Ball have seen lots of games of football and use their expertise to pick out the best player. In this case, the algorithm and the experts reach the same conclusion.

Looking at the other players in the top 10, we see a list that more or less reflects the narrative of the World Cup. Rakitic partners Modric as number two. Hazard (Silver Ball winner) and De Bruyne make the top 10 for Belgium. John Stones is the highest-ranking defender, reflecting England’s success at the back for most of the tournament. Varane, Griezmann (Bronze Ball winner) and Pogba are 6th, 7th and 8th respectively, showing that France’s success was built on teamwork throughout defence, midfield and attack.

Eden Hazard and Toni Kroos win per minute played.

Toni Kroos has a lot to be disappointed about. His team was knocked out at the group stage (finishing bottom of their group), yet when we look at points per minute played, we see that he tops our rankings.

A dashboard of Kroos’s three games shows that he put ball after ball into the box, but to no avail.

Germany failed to score against Mexico and South Korea, and Kroos’s winning goal against Sweden was not enough to get them to the next stage.

A couple of other surprises in the top 10 are Yerry Mina and Denis Cheryshev. Both of them were on the pitch for just over 300 minutes and scored 3 and 5 goals, respectively. Otherwise, the top 10 accurately reflects the players who gained recognition for their performances. Eden Hazard is ranked highest out of players who got to the semi-final, and was awarded the Silver Ball as the tournament’s second-best player.

Further down the list we see Brazil’s Neymar, Spain’s Isco and Sweden’s Andreas Granqvist. Harry Kane is also represented, because of his seven goals (we awarded points for goals in penalty shootouts).
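To make the per-minute idea concrete, here is a minimal sketch of how a total-points ranking can be turned into a per-90-minutes one. The names, the point totals and the 270-minute eligibility cut-off are all made up for illustration; they are not the values or thresholds we use at Twelve.

```python
from dataclasses import dataclass


@dataclass
class PlayerTotal:
    name: str
    points: float  # total model points over the tournament (made-up values)
    minutes: int   # total minutes played


def per_90(player: PlayerTotal) -> float:
    """Scale a player's total points to a 90-minute match."""
    return 90.0 * player.points / player.minutes


def rank_per_minute(players, min_minutes=270):
    """Rank by points per 90, skipping players below an assumed minutes cut-off."""
    eligible = [p for p in players if p.minutes >= min_minutes]
    return sorted(eligible, key=per_90, reverse=True)


# Hypothetical players, for illustration only.
players = [
    PlayerTotal("Player A", points=9000, minutes=690),
    PlayerTotal("Player B", points=4500, minutes=270),
    PlayerTotal("Player C", points=800, minutes=45),
]
ranking = rank_per_minute(players)
for p in ranking:
    print(f"{p.name}: {per_90(p):.0f} points per 90")
```

A cut-off of some kind matters because a substitute with one good cameo would otherwise top the list. It also explains how a player who went out after three group games can still lead a per-minute table.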

When reviewing these rankings, we do need to consider the quality of the opposition each player faced. I don’t claim that stats tell us everything. For us at Twelve, the important point is that both the total top 10 and the per-minute top 10 provide lists which more or less agree with the views of the pundits, journalists and experts who saw the games. In Brazil they write about Neymar, in Spain they write about Isco and in Sweden they write about Granqvist. The Golden, Silver and Bronze Ball winners top our lists, and the Golden Boot winner, Kane, also features.

Again, the point to emphasise is that the World Cup is our test set. While we eliminated a few bugs in our rankings, we essentially kept the same system as we have been using throughout the previous Premier League season and ran it on the World Cup. And the rankings make sense.

Why England lose

One of the major advantages of the Twelve approach is that the visualisations allow us to understand why players and teams have performed well or badly. This is what our writers have done throughout the tournament. Some of the highlights have been: Gustavo Fogaca’s look at France’s counterattacks; Marvio dos Anjos’s study of the luck which carried Belgium past Brazil; Ahmad Yousef’s description of Egypt’s humiliation by Saudi Arabia; Milos Markovic’s telling of the end of the Russian fairytale; Roy Nemer’s characterisation of Messi as a false striker; Solomon Fowowe’s celebration of Musa in Nigeria’s win over Iceland; and Nathan Clark’s look at Pochettino’s influence on England. That is not to mention Andrew Beasley’s insightful daily round-ups and tweets about the games.

The two teams I followed most closely were England and Sweden. Very early in the tournament, Twelve identified Kieran Trippier and Harry Maguire as two of England’s most important players. Trippier was leading the England attack, providing ball after ball into the box.

Maguire was closing things down at the back and powering up the middle to provide attacking passes and, of course, danger at set pieces.

Where England were weak, however, was in midfield. In the attack rating, which covers passes and dribbles, none of England’s midfielders made the top 5 for their team.

Dele Alli didn’t even make the top 10, nor did Raheem Sterling. While the three other semi-finalists each had one or more strong and creative players in midfield, England were lacking a De Bruyne, a Kante/Pogba or a Modric/Rakitic. In his pre-match review, Ivan Zezelj was spot on when he said it was in midfield that the semi-final could be won by Croatia. It wasn’t until after the loss that we saw mainstream journalists start to pick apart England’s midfield.

Maybe England suffered by not having Oxlade-Chamberlain available? In any case, it wasn’t to be for England this time around, and it is here that they will be looking to improve before the European Championships.

The Kante effect

One of the problems seen in many other statistical models of on-the-ball data is that they fail to give enough credit to defensive midfielders. Midfield players like N’Golo Kante are highly valued, not only for the interceptions they make, but also for the way they put the opposition under pressure. But, up to now, there has been no statistic which measures why they are successful.

We have worked hard to address this problem and think we have found a (partial) solution. Our off-the-ball metric gives points to players based on where the opposition loses the ball. It only works in the middle third of the pitch, but it measures how successful a team are at pressing the opposition there.

And it was nice to see that N’Golo Kante topped this measure for most of the tournament.

If you press on ‘all actions’ you will see how Kante was awarded points as the opposition lost the ball. I write ‘most of the tournament’ because it was only in the last match, the final, that Kante lost the midfield battle and was replaced at the top of our rankings by Domagoj Vida.

The algorithm by no means fully captures positional play, but it does give defensive midfielders some well-deserved credit.
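As a rough illustration of the off-the-ball idea, the sketch below adds up points for opposition losses of the ball that happen in the middle third. It makes several assumptions that are not part of our actual model: a pitch coordinate running from 0 to 100 along its length, a flat (hypothetical) points value per turnover, and a rule that simply credits whichever player the data names alongside the turnover.

```python
from collections import defaultdict

PITCH_LENGTH = 100.0         # assumed coordinate system: x runs 0..100
POINTS_PER_TURNOVER = 50.0   # hypothetical flat value, not Twelve's weighting


def in_middle_third(x: float) -> bool:
    """True if x lies in the middle third of the pitch length."""
    return PITCH_LENGTH / 3 <= x <= 2 * PITCH_LENGTH / 3


def off_the_ball_points(turnovers):
    """Sum points per player for opposition turnovers in the middle third.

    Each turnover is a pair (x, credited_player).
    """
    totals = defaultdict(float)
    for x, player in turnovers:
        if in_middle_third(x):
            totals[player] += POINTS_PER_TURNOVER
    return dict(totals)


# Made-up turnovers, for illustration only.
turnovers = [
    (50.0, "Midfielder A"),
    (40.0, "Midfielder A"),
    (10.0, "Defender B"),    # outside the middle third, so no points
    (60.0, "Midfielder C"),
]
totals = off_the_ball_points(turnovers)
print(totals)
```

Restricting the measure to the middle third is what keeps it from rewarding ordinary defending near a team's own box.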

How does Twelve compare to other statistics?

We have simplified our measurement of performance to four basic statistics: attack, defence, off-the-ball and shots/goals. This is a lot fewer numbers than we sometimes see used in assessing players. Across social media and in TV coverage we see stats for, to name just a few, successful dribbles, passes completed, shot conversion, expected goals, expected assists, through balls, interceptions and so on. These are often presented in the form of a ‘player radar’ that summarises the stats in comparison to other players or as a tweet listing ‘key statistics’.

What our ranking does is take every one of these individual statistics—every one of the numbers is incorporated in our model—and assign value to them based on where they occurred on the pitch and how they typically contribute to a team’s possession. Completing passes is one thing, but these passes have to create chances or move the ball forward in order to give Twelve attack points. Likewise, clearing the ball from the box is (usually) more important than winning it back in a crowded midfield.

To create our algorithm we looked at hundreds of thousands of possession chains—sequences of play where a team holds possession—and then fit a statistical model to predict the probability that a particular action will produce a shot. The plot below shows some of the inner workings of our model, plotting all the possession chains in the World Cup final that resulted in a shot. Croatia in red, France in green, with both teams shooting left to right.

This means that, far from being arbitrary numbers, our rankings really do reflect, based on historical data, how well the actions we see on the pitch translate into the long-term success of a team.
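The possession-chain idea can be sketched in a few lines. Note that this is a deliberately simplified stand-in: instead of the fitted statistical model described above, it uses a plain frequency table that estimates, for each band of the pitch, how often a chain containing an action in that band ended in a shot. The five-band split, the 0 to 100 coordinate and the `pass_value` rule (end-zone probability minus start-zone probability) are assumptions made for illustration.

```python
from collections import Counter

N_ZONES = 5  # assumed: pitch split into 5 bands along its length (x in 0..100)


def zone_of(x: float) -> int:
    """Map an x coordinate in 0..100 to a zone index 0..N_ZONES-1."""
    return min(int(x / 100.0 * N_ZONES), N_ZONES - 1)


def fit_shot_probability(chains):
    """Estimate P(chain ends in a shot | action in zone) from past chains.

    Each chain is a pair (list_of_action_x_positions, ended_in_shot).
    """
    actions = Counter()
    shots = Counter()
    for xs, ended_in_shot in chains:
        for x in xs:
            z = zone_of(x)
            actions[z] += 1
            if ended_in_shot:
                shots[z] += 1
    return {z: shots[z] / actions[z] for z in actions}


def pass_value(model, x_from: float, x_to: float) -> float:
    """Hypothetical pass value: gain in shot probability from start to end zone."""
    return model.get(zone_of(x_to), 0.0) - model.get(zone_of(x_from), 0.0)


# Four made-up possession chains, for illustration only.
chains = [
    ([30.0, 55.0, 85.0], True),
    ([25.0, 45.0], False),
    ([50.0, 90.0], True),
    ([35.0, 50.0, 70.0], False),
]
model = fit_shot_probability(chains)
print(model)
print(pass_value(model, 30.0, 85.0))
```

Even in this toy version, a pass only earns value if it moves the ball somewhere from which shots are more likely, which is the point made above about completed passes not being enough on their own.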

We aren’t claiming that the Twelve rankings are perfect. They do not include movement of players off the ball, but they do, given the on-the-ball data typically used in football analytics, provide one of the most accurate reflections possible of the qualities of a good football player. The same cannot be said for other more ad-hoc statistics. Our rankings tell you why a player is good, and if you don’t agree with us, you can just click on each action and see how we have ranked it.

What next?

There are a few things that we haven’t got quite right. Goalkeepers are proving a headache. We have assigned them points based on the expected goals of the shots they faced. But they simply don’t make enough saves for these points to add up to the same level as a good attacking player. Possibly we take away too many points when they let in a goal. This is something we are going to look at again before the Premier League starts.
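The goalkeeper problem is easy to reproduce with a toy scoring rule. The rule below is an assumption for illustration, not the one we use: a save earns the shot's expected-goals value times a scale factor, and a conceded goal costs one minus that value times the same factor. The scale of 1000 and the shot data are made up.

```python
SCALE = 1000.0  # hypothetical conversion from expected goals to points


def keeper_points(shots):
    """Total points for a goalkeeper under the assumed rule.

    Each shot is a pair (xg, saved), where xg is the shot's expected-goals
    value in 0..1 and saved is True if the keeper stopped it.
    """
    total = 0.0
    for xg, saved in shots:
        if saved:
            total += xg * SCALE          # credit for stopping a likely goal
        else:
            total -= (1.0 - xg) * SCALE  # penalty for conceding an unlikely one
    return total


# Five made-up shots of equal difficulty: four saved, one conceded.
shots = [(0.1, True), (0.1, True), (0.1, True), (0.1, True), (0.1, False)]
total = keeper_points(shots)
print(total)
```

Under this rule, four saves from low-probability shots are wiped out by a single goal, which illustrates why a goalkeeper's points struggle to reach the level of a good attacking player.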

And that brings us to my final point. The World Cup has gone really well for Twelve: articles written by journalists and fans, amazing editorial guidance by Andrew Beasley, nice-looking graphics and player pictures, good traffic numbers with users coming back to us to get updates, player rankings that have been reliable and interesting discussions on Twitter. So, on this basis, we will be continuing into the Premier League season in the same style.

The Twelve app (download now for Apple or Android) will be improved for the Premier League start: we will incorporate fan love from players selected in the app into our analytics; we will continue to provide free rankings at; and we will continue to provide online articles written by fans. We will remain, for the foreseeable future, completely ad free and free of charge.

Stay with Twelve if you want the most sophisticated player evaluation in football!
