# When data drives sports

### The advent of big data technologies and data science now makes it possible, and necessary, for every team to analyze their opponent’s every weakness

As rugby teams from the southern hemisphere tour Europe for a series of test matches -- the last before next year's World Cup in England -- we are treated to high-quality games. At the same time, we are fed a deluge of statistics, displayed on our TV screens and commented on by broadcasters. Nothing really new, of course, especially for those used to watching sports on TV in the US.

Born and raised in France, I only discovered baseball as a grown man, when I relocated to Boston (and yes, that made me a die-hard Red Sox fan). At first, I couldn't understand how the game was played, and all I could notice was that it seemed like a very number-oriented game, with all these statistics and percentages shown on screen. Then one evening in a sports bar during a business trip, a colleague of mine was patient enough to explain the rules of the game to a novice. Shortly thereafter I was given a copy of Michael Lewis' Moneyball (the book, not the movie), and dived with interest into the statistics of the game. I concluded that baseball is indeed a mathematical game -- or rather a statistical and analytical game.

Sports analytics can be approached from several angles. The first one is, of course, player selection. In Moneyball, for example, Oakland's Billy Beane selects players for hidden qualities that are revealed by a deep analysis of their past plays and performance.

But analytics also bring tremendous value to a player or coach planning strategy for a game. Clearly, players don't need statistics to tell them that in baseball (and in tennis too), you don't pitch (or serve) the same way against a right-handed opponent as against a left-handed one. But statistics come in very handy to help place the ball in the exact spot where the opponent has historically proven weakest, even if that weakness amounts to only a few percentage points -- after hundreds of balls, all one needs to win is to be one point ahead.
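To make the idea concrete, here is a minimal Python sketch of how one might look for that weakest spot. Everything in it is an assumption made for illustration: the zone names, the made-up pitch history, the `weakest_zone` function and its `min_samples` threshold are hypothetical, and a real team's model would be far richer.

```python
from collections import defaultdict


def weakest_zone(pitches, min_samples=3):
    """Return (zone, hit_rate) for the zone where the batter hits least often.

    pitches: iterable of (zone, was_hit) pairs (hypothetical format).
    Zones with fewer than min_samples pitches are ignored as too noisy.
    Returns None when no zone has enough data.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for zone, was_hit in pitches:
        totals[zone] += 1
        hits[zone] += int(was_hit)

    # Hit rate per zone, keeping only zones with enough observations.
    rates = {z: hits[z] / totals[z] for z in totals if totals[z] >= min_samples}
    if not rates:
        return None
    zone = min(rates, key=rates.get)
    return zone, rates[zone]


# Made-up history for one batter: (zone, did the batter get a hit?)
history = (
    [("low-inside", True)] * 3 + [("low-inside", False)] * 7
    + [("high-outside", True)] * 2 + [("high-outside", False)] * 8
    + [("middle", True)] * 4 + [("middle", False)] * 6
    + [("low-outside", True)] * 1  # a single pitch: too little data to trust
)

print(weakest_zone(history))  # ('high-outside', 0.2)
```

The sample-size threshold matters as much as the rates themselves: a gap of a few percentage points only means something once it rests on enough pitches.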

During the game, numbers crunched in the back office and transmitted to the coach on the field will confirm his own observations, help identify players' fatigue and (when possible) let him decide on a substitution before a mistake happens. They can also reveal unexpected weaknesses in the opposing team that can be taken advantage of.

None of these analytics would be possible without comprehensive and reliable data collection. Sports leagues, and teams themselves, have invested tremendous resources in collecting and storing every metric they can think of. In the early 20th century, well before computers could be used to store and crunch this data, baseball cards would already show key statistics about players. The advent of storage resources, and more recently, of more automated ways of collecting data (such as overhead cameras filming a tennis court or instrumented tennis rackets), has rendered the possibilities almost limitless.

In Moneyball, Billy Beane uses mostly printouts and spreadsheets to analyze player statistics. Today, every major league or national sports team invests heavily in IT resources, in big data technologies and in data science, oftentimes with help from its preferred software vendor. For example, earlier this year, SAP was boasting about how it would help the German national soccer team win the FIFA World Cup.

And guess what? Germany won.
