New Bundesliga Forecasting Tool: Can Underdog Hertha Berlin beat Bayern Munich?


The Bundesliga is Germany’s primary football league. It is one of the most important football leagues in the world, broadcast on television in over 200 countries.

If you want to get your hands on a tool to forecast the result of any game (and perform some more statistical analyses), read on!

The “Youth Bulge” of Afghanistan: The Hidden Force behind Political Instability


In view of the current dramatic events in Afghanistan, many wonder why the extensive international efforts to bring some stability to the country have failed so miserably.

In this post, we will present and analytically examine a fascinating theory that seems to be able to explain political (in-)stability almost mono-causally, so read on!

Learning Path for “Data Science with R” – Part I


Over the course of the last two and a half years, I have written over one hundred posts for my blog “Learning Machines” on the topics of data science, i.e. statistics, artificial intelligence, machine learning, and deep learning.

I use many of them in my university classes. In this post, I will give you the first part of a learning path through the material that has accumulated on this blog over the years, to help you become a well-rounded data scientist, so read on!

The Small Data Rule: Infer the Big Picture from only Five Values!


Everybody is talking about big data, but the real skill lies in the art of inferring useful information from only a handful of values!

If you want to learn how to determine the range of the typical value of a dataset (i.e. the median) with just five values and why this works, read on!
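As a rough illustration of why five values can be enough, here is a minimal Python sketch. It assumes the rule in question is the well-known "Rule of Five": the median of a population lies between the smallest and the largest of five randomly drawn values with a probability of 93.75%. The function names and the lognormal test distribution are illustrative choices, not taken from the post.

```python
import random


def exact_coverage(n: int = 5) -> float:
    """Probability that the population median lies within [min, max] of n random draws.

    Each draw falls below the median with probability 0.5, so the median is
    missed only if all n draws land on the same side of it.
    """
    return 1 - 2 * 0.5 ** n


def simulated_coverage(n: int = 5, trials: int = 20_000, seed: int = 42) -> float:
    """Estimate the same probability by simulation.

    Assumption for illustration: values come from a lognormal(0, 1)
    distribution, whose true median is exp(0) = 1.
    """
    rng = random.Random(seed)
    true_median = 1.0
    hits = 0
    for _ in range(trials):
        sample = [rng.lognormvariate(0, 1) for _ in range(n)]
        if min(sample) <= true_median <= max(sample):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(exact_coverage(5))       # 0.9375
    print(simulated_coverage(5))   # close to 0.9375
```

The exact value does not depend on the shape of the distribution, only on the draws being random and independent, which is why the skewed lognormal distribution gives the same answer as a symmetric one would.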

Euro 2020: Will Switzerland kick out Spain too?


One of the big sensations of UEFA Euro 2020 is that Switzerland kicked out the reigning world champion, France. We take this as an opportunity to share with you a simple statistical model to predict football (soccer) results with R, so read on!

Financial X-Rays: Dissect any Price Series with a simple Payoff Diagram


Not many people understand the financial alchemy of modern investment vehicles, like hedge funds, which often use sophisticated trading strategies. But everybody understands the meaning of rising and falling markets. Why not simply translate one into the other?

If you want to get your hands on a simple R script that creates an easy-to-understand plot (a profit & loss profile or payoff diagram) out of any price series, read on!

Fame: Is Becoming a Star Written in the Stars?


I sometimes joke that as an Aries I don’t believe in zodiac signs. But could there still be some pattern, e.g. in the sense that people born in spring are more prone to success than those born during the winter months?

In this post, we will provide a definitive answer with one of the most fascinating datasets I have ever encountered, so read on!

Learning Statistics: On Hot, Cool, and Large Numbers


My father-in-law used to write down the numbers drawn in the lottery to look for patterns, especially whether some numbers were “due” because they hadn’t been drawn for a long time. He is not alone! And don’t such players have a point? Shouldn’t the numbers balance out after some time? Read on to find out!

The Solution to my Viral Coin Tossing Poll

Some time ago, I conducted a poll on LinkedIn that quickly went viral. I asked which of three different coin-tossing sequences was most likely, and I received exactly 1,592 votes! Nearly 48,000 people viewed it, and there are more than 80 comments under the post (you need a LinkedIn account to see it in full here: LinkedIn Coin Tossing Poll).

In this post, I will give the solution with some background explanation, so read on!

Recidivism: Identifying the Most Important Predictors for Re-offending with OneR


In 2018, the renowned scientific journal Science broke the story that researchers had re-engineered the commercial criminal risk assessment software COMPAS with a simple logistic regression (Science: The accuracy, fairness, and limits of predicting recidivism).

According to this article, COMPAS uses 137 features, whereas the authors used just two. In this post, I will up the ante by showing you how to achieve similar results using just one simple rule based on only one feature, which is found automatically in no time by the OneR package, so read on!