Inverse Statistics – and how to create Gain-Loss Asymmetry plots in R


Asset returns have certain statistical properties, also called stylized facts. Important ones are:

  • Absence of autocorrelation: the direction of one day’s return tells you nothing useful about the direction of the next day’s return.
  • Fat tails: returns are not normally distributed, i.e. extreme events occur far more often than a normal distribution would predict.
  • Volatility clustering: financial markets exhibit high-volatility and low-volatility regimes.
  • Leverage effect: high-volatility regimes tend to coincide with falling prices, and low-volatility regimes with rising prices.
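
To get a feeling for the first three facts, here is a minimal sketch in base R. It does not use real market data: the returns are simulated from a toy GARCH(1,1)-style process whose parameters (`omega`, `alpha`, `beta`) are assumptions chosen purely for illustration. A symmetric process like this one does not reproduce the leverage effect, so that fact is left out.

```r
# Simulated daily returns with volatility clustering.
# Toy GARCH(1,1)-style process; all parameters are hypothetical.
set.seed(42)
n     <- 5000
omega <- 1e-6
alpha <- 0.10
beta  <- 0.85

r      <- numeric(n)
sigma2 <- numeric(n)
sigma2[1] <- omega / (1 - alpha - beta)   # unconditional variance
r[1]      <- sqrt(sigma2[1]) * rnorm(1)
for (t in 2:n) {
  # Today's variance depends on yesterday's squared return and variance
  sigma2[t] <- omega + alpha * r[t - 1]^2 + beta * sigma2[t - 1]
  r[t]      <- sqrt(sigma2[t]) * rnorm(1)
}

# Absence of autocorrelation: lag-1 autocorrelation of the returns
acf_ret <- acf(r, lag.max = 1, plot = FALSE)$acf[2]

# Volatility clustering: lag-1 autocorrelation of the squared returns
acf_sq <- acf(r^2, lag.max = 1, plot = FALSE)$acf[2]

# Fat tails: excess kurtosis (0 for a normal distribution)
excess_kurtosis <- mean((r - mean(r))^4) / mean((r - mean(r))^2)^2 - 3

c(acf_ret = acf_ret, acf_sq = acf_sq, excess_kurtosis = excess_kurtosis)
```

For a process like this one would expect `acf_ret` to be close to zero, while `acf_sq` and `excess_kurtosis` should come out clearly positive.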


Learning Data Science: Predicting Income Brackets


As promised in the post Learning Data Science: Modelling Basics we will now go a step further and try to predict income brackets with real world data and different modelling approaches. We will learn a thing or two along the way, e.g. about the so-called Accuracy-Interpretability Trade-Off, so read on…

Customers who bought…


One of the classic examples in data science (called data mining at the time) is the beer and diapers example: when a big supermarket chain started analyzing its sales data, it found not only trivial patterns, like toothbrushes and toothpaste being bought together, but also quite strange combinations, like beer and diapers. Now, the trivial ones are reassuring because they show that the method works, but what about the more extravagant ones? Does it mean that young parents are alcoholics? Or that instead of breastfeeding they give their babies beer? Obviously, they had to get to the bottom of this.
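
To make "bought together" a bit more concrete, here is a small base R sketch of the two standard measures used for such patterns, support and confidence. The baskets are invented for illustration and the helper names (`contains`, `support`, `confidence`) are hypothetical, not taken from any package.

```r
# Hypothetical shopping baskets (invented toy data)
baskets <- list(
  c("beer", "diapers", "chips"),
  c("toothbrush", "toothpaste"),
  c("beer", "diapers"),
  c("diapers", "toothpaste", "toothbrush"),
  c("beer", "chips")
)

# Which baskets contain all of the given items?
contains <- function(items) {
  vapply(baskets, function(b) all(items %in% b), logical(1))
}

# Support: share of baskets containing all the items
support <- function(items) mean(contains(items))

# Confidence of the rule lhs => rhs:
# among baskets with lhs, the share that also contain rhs
confidence <- function(lhs, rhs) support(c(lhs, rhs)) / support(lhs)

support(c("beer", "diapers"))   # 2 of 5 baskets
confidence("diapers", "beer")   # 2 of the 3 baskets with diapers
```

On real sales data one would normally reach for a dedicated association rule implementation rather than loops like these; the sketch only shows what the numbers mean.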

To understand Recursion you have to understand Recursion…


Sorting values is one of the bread and butter tasks in computer science: this post uses it as a use case to learn what recursion is all about. It starts with some nerd humour… and ends with some more, so read on!
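
As a taste of what sorting by recursion can look like, here is a minimal quicksort sketch in base R. The function name and the choice of quicksort are assumptions made for illustration; the post itself may well use a different algorithm.

```r
# Recursive quicksort (illustrative sketch, not optimized)
quicksort <- function(x) {
  # Base case: nothing left to sort
  if (length(x) <= 1) return(x)

  pivot <- x[1]
  rest  <- x[-1]

  # Recursive case: sort the smaller and the larger values separately,
  # then glue everything back together around the pivot
  c(quicksort(rest[rest < pivot]), pivot, quicksort(rest[rest >= pivot]))
}

quicksort(c(5, 2, 9, 2, 1))
```

The function calls itself on ever smaller pieces until a piece has at most one element, which is sorted by definition.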

So, what is AI really?


One of the topics that is totally hyped at the moment is obviously Artificial Intelligence or AI for short. There are many self-proclaimed experts running around trying to sell you the stuff they have been doing all along under this new label.

When you ask them what AI means you will normally get some convoluted explanations (which is a good sign that they don’t get it themselves) and some “success stories”. The truth is that many of those talking heads don’t really know what they are talking about, yet happen to have a friend who knows somebody who picked up a book at the local station bookshop… ok, that was nasty but unfortunately often not too far away from the truth.

So, what is AI really? This post tries to give some guidance, so read on!

Learning Data Science: Modelling Basics


Data Science is all about building good models, so let us start by building a very simple model: we want to predict monthly income from age (in a later post we will see that age is indeed a good predictor for income).
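
A very simple model of this kind could be a straight line fitted with `lm()`. The sketch below uses invented toy data (the `age` and `income` values are hypothetical), so the numbers themselves mean nothing; it only shows the mechanics.

```r
# Hypothetical toy data (invented for illustration)
age    <- c(22, 30, 38, 45, 57)
income <- c(1900, 2300, 2600, 2900, 3300)

# Fit a simple linear model: income = intercept + slope * age
model <- lm(income ~ age)
slope <- unname(coef(model)["age"])

# Predict the monthly income of a 40-year-old
pred <- unname(predict(model, newdata = data.frame(age = 40)))

slope
pred
```

The slope is the estimated change in monthly income per additional year of age, at least within the age range covered by the data.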

Hash Me If You Can

We are living in the era of Big Data, but the problem, of course, is that the bigger our data sets become, the slower even simple search operations get. I will now show you a trick that is the next best thing to magic: building a search function that practically doesn’t slow down even for large data sets… in base R!
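
One way to get hash-based lookups in base R is to use an environment as a hash table. Whether this is exactly the trick the post has in mind is an assumption on my part; the keys, values and the `lookup` helper below are hypothetical.

```r
# Hypothetical keys and values
keys   <- paste0("id_", 1:10000)
values <- seq_along(keys)

# An environment created with hash = TRUE works like a hash table
h <- new.env(hash = TRUE)
for (i in seq_along(keys)) {
  assign(keys[i], values[i], envir = h)
}

# Look up a key; return NA if it is not in the table
lookup <- function(key, table) {
  if (exists(key, envir = table, inherits = FALSE)) {
    get(key, envir = table, inherits = FALSE)
  } else {
    NA
  }
}

lookup("id_9999", h)
lookup("not_there", h)

# For comparison: a linear search that scans the whole key vector
values[keys == "id_9999"]
```

The linear search has to compare against every key, so it gets slower as the data grows, whereas the hashed lookup jumps more or less directly to the right entry.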