Learning R: The Ultimate Introduction (incl. Machine Learning!)


There are a million reasons to learn R (see e.g. Why R for Data Science – and not Python?), but where to start? I present to you the ultimate introduction to bring you up to speed! So read on…
Continue reading “Learning R: The Ultimate Introduction (incl. Machine Learning!)”

Was the Bavarian Abitur too hard this time?


Bavaria is known for its famous Oktoberfest… and within Germany also for its presumably difficult Abitur, a qualification granted by university-preparatory schools in Germany.

A mandatory part for all students is maths. This year many students protested that the maths part was way too hard, they even started an online petition with more than seventy thousand supporters at this time of writing!

It is not clear yet whether their marks will be adjusted upwards, the ministry of education is investigating the case. As a professor in Bavaria who also teaches statistics I will take the opportunity to share with you an actual question from the original examination with solution, so read on…
Continue reading “Was the Bavarian Abitur too hard this time?”

Separating the Signal from the Noise: Robust Statistics for Pedestrians


One of the problems of navigating an autonomous car through a city is to extract robust signals in the face of all the noise that is present in the different sensors. Just taking something like an arithmetic mean of all the data points could possibly end in a catastrophe: if a part of a wall looks similar to the street and the algorithm calculates an average trajectory of the two this would end in leaving the road and possibly crashing into pedestrians. So we need some robust algorithm to get rid of the noise. The area of statistics that especially deals with such problems is called robust statistics and the methods used therein robust estimation.
Continue reading “Separating the Signal from the Noise: Robust Statistics for Pedestrians”

Learning Data Science: Predicting Income Brackets


As promised in the post Learning Data Science: Modelling Basics we will now go a step further and try to predict income brackets with real world data and different modelling approaches. We will learn a thing or two along the way, e.g. about the so-called Accuracy-Interpretability Trade-Off, so read on…
Continue reading “Learning Data Science: Predicting Income Brackets”

To understand Recursion you have to understand Recursion…


Sorting values is one of the bread and butter tasks in computer science: this post uses it as a use case to learn what recursion is all about. It starts with some nerd humour… and ends with some more, so read on!
Continue reading “To understand Recursion you have to understand Recursion…”

So, what is AI really?


One of the topics that is totally hyped at the moment is obviously Artificial Intelligence or AI for short. There are many self-proclaimed experts running around trying to sell you the stuff they have been doing all along under this new label.

When you ask them what AI means you will normally get some convoluted explanations (which is a good sign that they don’t get it themselves) and some “success stories”. The truth is that many of those talking heads don’t really know what they are talking about, yet happen to have a friend who knows somebody who picked up a book at the local station bookshop… ok, that was nasty but unfortunately often not too far away from the truth.

So, what is AI really? This post tries to give some guidance, so read on!
Continue reading “So, what is AI really?

Hash Me If You Can

We are living in the era of Big Data but the problem of course is that the bigger our data sets become the slower even simple search operations get. I will now show you a trick that is the next best thing to magic: building a search function that practically doesn’t slow down even for large data sets… in base R!
Continue reading “Hash Me If You Can”

Learning R: A gentle introduction to higher-order functions


Have you ever thought about why the definition of a function in R is different from many other programming languages? The part that causes the biggest difficulties (especially for beginners of R) is that you state the name of the function at the beginning and use the assignment operator – as if functions were like any other data type, like vectors, matrices or data frames…
Continue reading “Learning R: A gentle introduction to higher-order functions”