A few months ago I posted about *market basket analysis* (see Customers who bought…). In this post we will see another form of it, done with *Logistic Regression*, so read on…

Continue reading “Learning Data Science: The Supermarket knows you are pregnant before your Dad does”

# Category: Statistics

Posts about statistics

## Causation doesn’t imply Correlation *either*

You may have misread the title as the old *correlation does not imply causation* mantra, but the opposite is also true! If you don’t believe me, read on…

Continue reading “Causation doesn’t imply Correlation *either*”

## From Coin Tosses to p-Hacking: Make Statistics Significant Again!

One of the most notoriously difficult subjects in statistics is the concept of *statistical tests*. We will explain the ideas behind them step by step to give you some intuition on how to use (and misuse) them, so read on…

Continue reading “From Coin Tosses to p-Hacking: Make Statistics Significant Again!”

## Learning R: Permutations and Combinations with Base R

The area of *combinatorics*, the art of systematic counting, is dreaded territory for many people, so let us bring some light into the matter: in this post we will explain the difference between *permutations* and *combinations*, with and without *repetitions*, will calculate the number of possibilities and present efficient R code to enumerate all of them, so read on…
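For orientation, the four cases mentioned above correspond to the standard counting formulas for choosing $k$ items out of $n$ (the symbols $n$ and $k$ are notation introduced here for illustration):

$$
\begin{aligned}
\text{Permutations without repetition:} \quad & \frac{n!}{(n-k)!} \\
\text{Permutations with repetition:} \quad & n^k \\
\text{Combinations without repetition:} \quad & \binom{n}{k} = \frac{n!}{k!\,(n-k)!} \\
\text{Combinations with repetition:} \quad & \binom{n+k-1}{k}
\end{aligned}
$$

As a quick check with $n = 5$ and $k = 3$, these give $60$, $125$, $10$ and $35$ possibilities respectively.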

Continue reading “Learning R: Permutations and Combinations with Base R”

## Learning R: The Ultimate Introduction (incl. Machine Learning!)

There are a million reasons to learn R (see e.g. Why R for Data Science – and not Python?), but where to start? I present to you the ultimate introduction to bring you up to speed! So read on…

Continue reading “Learning R: The Ultimate Introduction (incl. Machine Learning!)”

## Was the Bavarian *Abitur* too hard this time?

Bavaria is known for its famous Oktoberfest… and within Germany also for its presumably difficult *Abitur*, a qualification granted by university-preparatory schools in Germany.

A mandatory part for all students is maths. This year many students protested that the maths part was way too hard; they even started an online petition, which has more than seventy thousand supporters at the time of writing!

It is not clear yet whether their marks will be adjusted upwards; the ministry of education is investigating the case. As a professor in Bavaria who also teaches statistics, I will take the opportunity to share with you an actual question from the original examination with solution, so read on…

Continue reading “Was the Bavarian *Abitur* too hard this time?”

## Backtest Trading Strategies Like a Real Quant

R is one of the best choices when it comes to *quantitative finance*. Here we will show you how to load financial data and plot *charts*, and give you a step-by-step template to *backtest trading strategies*. So, read on…

Continue reading “Backtest Trading Strategies Like a Real Quant”

## The Rich didn’t earn their Wealth, they just got Lucky

Tomorrow, on the *First of May*, many countries celebrate the so-called *International Workers’ Day* (or *Labour Day*): time to talk about the *unequal distribution of wealth* again, so read on!

Continue reading “The Rich didn’t earn their Wealth, they just got Lucky”

## Base Rate Fallacy – or why No One is justified to believe that Jesus rose

In this post we are talking about one of the most unintuitive results in statistics: the so-called *false positive paradox*, which is an example of the *base rate fallacy*. It describes a situation where a positive result of a very sensitive medical test seems to show that you have the respective disease… yet you are most probably healthy!
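To see how this can happen, here is a small worked example with Bayes' theorem. The numbers are purely illustrative assumptions, not taken from the post: a disease prevalence of $P(D) = 0.1\%$, a sensitivity of $P(+ \mid D) = 99\%$ and a false positive rate of $P(+ \mid \neg D) = 1\%$.

$$
P(D \mid +) = \frac{P(+ \mid D)\,P(D)}{P(+ \mid D)\,P(D) + P(+ \mid \neg D)\,P(\neg D)} = \frac{0.99 \cdot 0.001}{0.99 \cdot 0.001 + 0.01 \cdot 0.999} \approx 0.09
$$

Under these assumptions, only about 9% of the people who test positive actually have the disease, because the few true positives are swamped by false positives from the much larger healthy group.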

Continue reading “Base Rate Fallacy – or why No One is justified to believe that Jesus rose”

## Separating the Signal from the Noise: Robust Statistics for Pedestrians

One of the problems of navigating an autonomous car through a city is to extract *robust signals* in the face of all the *noise* that is present in the different sensors. Just taking something like an arithmetic mean of all the data points could end in a catastrophe: if a part of a wall looks similar to the street and the algorithm calculates an average trajectory of the two, the car would leave the road and possibly crash into pedestrians. So we need a robust algorithm to get rid of the noise. The area of statistics that deals with such problems is called *robust statistics*, and the methods used therein are called *robust estimation*.

Continue reading “Separating the Signal from the Noise: Robust Statistics for Pedestrians”