# Category: Learning R

Posts about learning R

## Learning Data Science: Understanding and Using *k*-means Clustering

A few months ago I published a quite popular post on Clustering the Bible… one well-known *clustering* algorithm is *k*-means. If you want to learn how *k*-means works and how to apply it in a real-world example, read on…

Continue reading “Learning Data Science: Understanding and Using *k*-means Clustering”

## From Coin Tosses to p-Hacking: Make Statistics Significant Again!

One of the most notoriously difficult subjects in statistics is the concept of *statistical tests*. We will explain the ideas behind them step by step to give you some intuition on how to use (and misuse) them, so read on…

Continue reading “From Coin Tosses to p-Hacking: Make Statistics Significant Again!”

## Learning R: Permutations and Combinations with Base R

The area of *combinatorics*, the art of systematic counting, is dreaded territory for many people, so let us bring some light into the matter. In this post we will explain the difference between *permutations* and *combinations*, with and without *repetitions*, calculate the number of possibilities, and present efficient R code to enumerate all of them, so read on…

Continue reading “Learning R: Permutations and Combinations with Base R”

## Learning R: Painting with Fire

A few months ago I published a post on *recursion*: To understand Recursion you have to understand Recursion…. In this post we will see how to use recursion to fill free areas of an image with colour, the caveats of recursion and how to transform a recursive algorithm into a loop-based version using a *queue* – so read on…

Continue reading “Learning R: Painting with Fire”

## Learning R: The Ultimate Introduction (incl. Machine Learning!)

There are a million reasons to learn R (see e.g. Why R for Data Science – and not Python?), but where to start? I present to you the ultimate introduction to bring you up to speed! So read on…

Continue reading “Learning R: The Ultimate Introduction (incl. Machine Learning!)”

## Was the Bavarian *Abitur* too hard this time?

Bavaria is known for its famous Oktoberfest… and within Germany also for its reputedly difficult *Abitur*, a qualification granted by university-preparatory schools in Germany.

A mandatory part for all students is maths. This year many students protested that the maths part was way too hard; they even started an online petition, which has more than seventy thousand supporters at the time of writing!

It is not clear yet whether their marks will be adjusted upwards; the ministry of education is investigating the case. As a professor in Bavaria who also teaches statistics, I will take the opportunity to share with you an actual question from the original examination, together with its solution, so read on…

Continue reading “Was the Bavarian *Abitur* too hard this time?”

## Separating the Signal from the Noise: Robust Statistics for Pedestrians

One of the problems of navigating an autonomous car through a city is extracting *robust signals* in the face of all the *noise* that is present in the different sensors. Just taking something like an arithmetic mean of all the data points could end in a catastrophe: if a part of a wall looks similar to the street and the algorithm calculates an average trajectory of the two, the car would leave the road and possibly crash into pedestrians. So we need a robust algorithm to get rid of the noise. The area of statistics that deals especially with such problems is called *robust statistics*, and the methods used therein are called *robust estimation*.

Continue reading “Separating the Signal from the Noise: Robust Statistics for Pedestrians”

## Learning Data Science: Predicting Income Brackets

As promised in the post Learning Data Science: Modelling Basics, we will now go a step further and try to predict income brackets with real-world data and different modelling approaches. We will learn a thing or two along the way, e.g. about the so-called *Accuracy-Interpretability Trade-Off*, so read on…

Continue reading “Learning Data Science: Predicting Income Brackets”

## Learning R: The Collatz Conjecture

In this post we will see that a little bit of simple R code can go a very long way! So let’s get started!

Continue reading “Learning R: The Collatz Conjecture”

## To understand Recursion you have to understand Recursion…

*Sorting* values is one of the bread and butter tasks in computer science: this post uses it as a use case to learn what *recursion* is all about. It starts with some nerd humour… and ends with some more, so read on!

Continue reading “To understand Recursion you have to understand Recursion…”