The Central Limit Theorem (CLT): From Perfect Symmetry to the Normal Distribution


How can the Normal Distribution arise out of a completely symmetric set-up? The so-called Central Limit Theorem (CLT) is a fascinating example that demonstrates such behaviour. If you want to get some intuition on what lies at the core of many statistical tests, read on!

“You Are Here”: Understanding How GPS Works


Last week, I showed you how to find the shortest path from A to B in “Finding the Shortest Path with Dijkstra’s Algorithm”. To make use of that, we need a way to determine our position at any point in time.

For that purpose, many devices use the so-called Global Positioning System (GPS). If you want to understand how it works and do some simple calculations in R, read on!

Finding the Shortest Path with Dijkstra’s Algorithm


I have to make a confession: when it comes to my sense of orientation, I am a total failure… sometimes it feels like GPS and Google Maps were actually invented for me!

Well, nowadays everybody uses those practical little helpers. But how do they actually manage to find the shortest path from A to B?

If you want to understand the father of all routing algorithms, Dijkstra’s algorithm, and want to know how to program it in R, read on!

3.84, or How to Detect BS (Fast)

In “From Coin Tosses to p-Hacking: Make Statistics Significant Again!” I explained the general principles behind statistical testing. Here I will give you a simple method for quick calculations to check whether something fishy is going on (i.e. a fast statistical BS detector), so read on!
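As a hint at where the number in the title may come from (this is my reading, the teaser itself does not spell it out): 3.84 is the 95% quantile of the chi-squared distribution with one degree of freedom, which is also the square of the familiar 1.96 from the normal distribution. A minimal sketch in base R, with made-up coin toss counts:

```r
# Assumption: 3.84 is the 5% critical value of a chi-squared test
# with one degree of freedom (not stated explicitly in the teaser).
crit <- qchisq(0.95, df = 1)  # 95% quantile of chi-squared with 1 df
z    <- qnorm(0.975)          # two-sided 5% critical value of the normal

round(crit, 2)  # 3.84
round(z^2, 2)   # 3.84 as well: chi-squared(1) is a squared standard normal

# Hypothetical quick check: 100 coin tosses, 60 heads. Is the coin fair?
observed <- c(heads = 60, tails = 40)
expected <- c(50, 50)
stat <- sum((observed - expected)^2 / expected)  # 2 + 2 = 4
stat > crit                                      # TRUE: looks fishy at the 5% level
```

The rule of thumb this suggests: compute the sum of (observed - expected)^2 / expected and compare it with 3.84.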

Network Analysis: Who is the Most Important Influencer?

Networks are everywhere: traffic infrastructure and the internet come to mind, but networks are also found in nature: food chains, protein-interaction networks, genetic interaction networks and, of course, neural networks, which are being modelled by Artificial Neural Networks.

In this post, we will create a small network (also called a graph in mathematics) and ask some questions about which is the “most important” node (also called vertex, pl. vertices). If you want to understand important concepts of network centrality and how to calculate those in R, read on!

Local Differential Privacy: Getting Honest Answers on Embarrassing Questions

Do you cheat on your partner? Do you take drugs? Are you gay? Are you an atheist? Did you have an abortion? Will you vote for the right-wing candidate? Not everyone feels comfortable answering such questions honestly in every situation.

So, is there a method to find the respective proportion of people without putting them on the spot? Actually, there is! If you want to learn about randomized response (and how to create flowcharts in R along the way), read on!
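To give a feel for the idea, here is a minimal simulation of one classic randomized response scheme. The concrete protocol (two fair coin flips), the sample size and the true proportion of 30% are all assumptions chosen for illustration, not necessarily the variant discussed in the post: each person flips a coin, answers truthfully on heads, and on tails flips again and answers "yes" or "no" according to that second coin.

```r
set.seed(42)

n      <- 100000  # hypothetical number of respondents
true_p <- 0.3     # hypothetical true proportion of "yes"

truth  <- runif(n) < true_p  # each person's true (private) answer
honest <- runif(n) < 0.5     # first coin: heads means answer truthfully
random <- runif(n) < 0.5     # second coin: used as the answer otherwise

# What the interviewer actually gets to see
answer <- ifelse(honest, truth, random)

# P(yes) = 0.5 * p + 0.25, so solving for p gives:
p_yes    <- mean(answer)
estimate <- 2 * p_yes - 0.5
estimate  # close to true_p, although no single answer reveals the truth
```

No individual answer can be held against the respondent, yet the proportion in the whole group can still be recovered.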

Doing Maths Symbolically: R as a Computer Algebra System (CAS)


When I first saw the Computer Algebra System Mathematica in the nineties, I was instantly fascinated by it: you could not just calculate things with it but solve equations, simplify, differentiate and integrate expressions and even solve simple differential equations… not just numerically but symbolically! It helped me a lot during my studies at the university back then. Normally, you cannot do this kind of stuff with R, but fear not, there is, of course, a package for that!

Kalman Filter as a Form of Bayesian Updating


The Kalman filter is a very powerful algorithm for optimally incorporating uncertain information from a dynamically changing system to arrive at the best educated guess about the current state of the system. Applications include (car) navigation and stock forecasting. If you want to understand how a Kalman filter works and build a toy example in R, read on!
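The "Bayesian updating" in the title can be hinted at with a single one-dimensional measurement update. This is only a sketch under simplifying assumptions (Gaussian prior, Gaussian measurement noise, no prediction step), and all numbers are made up:

```r
# Hypothetical prior belief about the state: mean 10, variance 4
mu_prior  <- 10
var_prior <- 4

# Hypothetical noisy measurement: value 12, measurement variance 1
z     <- 12
var_z <- 1

# Kalman gain: how much we trust the measurement relative to the prior
K <- var_prior / (var_prior + var_z)  # 0.8

# Updated (posterior) belief
mu_post  <- mu_prior + K * (z - mu_prior)  # 11.6
var_post <- (1 - K) * var_prior            # 0.8
```

The updated estimate lies between prior and measurement, closer to the more certain of the two, and its variance is smaller than either input variance.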

Time Series Analysis: Forecasting Sales Data with Autoregressive (AR) Models


Forecasting the future has always been one of humanity’s biggest desires, and many approaches have been tried over the centuries. In this post, we will look at a simple statistical method for time series analysis, the autoregressive (AR) model. We will use this method to predict future sales data and will rebuild it to get a deeper understanding of how it works, so read on!
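To illustrate what an AR model does, here is a small sketch on simulated data. It does not use the sales data from the post; the AR(1) coefficient of 0.7 and the series length are assumptions. The series is fitted once with base R's ar() and once "by hand" as a regression of each value on its predecessor:

```r
set.seed(123)

# Simulate a hypothetical AR(1) series: x_t = 0.7 * x_(t-1) + noise
x <- arima.sim(model = list(ar = 0.7), n = 500)

# Fit an AR(1) model with the built-in function
fit <- ar(x, order.max = 1, aic = FALSE)
fit$ar  # estimated coefficient, should be in the vicinity of 0.7

# Rebuild the same idea as a plain regression on the lagged series
v    <- as.numeric(x)
n    <- length(v)
fit2 <- lm(v[-1] ~ v[-n])
coef(fit2)[2]  # slope, again roughly 0.7

# Forecast the next five values
predict(fit, n.ahead = 5)$pred
```

The two estimates differ slightly because ar() uses the Yule-Walker method by default, while lm() uses least squares.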

COVID-19: False Positive Alarm

In this post, we are going to use R to replicate an analysis from the current issue of Scientific American about a common mathematical pitfall in interpreting Coronavirus antibody tests.

Many people think that when they get a positive result from such a test they are immune to the virus with high probability. If you want to find out why nothing could be further from the truth, read on!
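The pitfall can be sketched with Bayes' theorem. The numbers below (5% prevalence, 90% sensitivity, 95% specificity) are purely hypothetical and are not taken from the Scientific American analysis:

```r
# Hypothetical test characteristics and prevalence
prevalence  <- 0.05  # share of people who actually have antibodies
sensitivity <- 0.90  # P(test positive | antibodies)
specificity <- 0.95  # P(test negative | no antibodies)

# Probability of a positive test, from true and false positives
p_true_pos  <- sensitivity * prevalence              # 0.045
p_false_pos <- (1 - specificity) * (1 - prevalence)  # 0.0475

# Bayes' theorem: P(antibodies | test positive)
ppv <- p_true_pos / (p_true_pos + p_false_pos)
round(ppv, 3)  # 0.486
```

Under these assumptions, fewer than half of all positive results would be true positives, because the few false positives among the many people without antibodies outnumber the true positives among the few with them.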