Collider Bias, or Are Hot Babes Dim and Eggheads Ugly?


Correlation and its associated challenges don’t lose their fascination: most people know that correlation doesn’t imply causation; not many know that the opposite is also true (see: Causation doesn’t imply Correlation either); and some know that a correlation can arise purely by chance (so-called spurious correlation).

If you want to learn about a paradoxical effect nearly nobody is aware of, where correlation between two uncorrelated random variables is introduced just by sampling, read on!
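To give a first feel for the effect, here is a minimal simulation sketch in Python (standard library only). It is an illustration under stated assumptions, not the method of the full post: the variable names `looks` and `brains`, the standard-normal distributions, and the selection rule "keep only cases where the sum of both exceeds 1" are all hypothetical choices made for this example.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


random.seed(42)  # fixed seed so the run is reproducible
n = 20000

# Two independent traits (hypothetical names), drawn without any relation.
looks = [random.gauss(0, 1) for _ in range(n)]
brains = [random.gauss(0, 1) for _ in range(n)]

# Selection step (assumed rule): only cases with a high combined score
# make it into the sample. The combined score acts as the collider.
selected = [(l, b) for l, b in zip(looks, brains) if l + b > 1.0]
sel_looks = [l for l, _ in selected]
sel_brains = [b for _, b in selected]

r_all = pearson(looks, brains)
r_selected = pearson(sel_looks, sel_brains)

print(f"correlation in the full population: {r_all:.3f}")
print(f"correlation in the selected sample: {r_selected:.3f}")
```

In the full population the two traits are essentially uncorrelated, while in the selected sample a clearly negative correlation appears, although nothing but the sampling rule has changed.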

COVID-19: The Case of Germany

It is such a beautiful day outside, lots of sunshine, spring at last… and we are now basically all grounded and sitting here, waiting to get sick.

So why not a post from the new epicentre of the global COVID-19 pandemic, Central Europe, or more precisely from where I live: Germany?! Indeed, if you want to find out what the numbers tell us about how things might develop here, read on!

The One Question you should ask your Partner before Marrying!


Valentine’s Day is around the corner and love is in the air… but, shock horror, nearly every second marriage ends in divorce! Unfortunately, I can tell you first-hand that this is an experience you’d rather not have. In this post, we see how data science, in the form of the OneR package and an interesting new data set, might help you avoid that tragedy… so read on!

Epidemiology: How contagious is Novel Coronavirus (2019-nCoV)?


A new invisible enemy, only 30kb in size, has emerged and is on a killing spree around the world: 2019-nCoV, the Novel Coronavirus!

It has already killed more people than the SARS pandemic and its outbreak has been declared a Public Health Emergency of International Concern (PHEIC) by the World Health Organization (WHO).

If you want to learn how epidemiologists estimate how contagious a new virus is, and how to do it in R, read on!

Does Australia need More Fires (but the Right Kind)? A Multi-Agent Simulation


We have all watched with great horror the catastrophic fires in Australia. For many years, scientists have been studying simulations to better understand the underlying dynamics. They tell us that “what Australia needs is more fires, but of the right kind”. What do they mean by that?

One such simulation of fire is based on Multi-Agent Systems (MAS), also called Agent-Based Modelling (ABM). An excellent piece of free software (and in fact the de facto standard) is NetLogo. Even better is that NetLogo can be fully controlled by R… and we will use this feature to learn some crucial lessons!

If you want to understand more about the dynamics of fire in particular and about some fascinating properties of dynamical systems in general via controlling NetLogo with R, read on!

Explainable AI (XAI)… Explained! or How to whiten any Black Box with LIME


We already covered the so-called Accuracy-Interpretability Trade-Off, which states that often the more accurate the results of an AI are, the harder it is to interpret how it arrived at its conclusions (see also: Learning Data Science: Predicting Income Brackets).

This is especially true for Neural Networks: while often delivering outstanding results, they are basically black boxes and notoriously hard to interpret (see also: Understanding the Magic of Neural Networks).

There is a hot new area of research that aims to make black-box models interpretable, called Explainable Artificial Intelligence (XAI). If you want to gain some intuition on one such approach (called LIME), read on!

Business Case Analysis with R (Guest Post)


Learning Machines proudly presents a fascinating guest post by decision and risk analyst Robert D. Brown III with a great application of R in the business arena, especially for startups! I encourage you to visit his blog too: Thales’ Press. Have fun!

Psst, don’t tell anybody: The World is getting more rational!


Happy New Year to all of you! 2020 is here and it seems that we are being overwhelmed by more and more irrationality, especially fake news and conspiracy theories.

In this post, I will give you some indication that this might actually not be the case (shock horror: good news alert!). We will be using Google Trends for that: If you want to know what Google Trends is, learn how to query it from within R and process the retrieved data, read on!

Painting Santa with Letters


After my little rant (which went viral!) about the tidyverse from last week, we are going to do a fun little project in the 50th 🙂 post of this blog: ASCII Art! If you want to have some fun by painting with letters (i.e. ASCII characters) in R and get to see a direct comparison of tidyverse and base R code, read on!

Why I don’t use the Tidyverse


There seems to be some revolution going on in the R sphere… people seem to be jumping at what is commonly known as the tidyverse, a collection of packages developed and maintained by the Chief Scientist of RStudio, Hadley Wickham.

In this post, I explain what the tidyverse is and why I resist using it, so read on!