COVID-19 in the US: Back-of-the-Envelope Calculation of Actual Infections and Future Deaths


One of the biggest problems of the COVID-19 pandemic is that there are no reliable numbers of infections. This fact renders many model projections next to useless.

If you want to learn a simple method for roughly estimating the real number of infections and expected deaths in the US, read on!

Contagiousness of COVID-19 Part I: Improvements of Mathematical Fitting (Guest Post)


Learning Machines proudly presents a guest post by Martijn Weterings from the Food and Natural Products research group of the Institute of Life Technologies at the University of Applied Sciences of Western Switzerland in Sion.

Collider Bias, or Are Hot Babes Dim and Eggheads Ugly?


Correlation and its associated challenges don’t lose their fascination: most people know that correlation doesn’t imply causation; not many know that the opposite is also true (see: Causation doesn’t imply Correlation either); and some know that correlation can arise purely by chance (so-called spurious correlation).

If you want to learn about a paradoxical effect nearly nobody is aware of, where correlation between two uncorrelated random variables is introduced just by sampling, read on!
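The effect can be sketched in a few lines of Python. This is only a minimal illustration, not the code from the post: the names `trait_a` and `trait_b`, the selection rule (keep a pair only when the sum of the two traits exceeds a threshold), and the threshold itself are all assumptions made for this example.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def collider_demo(n=20000, threshold=1.0, seed=42):
    """Draw two independent traits, then sample only pairs with a high sum."""
    rng = random.Random(seed)
    trait_a = [rng.gauss(0, 1) for _ in range(n)]
    trait_b = [rng.gauss(0, 1) for _ in range(n)]

    # Correlation in the full population: the traits are independent.
    r_all = pearson(trait_a, trait_b)

    # Hypothetical selection rule (the collider): keep a pair only if
    # the sum of both traits exceeds the threshold.
    selected = [(a, b) for a, b in zip(trait_a, trait_b) if a + b > threshold]
    r_selected = pearson([a for a, _ in selected], [b for _, b in selected])
    return r_all, r_selected, len(selected)


r_all, r_selected, kept = collider_demo()
print(f"full population: r = {r_all:.3f}")
print(f"selected sample: r = {r_selected:.3f} ({kept} pairs kept)")
```

In the full population the correlation stays close to zero, while in the selected sample it turns clearly negative, even though nothing links the two traits except the way the sample was drawn.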

COVID-19: The Case of Germany

It is such a beautiful day outside, lots of sunshine, spring at last… and we are now basically all grounded and sitting here, waiting to get sick.

So, why not a post from the new epicentre of the global COVID-19 pandemic, Central Europe, or more precisely where I live: Germany?! Indeed, if you want to find out what the numbers tell us about how things might develop here, read on!

Epidemiology: How contagious is Novel Coronavirus (2019-nCoV)?


A new invisible enemy, only 30 kb in size, has emerged and is on a killing spree around the world: 2019-nCoV, the Novel Coronavirus!

It has already killed more people than the SARS pandemic and its outbreak has been declared a Public Health Emergency of International Concern (PHEIC) by the World Health Organization (WHO).

If you want to learn how epidemiologists estimate how contagious a new virus is, and how to do it in R, read on!

Business Case Analysis with R (Guest Post)


Learning Machines proudly presents a fascinating guest post by decision and risk analyst Robert D. Brown III with a great application of R in the business world, and especially the startup arena! I encourage you to visit his blog too: Thales’ Press. Have fun!

Psst, don’t tell anybody: The World is getting more rational!


Happy New Year to all of you! 2020 is here and it seems that we are being overwhelmed by more and more irrationality, especially fake news and conspiracy theories.

In this post, I will give you some indication that this might actually not be the case (shock horror: good news alert!). We will be using Google Trends for that: if you want to know what Google Trends is and learn how to query it from within R and process the retrieved data, read on!

Learning Data Science: Sentiment Analysis with Naive Bayes


As we have already seen in earlier posts, simple methods can be surprisingly successful in yielding good results (see e.g. Learning Data Science: Predicting Income Brackets or Teach R to read handwritten Digits with just 4 Lines of Code).

If you want to learn how some simple mathematics, known as Naive Bayes, can help you find out the sentiment of texts (in this case movie reviews), read on!

Cambridge Analytica: Microtargeting or How to catch voters with the LASSO


The two most disruptive political events of the last few years are undoubtedly the Brexit referendum to leave the European Union and the election of Donald Trump. Both are commonly associated with the political consulting firm Cambridge Analytica and a technique known as Microtargeting.

If you want to understand the data science behind the Cambridge Analytica/Facebook data scandal and Microtargeting (i.e. LASSO regression) by building a toy example in R, read on!

Learning Data Science: The Supermarket Knows You are Pregnant Before Your Dad does!


A few months ago, I posted about market basket analysis (see Customers who bought…). In this post, we will see another form of it, done with Logistic Regression, so read on…