Will I get my Money back? Credit Scoring with OneR


More and more decisions by banks on who gets a loan are being made by artificial intelligence. The terms being used are credit scoring and credit decisioning.

They base their decisions on models whether the customer will pay back the loan or will default, i.e. determine their creditworthiness. If you want to learn how to build such a model in R yourself (with the latest R ≥ 4.1.0 syntax as a bonus), read on!
Continue reading “Will I get my Money back? Credit Scoring with OneR”

Recidivism: Identifying the Most Important Predictors for Re-offending with OneR


In 2018 the renowned scientific journal science broke a story that researchers had re-engineered the commercial criminal risk assessment software COMPAS with a simple logistic regression (Science: The accuracy, fairness, and limits of predicting recidivism).

According to this article, COMPAS uses 137 features, the authors just used two. In this post, I will up the ante by showing you how to achieve similar results using just one simple rule based on only one feature which is found automatically in no-time by the OneR package, so read on!
Continue reading “Recidivism: Identifying the Most Important Predictors for Re-offending with OneR”

Cupid’s Arrow: How to Boost your Chances at Speed Dating!


During our little break, Valentine’s Day was celebrated. Yet for many, it was a depressing day because they are single and are looking for love.

Speed dating is a popular format (in times of Covid-19 also in virtual form) to meet many different potential soul mates in a short period of time. If you want to learn which factors determine “getting to the next round”, read on!
Continue reading “Cupid’s Arrow: How to Boost your Chances at Speed Dating!”

OneR in Medical Research: Finding Leading Symptoms, Main Predictors and Cut-Off Points


We already had a lot of examples that make use of the OneR package (install free from CRAN), which can be found in the respective Category: OneR.

Here we will give you some concrete examples from the area of research on Type 2 Diabetes Mellitus (DM) to show that the package is especially well suited in the field of medical research, so read on!
Continue reading “OneR in Medical Research: Finding Leading Symptoms, Main Predictors and Cut-Off Points”

ZeroR: The Simplest Possible Classifier… or: Why High Accuracy can be Misleading


In one of my most popular posts So, what is AI really? I showed that Artificial Intelligence (AI) basically boils down to autonomously learned rules, i.e. conditional statements or simply, conditionals.

In this post, I create the simplest possible classifier, called ZeroR, to show that even this classifier can achieve surprisingly high values for accuracy (i.e. the ratio of correctly predicted instances)… and why this is not necessarily a good thing, so read on!
Continue reading “ZeroR: The Simplest Possible Classifier… or: Why High Accuracy can be Misleading”

The One Question you should ask your Partner before Marrying!


Valentine’s Day is around the corner and love is in the air… but, shock horror, nearly every second marriage ends in a divorce! Unfortunately, I can tell you first hand that this is an experience you’d rather not have. In this post, we see how data science, in the form of the OneR package and an interesting new data set, might potentially help you to avoid that tragedy… so read on!
Continue reading “The One Question you should ask your Partner before Marrying!”

Data Science on Rails: Analyzing Customer Churn

Customer Relationship Management (CRM) is not only about acquiring new customers but especially about retaining existing ones. That is because acquisition is often much more expensive than retention. In this post, we learn how to analyze the reasons of customer churn (i.e. customers leaving the company). We do this with a very convenient point-and-click interface for doing data science on top of R, so read on!
Continue reading “Data Science on Rails: Analyzing Customer Churn”

Learning R: The Ultimate Introduction (incl. Machine Learning!)


There are a million reasons to learn R (see e.g. Why R for Data Science – and not Python?), but where to start? I present to you the ultimate introduction to bring you up to speed! So read on…
Continue reading “Learning R: The Ultimate Introduction (incl. Machine Learning!)”

Learning Data Science: Predicting Income Brackets


As promised in the post Learning Data Science: Modelling Basics we will now go a step further and try to predict income brackets with real world data and different modelling approaches. We will learn a thing or two along the way, e.g. about the so-called Accuracy-Interpretability Trade-Off, so read on…
Continue reading “Learning Data Science: Predicting Income Brackets”

So, what is AI really?


One of the topics that is totally hyped at the moment is obviously Artificial Intelligence or AI for short. There are many self-proclaimed experts running around trying to sell you the stuff they have been doing all along under this new label.

When you ask them what AI means you will normally get some convoluted explanations (which is a good sign that they don’t get it themselves) and some “success stories”. The truth is that many of those talking heads don’t really know what they are talking about, yet happen to have a friend who knows somebody who picked up a book at the local station bookshop… ok, that was nasty but unfortunately often not too far away from the truth.

So, what is AI really? This post tries to give some guidance, so read on!
Continue reading “So, what is AI really?