We have already seen many examples that make use of the OneR package (free to install from CRAN); they can be found in the respective Category: OneR.
Here we will give you some concrete examples from research on Type 2 Diabetes Mellitus (DM) to show that the package is especially well suited to medical research, so read on!
Continue reading “OneR in Medical Research: Finding Leading Symptoms, Main Predictors and Cut-Off Points”
We already covered Neural Networks and Logistic Regression in this blog.
If you want to gain an even deeper understanding of the fascinating connection between these two popular machine learning techniques, read on!
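To give a first taste of that connection, here is a minimal sketch in Python (not taken from the post; the function names, weights and inputs are made up for illustration). It assumes the usual textbook view: a single neuron with a sigmoid activation computes exactly the same quantity as a logistic regression model.

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """One neuron: weighted sum of the inputs plus bias, then sigmoid activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def logistic_regression(inputs, coefficients, intercept):
    """Logistic regression: inverse logit of the linear predictor."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, inputs))
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical parameters and one hypothetical observation.
weights = [0.8, -1.2]
bias = 0.3
observation = [1.5, 0.4]

p_neuron = neuron(observation, weights, bias)
p_logit = logistic_regression(observation, weights, bias)
print(p_neuron, p_logit)  # both calls yield the same probability
```

Under this reading the weights of the neuron play the role of the regression coefficients and the bias plays the role of the intercept.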
Continue reading “Logistic Regression as the Smallest Possible Neural Network”
Forecasting the future has always been one of man’s biggest desires and many approaches have been tried over the centuries. In this post we will look at a simple statistical method for time series analysis, the Autoregressive (AR) model. We will use this method to predict future sales data and will rebuild it from scratch to get a deeper understanding of how it works, so read on!
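As a rough idea of what such a model does, the following Python sketch (not the code from the post) fits the simplest case, an AR(1) model x[t] = c + phi * x[t-1], by ordinary least squares. The helper names and the toy "sales" series are assumptions made up for illustration; the series is generated with c = 2 and phi = 0.5 so that the fit can be checked.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x_prev = series[:-1]   # values at time t-1
    x_next = series[1:]    # values at time t
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi

def forecast(series, c, phi, steps):
    """Iterate the fitted equation forward, feeding each prediction back in."""
    predictions = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions

# Toy series, generated exactly by x[t] = 2 + 0.5 * x[t-1], starting at 10.
sales = [10.0]
for _ in range(9):
    sales.append(2 + 0.5 * sales[-1])

c, phi = fit_ar1(sales)
next_three = forecast(sales, c, phi, steps=3)
print(c, phi, next_three)
```

An AR model of higher order works the same way but regresses each value on several previous values instead of just one.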
Continue reading “Time Series Analysis: Forecasting Sales Data with Autoregressive (AR) Models”
We all know the classic sci-fi trope of intelligent machines becoming conscious and all the potential ramifications that could follow from there (free will, fighting their human creators, ethical dilemmas, and so forth). Now, is this a realistic scenario? As a researcher in the area of AI (see e.g. “So, what is AI really?”) with a penchant for philosophy, I share my thoughts here with you, so read on!
Continue reading “Will AI become conscious any time soon?”
In one of my most popular posts, “So, what is AI really?”, I showed that Artificial Intelligence (AI) basically boils down to autonomously learned rules, i.e. conditional statements, or simply conditionals.
In this post, I create the simplest possible classifier, called ZeroR, to show that even this classifier can achieve surprisingly high values for accuracy (i.e. the ratio of correctly predicted instances)… and why this is not necessarily a good thing, so read on!
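The idea can be sketched in a few lines of Python (not the code from the post; the function names and the toy labels are made up for illustration). The sketch assumes the usual definition of ZeroR: ignore all features and always predict the most frequent class.

```python
from collections import Counter

def zero_r_fit(labels):
    """Return the most frequent label; ZeroR ignores all features."""
    return Counter(labels).most_common(1)[0][0]

def accuracy(predicted, actual):
    """Ratio of correctly predicted instances."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Toy, made-up data: 95 healthy cases and 5 sick ones.
labels = ["healthy"] * 95 + ["sick"] * 5

majority = zero_r_fit(labels)
predictions = [majority] * len(labels)

# How many of the sick cases does the classifier actually find?
sick_found = sum(p == "sick" and a == "sick" for p, a in zip(predictions, labels))

print(majority, accuracy(predictions, labels), sick_found)
```

On such imbalanced toy data the accuracy is 95%, yet not a single sick case is detected, which is why accuracy on its own can be misleading.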
Continue reading “ZeroR: The Simplest Possible Classifier, or Why High Accuracy can be Misleading”
One widely used graphical plot to assess the quality of a machine learning classifier or the accuracy of a medical test is the Receiver Operating Characteristic curve, or ROC curve. If you want to gain an intuition for ROC curves and see how easily they can be created with base R, read on!
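For a quick intuition, here is a minimal sketch of how the points of such a curve can be computed. It is written in Python rather than base R and is not the code from the post; the function names, scores and labels are made up for illustration. Each distinct score serves once as the decision threshold, and for each threshold we record the false positive rate and the true positive rate.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, from the strictest to the loosest threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical classifier scores and true classes (1 = positive, 0 = negative).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
print(points)
print(auc(points))
```

Plotting the second coordinate against the first gives the ROC curve; the closer the curve hugs the top-left corner, the better the classifier separates the two classes.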
Continue reading “Learning Data Science: Understanding ROC Curves”
Valentine’s Day is around the corner and love is in the air… but, shock horror, nearly every second marriage ends in a divorce! Unfortunately, I can tell you first hand that this is an experience you’d rather not have. In this post, we see how data science, in the form of the OneR package and an interesting new data set, might potentially help you to avoid that tragedy… so read on!
Continue reading “The One Question you should ask your Partner before Marrying!”
We already covered the so-called Accuracy-Interpretability Trade-Off, which states that oftentimes the more accurate the results of an AI are, the harder it is to interpret how it arrived at its conclusions (see also: Learning Data Science: Predicting Income Brackets).
This is especially true for Neural Networks: while often delivering outstanding results, they are basically black boxes and notoriously hard to interpret (see also: Understanding the Magic of Neural Networks).
There is a hot new area of research that aims to make black-box models interpretable, called Explainable Artificial Intelligence (XAI). If you want to gain some intuition on one such approach (called LIME), read on!
Continue reading “Explainable AI (XAI)… Explained! or How to whiten any Black Box with LIME”
There seems to be some revolution going on in the R sphere… people seem to be jumping at what is commonly known as the tidyverse, a collection of packages developed and maintained by the Chief Scientist of RStudio, Hadley Wickham.
In this post, I explain what the tidyverse is and why I resist using it, so read on!
Continue reading “Why I don’t use the Tidyverse”
As we have already seen in former posts, simple methods can be surprisingly successful in yielding good results (see e.g. Learning Data Science: Predicting Income Brackets or Teach R to read handwritten Digits with just 4 Lines of Code).
If you want to learn how some simple mathematics, known as Naive Bayes, can help you find out the sentiment of texts (in this case movie reviews), read on!
Continue reading “Learning Data Science: Sentiment Analysis with Naive Bayes”