It’s a widely accepted notion that money influences happiness, a concept famously associated with Nobel laureate Daniel Kahneman, who purportedly demonstrated that emotional wellbeing increases with income but plateaus beyond an annual threshold of about $75,000.

This idea has permeated both academic circles and popular media, reinforcing the belief that there’s a direct correlation between financial prosperity and happiness. But how accurate is this belief when we scrutinize the data more closely? To find out, read on!

You can also watch the video for this post (in German):

Recent research, indeed the last paper ever published by Kahneman, attempts to delve deeper into this relationship, suggesting that the connection between income and happiness is real and relevant. However, a closer examination of the publicly available data tells a different story.

My own analysis reveals a Pearson correlation coefficient of just 0.07 between wellbeing and income, indicating a very weak relationship despite its statistical significance:

```r
data <- read.csv("Data/Income_and_emotional_wellbeing_a_conflict_resolved.csv")
data$income <- as.integer(data$income)
data$income_factor <- factor(data$income, levels = sort(unique(data$income)))
data$income_1000 <- data$income / 1000

cor.test(data$wellbeing, data$income)
## 
##  Pearson's product-moment correlation
## 
## data:  data$wellbeing and data$income
## t = 13.293, df = 33389, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.06187719 0.08321620
## sample estimates:
##      cor 
## 0.072555
```

One of the more subtle lessons of academic research is that a relationship between two variables can be statistically highly significant while being practically useless because the effect is so minuscule. Paradoxically, the more data points you have, the higher the chances that you will find something *statistically significant* that has no *practical significance*.
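This pitfall is easy to demonstrate with a small simulation of my own (hypothetical data, not the study’s): two variables with a true correlation of only about 0.03 still produce a vanishingly small p-value once the sample is large enough:

```r
set.seed(123)  # for reproducibility
n <- 100000               # a large sample, as in big survey data
x <- rnorm(n)
y <- 0.03 * x + rnorm(n)  # true correlation of only about 0.03
ct <- cor.test(x, y)
ct$estimate  # a negligible correlation...
ct$p.value   # ...that is nevertheless highly "significant"
```

Statistical significance here merely tells us that the correlation is unlikely to be exactly zero, not that it matters.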

In this case we have more than 33,300 data points and while there is a tiny increase in happiness with greater income, the effect is so slight that its real-world implications are negligible. Indeed, the difference between the medians of happiness at household incomes of $15,000 and $250,000 is only about five points on a 100-point scale!

To put the observed effect into a more relatable context, consider this: the difference in happiness resulting from an approximately fourfold difference in income is roughly equivalent to the happiness boost one might feel over a typical weekend! This comparison starkly illustrates the insignificance of income effects relative to everyday life experiences.

Yet this research manages to persuade us that there is indeed something substantial going on. How is this achieved in such studies? The devil is in the details, or in this case, the methodology. Three statistical choices stand out as particularly problematic: the logarithmic transformation of income, the use of z-scores, and the reporting of averages without dispersion measures for wellbeing:

- **Logarithmic transformation**: By transforming income using a logarithmic scale, the data suggest a linear relationship where none exists. This transformation masks the reality of diminishing returns, where increases in income result in progressively smaller gains in happiness. Such transformations, while often applied in practice to reduce skewness, can present a distorted view of the underlying data. Apart from that, how should one interpret “log income” anyway?
- **Z-scores**: The application of z-scores is another area where the graphical representation can be misleading. Z-scores standardize data points and effectively cut off the y-axis of the original data, which can visually exaggerate minor differences. When we depict wellbeing scores on the complete 0-100 scale, the supposed effect of income on happiness nearly vanishes, revealing a much less compelling story.
- **Averages without dispersion measures**: While using median values (or means) is not problematic per se, it can mask the inherent dispersion of the data, which could be indicated by interquartile ranges (IQR), standard deviations, variances, or confidence intervals. Especially when the data are extremely dispersed, as is the case here with wellbeing, interpreting results and drawing meaningful conclusions is challenging without proper context.

The plots I’ve created from the original data starkly illustrate these points. I often start my own data analyses with a scatter plot but in this case, I first thought that I made a mistake or got the wrong data:

```r
plot(data$wellbeing ~ data$income_1000,
     main = "Scatterplot of Wellbeing Across Income Levels",
     xlab = "Income (thousands of $)", ylab = "Wellbeing")
grid()
```

This plot shows a dense cluster of data points that scatter broadly across the graph, displaying no apparent trend or meaningful pattern linking income to wellbeing.

But worry not: by making use of the three statistical techniques above, it is quite easy to create plots like the ones shown in the pertinent literature:

```r
data$log_income <- log(data$income)  # define log income (natural log assumed)
mean_well_being_zscore <- aggregate(scale(wellbeing) ~ log_income, data = data, median)

plot(mean_well_being_zscore,
     main = "Z-Score of Median of Wellbeing Across Log-Income Levels",
     xlab = "Log Income", ylab = "Wellbeing (z-scored)", pch = 16)
grid()

LinReg <- lm(V1 ~ log_income, data = mean_well_being_zscore)
LinReg
## 
## Call:
## lm(formula = V1 ~ log_income, data = mean_well_being_zscore)
## 
## Coefficients:
## (Intercept)   log_income  
##    -1.08108      0.09396

abline(LinReg)
```

Mirroring Figure 1B from the above paper, this plot suggests a clear relationship between rising levels of income and resulting wellbeing. Alas, upon closer scrutiny, this proves to be more a product of clever statistical handling than any relevant effect.

Now let us have a look at some less sophisticated plots to see what is really going on… or better, “isn’t going on”. We recreate the same plot but this time without the log transformation of income and without z-scoring the wellbeing values:

```r
mean_well_being <- aggregate(wellbeing ~ income_1000, data = data, median)
plot(mean_well_being,
     main = "Median of Wellbeing Across Income Levels",
     xlab = "Income (thousands of $)", ylab = "Wellbeing", pch = 16)
grid()
```

```r
plot(mean_well_being,
     main = "Median of Wellbeing Across Income Levels",
     xlab = "Income (thousands of $)", ylab = "Wellbeing",
     ylim = c(0, 100), pch = 16)
grid()
```

The first version of this plot artificially caps the y-axis, suggesting a strong, meaningful relationship; the second version, showing the full axis, demonstrates the near absence of any real effect. Here the manipulation of the y-axis becomes even more apparent because we can now see the actual wellbeing values instead of the hard-to-interpret z-scored ones.

And these plots do not deceive the eye: for example, nearly doubling one’s income from $35,000 to $65,000 does not even yield a statistically significant difference in wellbeing:

```r
t.test(data$wellbeing[data$income == 35000], data$wellbeing[data$income == 65000])
## 
##  Welch Two Sample t-test
## 
## data:  data$wellbeing[data$income == 35000] and data$wellbeing[data$income == 65000]
## t = -1.8237, df = 4923.9, p-value = 0.06825
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.32549954  0.04789018
## sample estimates:
## mean of x mean of y 
##  62.37602  63.01482
```

Or even more extreme, the effect of more than quadrupling one’s income from $137,500 to $625,000 isn’t statistically significant either:

```r
t.test(data$wellbeing[data$income == 137500], data$wellbeing[data$income == 625000])
## 
##  Welch Two Sample t-test
## 
## data:  data$wellbeing[data$income == 137500] and data$wellbeing[data$income == 625000]
## t = -1.6919, df = 533.15, p-value = 0.09125
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -2.4851637  0.1852509
## sample estimates:
## mean of x mean of y 
##  64.19116  65.34111
```

In the last chart we now also add a measure of dispersion in the form of boxplots. Here it becomes even clearer that, across income levels, wellbeing hardly changes and is extremely dispersed at that:

```r
boxplot(wellbeing ~ income, data = data,
        main = "Boxplot of Wellbeing Across Income Levels",
        xlab = "Income ($)", ylab = "Wellbeing",
        col = rainbow(length(unique(data$income))))
```

To be fair, the authors do briefly address some of these criticisms, but those discussions are buried deep within the paper and serve mainly to downplay their significance. It’s important to remember that “lying with statistics” doesn’t necessarily involve outright falsehoods; rather, it involves presenting results in a way that suggests misleading conclusions or exaggerates irrelevant findings.

As a sidenote, the possibility of reverse causality (where inherently happier individuals might earn more) should also be considered. It suggests that personal disposition (what self-proclaimed “life coaches” call “mindset” nowadays!) might drive both happiness and higher income rather than the reverse. It would be interesting to see the results with the roles of the two variables reversed. Moreover, it would be insightful to examine how *changes* in income levels affect wellbeing, as the current research only addresses the wellbeing of individuals at their *existing* income levels.

To conclude, this critique is not just about debunking a popular myth; it’s a call for greater integrity and clarity in how statistical research is conducted and reported. I would be very interested in your feedback and in whether you have encountered similar overstatements in other research.

**UPDATE April/May 2024**

My post has garnered a lot of attention on LinkedIn and especially positive feedback; some excerpts:

Professor Przemyslaw Biecek, Warsaw University of Technology:

A very nice data-driven discussion with the ‘Kahneman last paper’! Will share with my students

Adrian Olszewski, Biostatistician:

Absolute masterpiece – both side: in trying to “find a way to show something that doesn’t exist” and in revealing the sad truth 🙂 I saw several such “almost rectangular curtains” with a “jagged corner” which created spurious pattern (usually increasing), while actually nothing happened.

Adam Tarnawski, Senior Advanced Analytics Expert, Mercer:

That’s a good representation how data can be manipulated. Theres’ a bigger issue of replicability in social science and efforts how to address that. […] very good job. There should be an obligatory review by a statistician as part of the peer review process.

Dr Björn Walther, Consultant, YouTuber:

Well done! Straight to the point of how a simple manipulation of the y-axis of a graph and a few log transformations / z-standardizations will give the result one was (not) looking for.

Jan Eggers, Digital Journalist at Hessischer Rundfunk:

Already shared – and: I will definitely show this to my data journalism students today.

Dr Paul Bilokon, Quant of the Year 2023, Visiting Professor, Imperial College:

Interesting analysis.

Michael Srb, CFA, Expert in Quantitative Finance:

Great Article as always! With these type of questions you can go wrong in so many ways… especially after transforming data.

Robert D. Brown III, Author of “Business Case Analysis with R”:

This is a really interesting analysis. Thanks for sharing this. […] your article kicked off more focused thinking and reflection than any other LinkedIn post I’ve read recently. I really do appreciate that.

Bobby Markov, Senior Data Analyst, Ph.D. candidate:

I just love the Boxplot of Wellbeing Across Income Levels that was shown, such a clear message.

In an era where artificial intelligence (AI) is increasingly permeating various aspects of our lives, the academic world is also faced with the challenge of dealing with this rapid technological development. This is particularly true regarding final theses and term papers, raising the question of how we, as educational institutions, should handle the use of foundation models like ChatGPT, Google Gemini, and other large language models (LLMs).

It would be illusory and counterproductive to ban the use of these technologies in academic works. Such a prohibition would not only be out of touch with reality but would also deprive our students of valuable learning experiences.

The use of AI detectors as a means to identify AI-generated texts cannot provide a reliable solution either. These detectors are often unreliable and frequently lead to false-positive results. Studies have shown that especially students with a migration background are often unfairly disadvantaged, with their texts wrongly marked as AI-generated. Such tools contribute to discrimination and undermine trust in the fairness of academic evaluations.

Instead, we should encourage our students to use AI confidently, creatively, and responsibly.

However, this requires an adjustment in how we select and evaluate topics for theses. Lecturers should adjust the choice of topics and the required level of work upwards, making the use of AI not only possible but indeed necessary. What can be created by simply pressing a button on ChatGPT will not be highly valued or rewarded in the future – nor should it be.

Rather than unrealistic prohibitions, we should instead incentivize students to disclose their use of AI through detailed documentation, showing how and where they have used AI in their work. A creative and, yes, intelligent use of AI can and should even lead to a better grade, as it reflects a skill that will be indispensable in the future workplace.

Finally, I see the need to place greater emphasis on the oral defense of theses. In an age where the boundaries between human and machine intelligence are increasingly blurring, evaluating solely based on written work can be misleading. An oral defense provides insight into the students’ deeper understanding and critical engagement with their topic.

Overall, it is time we view AI not as a threat but as an opportunity to improve the quality of our academic education and prepare our students for a future where AI will play a central role.

*The author, Holger K. von Jouanne-Diedrich, Ph.D., is Founding Professor of the Competence Centre for Artificial Intelligence at the Technical University Aschaffenburg, Germany, and Fellow for Artificial Intelligence at the Think Tank Zukunft-Fabrik.2050 of the Alumni Organization of the University of St. Gallen, Switzerland.*

In the realm of business and leadership, one statistical phenomenon often goes unrecognized yet significantly influences our understanding of performance and success. This is the concept of *reversion to the mean*.

You can also watch the following analysis as a video (in German):

Reversion to the mean refers to the statistical tendency for extreme or unusual results to be followed by more typical or average outcomes. In a business context, this might mean that an exceptionally high or low performance period is likely to be followed by results closer to the average. Understanding this concept is crucial for fair and accurate performance assessment.

One of the most telling historical examples of misinterpreting reversion to the mean comes from the U.S. military’s pilot training during World War II. Instructors observed that harsh criticism of a pilot’s poor performance often led to better results in subsequent sessions, while praise for good performance frequently preceded a decline. This led to the mistaken belief that punishment was more effective than praise.

In reality, what the instructors witnessed was a classic case of reversion to the mean. A pilot performing poorly on one day (below their average) was statistically likely to perform closer to their average (better) the next day, regardless of the criticism. Similarly, a pilot performing exceptionally well (above their average) was likely to return to more typical performance levels subsequently.
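A hypothetical simulation makes this concrete: assume every pilot has the same true ability of 70 points, and daily performance is just ability plus random noise. Then merely selecting pilots by a bad (or good) first day guarantees an apparent improvement (or decline) on the second day:

```r
set.seed(42)  # for reproducibility
n_pilots <- 10000
# two independent flight days around the same true ability of 70
day1 <- rnorm(n_pilots, mean = 70, sd = 10)
day2 <- rnorm(n_pilots, mean = 70, sd = 10)
# pilots criticized after a bad first day "improve" on average...
mean(day2[day1 < 60]) - mean(day1[day1 < 60])
# ...while pilots praised after a good first day "decline" on average
mean(day2[day1 > 80]) - mean(day1[day1 > 80])
```

Nothing about the criticism or praise caused these changes; selecting on extreme first-day scores alone produces them.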

Let us put this in some broader context: Douglas McGregor’s Theory X and Theory Y present two contrasting views of workforce motivation. Theory X posits that employees are inherently lazy and require strict supervision and control, while Theory Y suggests that employees are self-motivated and seek responsibility:

| Aspect | Theory X | Theory Y |
|---|---|---|
| View of Human Nature | Assumes people inherently dislike work and avoid it if possible. | Believes people view work as natural and potentially enjoyable. |
| Motivation | People need to be coerced, controlled, or threatened with punishment. | People are self-motivated and can find satisfaction in work. |
| Leadership Style | Authoritarian; relies on strict supervision and a rigid hierarchy. | Participative; encourages collaboration and decision-making. |
| Employee Potential | Assumes people prefer to be directed and avoid responsibility. | Believes people seek responsibility and have creativity to offer. |
| Work Environment | Focuses on rules, tight control, and close supervision. | Fosters a climate of trust, creativity, and autonomy. |

The misinterpretation of reversion to the mean can erroneously bolster Theory X. When a manager sees performance improvement following punitive measures, they might attribute this to their authoritarian approach, not realizing that this improvement is part of a natural statistical trend:

```r
# Defining the performance scores based on a sawtooth pattern
performance_scores <- c(70, 80, 70, 60, 70, 80, 70, 60, 70, 80)

# Setting up the plot
plot(performance_scores, type = 'o', col = 'blue', pch = 19, ylim = c(55, 85),
     xlab = 'Performance Assessment', ylab = 'Performance Score',
     main = 'Praise and Punishment: Reversion to the Mean')

# Adding a line for the true average performance
abline(h = 70, col = 'blue', lty = 2)

# Adding labels for praise (above 70) and punishment (below 70)
for (i in 1:length(performance_scores)) {
  if (performance_scores[i] > 70) {
    text(i, performance_scores[i], 'Praise', pos = 3, col = 'green')
  } else if (performance_scores[i] < 70) {
    text(i, performance_scores[i], 'Punish', pos = 1, col = 'red')
  }
}
```

The above plot demonstrates how performance fluctuates around a mean, leading to potential misinterpretations. Managers observing this pattern might mistakenly attribute the improvements to their strict measures (punishment) and the declines to leniency (praise), reinforcing a Theory X approach.

This misinterpretation is a form of the common logical fallacy known as “post hoc ergo propter hoc,” which is Latin for “after this, therefore because of this.” This fallacy occurs when someone assumes that because one event followed another, the first event must have caused the second.

Another intriguing manifestation of reversion to the mean is the so-called “Time Magazine Cover Curse”. This phenomenon suggests that individuals, companies, or stocks featured on the cover of Time (or similar publications) often experience a subsequent decline. The curse is actually a classic example of reversion to the mean.

Entities that make the cover usually do so because of extraordinary success or notoriety, levels which are inherently difficult to sustain. As performance reverts to the average, it appears as though the cover feature heralded a downfall. Nobel laureate Paul Krugman once joked, “Whom the Gods would destroy, they first put on the cover of Business Week.” Some technical analysts even use what they call a “magazine cover indicator” as a contrarian trading signal.

Speaking of investing: of course, reversion to the mean is at play here too! Just look at the best-performing funds of one year, only to find them in the average (or worse) section the next: that is reversion to the mean in action!
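A toy simulation with made-up numbers (a small persistent skill component plus a much larger yearly luck component) shows why chasing last year’s winners is futile:

```r
set.seed(1)  # for reproducibility
n_funds <- 1000
skill <- rnorm(n_funds, mean = 0, sd = 1)          # small persistent component
year1 <- skill + rnorm(n_funds, mean = 0, sd = 5)  # returns are mostly luck
year2 <- skill + rnorm(n_funds, mean = 0, sd = 5)
top <- order(year1, decreasing = TRUE)[1:100]      # last year's top 10%
# average percentile of the former winners one year later: close to 0.5
mean(rank(year2)[top]) / n_funds
```

Because luck dominates skill in this setup, last year’s top decile lands, on average, only slightly above the middle of the pack the following year.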

Understanding reversion to the mean is crucial for leaders, managers, and even investors. It teaches us to look beyond immediate outcomes and consider longer-term trends. It also warns against the hasty attribution of success or failure to specific actions or strategies. Recognizing the influence of this statistical principle can lead to better assessment of performance and success and fairer, more effective management practices.

The geopolitics of the Middle East has always been complex, with its share of conflicts and political unrest. Numerous theories have been proposed to dissect the underpinnings of the region’s political instability.

One such theory, known as the Youth Bulge Theory, asserts that a high proportion of young people within a population can lead to political instability and even violence. This theory could provide an illuminating perspective on the dynamics of the Middle East conflicts.

If you want to understand this most important ingredient of the ongoing conflict, read on!

You can also watch the following analysis as a video (in German):

The Youth Bulge Theory posits that countries with a high percentage of young individuals are more susceptible to political instability and violence. The rationale behind this theory is that a large cohort of young people can create increased competition for limited resources like jobs and housing. If these needs remain unmet, the youth may become a fertile ground for political agitation, or worse, recruitment into extremist factions.

The Youth Bulge Theory was already extensively discussed on Learning Machines and used to analyse Afghanistan, Russia, and China.

To validate the Youth Bulge Theory, we can turn to data. By analyzing the proportion of young people against a political stability indicator across different countries, we might discern a pattern to substantiate this theory.

We’ve employed data from the World Development Indicators database, specifically focusing on the proportion of young people and a political stability indicator:

```r
library(WDI) # install from CRAN

age0_14 <- WDI(indicator = "SP.POP.0014.TO.ZS", start = 2021, end = 2021) # proportion of 0-14 year olds
pol_stab <- WDI(indicator = "PV.EST", start = 2021, end = 2021)           # political stability indicator

data <- merge(age0_14, pol_stab)[c(1, 5, 6)] |> na.omit()
colnames(data) <- c("country", "age0_14", "pol_stab")

plot(data$age0_14, data$pol_stab, col = "darkgrey",
     xlab = "Proportion of young people (in %)", ylab = "Political stability",
     main = "Youth Bulge Theory")
abline(v = 23.2, col = "red") # cut-off value
abline(h = 0, col = "red")

lm_out <- lm(pol_stab ~ age0_14, data = data)
abline(lm_out, col = "black", lwd = 3)

# highlight selected countries
country <- "West Bank and Gaza"
points(data$age0_14[data$country == country], data$pol_stab[data$country == country], col = "darkgreen", lwd = 8)
text(data$age0_14[data$country == country], data$pol_stab[data$country == country], labels = country, pos = 4)

country <- "Israel"
points(data$age0_14[data$country == country], data$pol_stab[data$country == country], col = "blue", lwd = 8)
text(data$age0_14[data$country == country], data$pol_stab[data$country == country], labels = country, pos = 4)

country <- "Germany"
points(data$age0_14[data$country == country], data$pol_stab[data$country == country], col = "orange", lwd = 8)
text(data$age0_14[data$country == country], data$pol_stab[data$country == country], labels = country, pos = 4)
```

In the plot, each point represents a country, with its position determined by the proportion of young people and its political stability score. The red lines indicate the threshold where the situation toggles between stability and instability. A black line, derived from a linear regression analysis, depicts the statistically highly significant strong negative correlation of over 60% between the proportion of young people and political stability.

While the original Youth Bulge Theory uses the age bracket of 15 to 24, we use the age bracket between 0 and 14 as a proxy. The reason is data availability, but both brackets are highly correlated, and demographic structures are stable over longer periods of time.

A closer look at three distinct data points – West Bank and Gaza, Israel, and Germany – reveals an intriguing narrative. West Bank and Gaza, with 39% of its population being young, shows a political stability score of -1.84, indicating a highly unstable political environment. Israel, with a slightly lesser youth proportion of 28%, has a political stability score of -1.06, still signifying instability but to a lesser degree than the West Bank and Gaza. On the other end of the spectrum, Germany, with a youth proportion of 14%, exhibits a political stability score of 0.76, indicating a stable political environment.

The data-driven analysis aligns with the Youth Bulge Theory, suggesting that the high proportion of young individuals could be contributing to the political instability witnessed in the Middle East. However, it’s crucial to acknowledge that this theory doesn’t operate in a vacuum. The region’s political dynamics are influenced by a myriad of factors including historical grievances, religious tensions, and geopolitical interests.

Moreover, comparing the Middle East situation to historical European conflicts reveals both similarities and disparities. For instance, the long-standing rivalry between Germany and France, which waned as the demographic structure evolved, hints at a hopeful prospect for the Middle East. Yet, the unique intricacies of the Middle East conflicts necessitate a nuanced approach in drawing parallels and deriving solutions.

In summary, the Youth Bulge Theory provides a compelling lens through which to analyze the ongoing turmoil in the region. While it’s crucial to recognize that the youth bulge is not the sole driver of conflict, its impact on political stability cannot be underestimated.

**DISCLAIMER**

*The views and opinions expressed in this blog post are those of the author and do not necessarily reflect the official policy or position of any organization. This blog post is for informational purposes and is an analysis based on available data, it is not intended to be an exhaustive explanation. The subject matter is highly sensitive and complex, and readers are encouraged to consider multiple perspectives when evaluating the issues.*

Election polls play a crucial role in predicting the outcome of elections and shaping public opinion. However, it’s important to understand that the results of any single poll should be taken with a grain of salt.

Many polls ask only about 1,000 people about their political preferences, which is quite small compared to the often millions of voters. So, how reliable are those results? Or in other words, how confident can we be in them? To understand some of these intricacies, read on!

You can also watch the video for this post (in German):

In this blog post, we will explore what *confidence intervals* are, why they are important in election polling, and how to calculate them using an example with three parties, one with 50% of the vote, one with 25% and one with 5%. We’ll also use R code to simulate the *margin of error* for each party and demonstrate the difference in uncertainty between the three. By the end of this post, you’ll have a better understanding of how to interpret election polls and the limitations of political forecasting.

Confidence intervals have a rather unintuitive technical definition, but the general idea is that they are a way to express the uncertainty surrounding an estimate. For example, in the context of election polls, a poll might estimate that a certain party will receive x% of the vote, but with a margin of error of ±y%. This means that if the poll were repeated many times, and a confidence interval calculated from each sample, we would expect 95% (i.e. the confidence level) of these intervals to contain the party’s true support.

So, intuitively speaking, the confidence interval gives us a range of values that we are fairly confident contains the true value of what we are trying to estimate. In this case, the true level of support for a party. The wider the confidence interval, the less certain we are about the estimate. The narrower the confidence interval, the more certain we are.

It’s important to note that a confidence interval does not guarantee that the true value will be within the range, just that 95% of all confidence intervals that were calculated this way will cover the true value.
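This interpretation can be checked by simulation (my own sketch, using the usual normal approximation for the interval): we draw many polls from a known true support of 25% and count how many of the resulting 95% intervals actually contain the truth:

```r
set.seed(123)  # for reproducibility
p_true <- 0.25  # the (in reality unknown) true support
n <- 1000       # respondents per poll
covered <- replicate(10000, {
  p_hat <- mean(rbinom(n, size = 1, prob = p_true))
  moe <- 1.96 * sqrt(p_hat * (1 - p_hat) / n)
  p_true >= p_hat - moe && p_true <= p_hat + moe
})
mean(covered)  # close to the nominal 95%
```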

Another kind of interval, the *Bayesian credible interval*, directly states that the estimated parameter lies within it with, for example, 95% probability. Interestingly enough, while the two interpretations are quite different (and the Bayesian one is admittedly more intuitive), in our specific case the results are not too far apart. Therefore, the more intuitive, but commonly misinterpreted, reading of frequentist confidence intervals wouldn’t be too far off in this instance.
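To see how close the two can be in our setting, compare a 95% confidence interval with a 95% credible interval (the latter under a flat Beta(1, 1) prior, a common textbook choice) for a hypothetical poll in which 250 of 1,000 respondents support a party:

```r
x <- 250; n <- 1000  # hypothetical poll result
p_hat <- x / n
moe <- 1.96 * sqrt(p_hat * (1 - p_hat) / n)
c(p_hat - moe, p_hat + moe)               # frequentist 95% confidence interval
qbeta(c(0.025, 0.975), x + 1, n - x + 1)  # Bayesian 95% credible interval
```

Both intervals agree to about two decimal places here, which is why the intuitive reading is not too far off in this case.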

Now, for another important point: the relative margin of error (relative with respect to the support of the party) would increase if you’re estimating support for a party with 5% of the vote compared to a party with 50% of the vote. This is because the margin of error is affected by the sample size and the level of variability in the data. When the sample size is small or the level of variability is high, the relative margin of error will be larger.

In the case of a party with 5% of the vote, the sample size of voters supporting that party will be smaller, leading to a higher level of variability. This, in turn, will result in a larger relative margin of error, meaning the range of possible values for the estimate of the party’s support will be wider.

To put it another way, it’s easier to estimate the level of support for a party with 50% of the vote because there are many more voters to sample from and the results are less variable. On the other hand, it’s harder to estimate the level of support for a party with 5% of the vote because there are fewer voters to sample from and the results are more variable.

So, if all else is equal (same level of confidence, same sample size, etc.), the relative margin of error would be larger for a party with 5% of the vote compared to a party with 50% of the vote.
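This can be seen directly from the closed-form margin of error for a proportion, 1.96 · √(p(1−p)/n), even before any simulation:

```r
# margin of error at 95% confidence for a proportion p with sample size n
moe <- function(p, n = 1000) 1.96 * sqrt(p * (1 - p) / n)
moe(0.50)         # absolute margin of error for a 50% party: about 0.031
moe(0.05)         # smaller in absolute terms for a 5% party: about 0.014
moe(0.50) / 0.50  # relative margin of error: about 6%
moe(0.05) / 0.05  # ...but about 27% relative to the small party's support
```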

Let us demonstrate this effect with R by simulating 10,000 sample polls with 1,000 respondents for each party:

```r
set.seed(123) # for reproducibility

# poll_sim function
poll_sim <- function(p = 0.25, n = 1000, n_sim = 10000, conf_int = 0.95) {
  # Z-score for the given confidence level
  z_score <- qnorm(1 - (1 - conf_int) / 2)
  
  # Store the results of the simulations
  results <- rep(NA, n_sim)
  
  # Run the simulations
  for (i in 1:n_sim) {
    # Generate a sample of binary responses (0 or 1)
    sample <- rbinom(n, size = 1, prob = p)
    # Calculate the proportion of 1's (i.e., the estimate of p)
    results[i] <- mean(sample)
  }
  
  # Calculate the margin of error
  moe <- z_score * sqrt(p * (1 - p) / n)
  
  # Plot the results
  hist(results, main = paste0("Party with Proportion ", p * 100, "%"),
       xlab = "Estimated Proportion", col = "blue")
  abline(v = p, col = "red", lwd = 3)
  abline(v = p + moe, col = "green", lty = 2, lwd = 3)
  abline(v = p - moe, col = "green", lty = 2, lwd = 3)
  
  # Show margin of error
  cat("Absolute Margin of Error:", round(moe, 3), "\n")
  cat("Confi. Int. from", round(p - moe, 3), "to", round(p + moe, 3), "\n")
  cat("Relative Margin of Error:", round(moe / p, 3), "\n")
}

# Parameters
p1 <- 0.5   # True proportion of votes for party 1 (50%)
p2 <- 0.25  # True proportion of votes for party 2 (25%)
p3 <- 0.05  # True proportion of votes for party 3 (5%)
n <- 1000   # Sample size

# Number of simulations
n_sim <- 10000

# Run the simulations
poll_sim(p1)
## Absolute Margin of Error: 0.031 
## Confi. Int. from 0.469 to 0.531 
## Relative Margin of Error: 0.062
```

So, in this case, the absolute margin of error is indeed ±3%.

```r
poll_sim(p2)
## Absolute Margin of Error: 0.027 
## Confi. Int. from 0.223 to 0.277 
## Relative Margin of Error: 0.107
```

In the case of a party with a true proportion of 25% the relative margin of error becomes over 10%!

```r
poll_sim(p3)
## Absolute Margin of Error: 0.014 
## Confi. Int. from 0.036 to 0.064 
## Relative Margin of Error: 0.27
```

So, while the absolute margin of error is much smaller in this case (about half of the first case), the relative level of uncertainty is much bigger (nearly 30%)!
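The simulated margins agree with the closed-form formula for the margin of error, moe = z * sqrt(p * (1 - p) / n), which can be checked in a few lines of base R (the three vote shares and the sample size are taken from the simulation above):

```r
z <- qnorm(0.975)          # z-score for 95% confidence, about 1.96
n <- 1000                  # sample size
p <- c(0.50, 0.25, 0.05)   # true vote shares of the three parties

moe <- z * sqrt(p * (1 - p) / n)  # absolute margin of error
round(moe, 3)      # 0.031 0.027 0.014
round(moe / p, 3)  # relative margin of error: 0.062 0.107 0.270
```

Note that no simulation is needed here: the simulation's only job is to make the abstract formula visible as a sampling distribution.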

In conclusion, confidence intervals are a crucial tool in understanding the uncertainty of election polling results. Our simulation of the margin of error for three parties highlights the importance of considering the level of uncertainty when interpreting election polls. We hope this post has provided you with a clearer understanding of confidence intervals and how to apply them to election polling.

Weather and climate are words often used interchangeably in casual conversations. But when a chilly summer breeze sweeps through in July, or when unexpected rains dampen our winter holidays, the age-old debate resurfaces: “Is climate change even real?”

If you want to dive deep into the fascinating realm of chaotic systems to unravel this enigma and distinguish between weather and climate, read on!

You can also watch the video for this post (in German):

The idea to illustrate climate change with a biased Lorenz system comes from the excellent book “The Primacy of Doubt: From Quantum Physics to Climate Change” by my colleague Tim Palmer of the University of Oxford. Professor Palmer is one of the most renowned climate scientists. This blog post, including the R code, was written with the assistance of ChatGPT-4.

At the heart of our understanding of weather and climate lies chaos theory. Both weather and climate are inherently chaotic systems. This means that while they follow deterministic physical laws and equations, their behavior can be wildly unpredictable in the short term due to often extreme non-linear behaviour (the so-called butterfly effect).

Let us illustrate this with another, much simpler chaotic toy-model, the well-known Lorenz system:

dX/dt = σ (Y − X)
dY/dt = X (ρ − Z) − Y    (1)
dZ/dt = X Y − β Z

Consider the oscillations of the X variable in these equations as representing our day-to-day weather. Some days are stormy, others sunny, and predicting whether next weekend would be ideal for a beach day can be quite the gamble.
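How hopeless short-term prediction is can be made tangible with a quick base-R experiment (a sketch using a hand-rolled fourth-order Runge-Kutta integrator; the main simulation below uses the deSolve package instead): two Lorenz trajectories whose starting points differ by just one part in a hundred million soon become completely decorrelated.

```r
# Derivatives of the Lorenz system with the standard parameters
lorenz_deriv <- function(s, sigma = 10, rho = 28, beta = 8 / 3) {
  c(sigma * (s[2] - s[1]),
    s[1] * (rho - s[3]) - s[2],
    s[1] * s[2] - beta * s[3])
}

# Classic fourth-order Runge-Kutta step
rk4_step <- function(s, dt) {
  k1 <- lorenz_deriv(s)
  k2 <- lorenz_deriv(s + dt / 2 * k1)
  k3 <- lorenz_deriv(s + dt / 2 * k2)
  k4 <- lorenz_deriv(s + dt * k3)
  s + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
}

dt <- 0.01; steps <- 3000
s1 <- c(2, 20, 0)          # same initial conditions as in the post
s2 <- s1 + c(1e-8, 0, 0)   # ...perturbed by one hundred-millionth
x1 <- x2 <- numeric(steps)
for (i in 1:steps) {
  s1 <- rk4_step(s1, dt); s2 <- rk4_step(s2, dt)
  x1[i] <- s1[1]; x2[i] <- s2[1]
}

# The two "weather forecasts" start out indistinguishable and end up unrelated
matplot(cbind(x1, x2), type = "l", lty = 1, col = c("blue", "red"),
        xlab = "Time step", ylab = "X")
```

The two curves lie exactly on top of each other for a while and then diverge completely: the butterfly effect in a nutshell.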

However, when we zoom out and view these oscillations over extended periods, a pattern emerges. This averaged out behavior, where the X variable spends equal time being positive and negative, depicts our climate. It’s the long-term behavior of the atmosphere, predictable in its broader strokes, even if the day-to-day details remain elusive.

But here’s where the plot thickens. What if our equations, our system, receives an external push?

dX/dt = σ (Y − X) + C
dY/dt = X (ρ − Z) − Y + C    (2)
dZ/dt = X Y − β Z

This slight modification *C* represents external influences, akin to increased greenhouse gases in our real world. And just like that, our system’s behavior changes. Our Lorenz Attractor’s balance is disturbed.

# Load necessary library
library(deSolve)

# Parameters for the Lorenz System
parameters <- c(sigma = 10, rho = 28, beta = 8/3)

# Initial conditions
initial_conditions <- c(X = 2, Y = 20, Z = 0)

# Time span
t_span <- seq(0, 100, by = 0.01)

# Lorenz System Function (Unbiased)
Lorenz <- function(t, state, parameters) {
  with(as.list(c(state, parameters)), {
    dX <- sigma * (Y - X)
    dY <- X * (rho - Z) - Y
    dZ <- X * Y - beta * Z
    list(c(dX, dY, dZ))
  })
}

# Modified Lorenz System Function (Biased)
ModifiedLorenz <- function(t, state, parameters) {
  with(as.list(c(state, parameters)), {
    dX <- sigma * (Y - X) + C
    dY <- X * (rho - Z) - Y + C
    dZ <- X * Y - beta * Z
    list(c(dX, dY, dZ))
  })
}

# Simulate the unbiased Lorenz system
unbiased_solution <- ode(y = initial_conditions, times = t_span,
                         func = Lorenz, parms = parameters)

# Determine the best C value for biased version
best_C <- 3.920065
parameters_biased <- c(parameters, C = best_C)

# Simulate the biased Lorenz system
biased_solution <- ode(y = initial_conditions, times = t_span,
                       func = ModifiedLorenz, parms = parameters_biased)

# Plotting
# Unbiased Lorenz System
plot(unbiased_solution[ , "time"], unbiased_solution[ , "X"], type = "l",
     col = "blue", ylab = "X", xlab = "Time",
     main = "Unbiased Lorenz System")

# Biased Lorenz System
plot(biased_solution[ , "time"], biased_solution[ , "X"], type = "l",
     col = "blue", ylab = "X", xlab = "Time",
     main = paste("Biased Lorenz System with C =", best_C))

Our day-to-day weather remains as unpredictable as ever. However, the overall climate shifts. It might lean towards being warmer on average or perhaps wetter. This skew in the average is climate change in action.

So, the next time someone remarks, “It’s such a cold summer; where’s that global warming?”, remember the Lorenz Attractor. A cold day, week, or even month doesn’t negate the long-term trends of climate change. Just as a single dance move doesn’t define the entire performance, individual weather events don’t dictate the broader climate narrative.

Weather is what we experience daily, with its unpredictable twists and turns. Climate, on the other hand, is the average of these twists over decades. And as for climate change? It’s the subtle yet profound shift in this long-term dance, shaped by external forces.

Note: It’s essential to understand that while individual weather events might seem contrary to the global warming narrative, they are merely pieces of a vast and complex climatic puzzle. And although we cannot attribute individual weather events to climate change, as we gather more pieces, the bigger picture of climate change becomes undeniably clear.

The stock market is one of the most complex systems we know about. Millions of intelligent, highly competitive people (and increasingly AIs) try to outwit each other to earn as much money as possible.

In this post we build a simulation where little agents employ different trading strategies on an artificial stock market to replicate key *stylized facts* of real financial markets, so read on!

You can also watch the video for this post (in German):

This blog post is based on the working paper “A simple agent-based financial market model: direct interactions and comparisons of trading profits” by my colleague Professor Frank Westerhoff from the University of Bamberg. The model was originally partly implemented as a community model by Marek Spisak and Roman Sperka in NetLogo, a multi-agent programmable modeling environment.

It was expanded by the author and later completely transferred into the R programming language with the assistance of ChatGPT-4. The full script can be downloaded here: simple_market_model. Some parts of this post were also written with ChatGPT’s assistance.

There are basically three components of the simulation:

- traders who follow different trading strategies, i.e. technical traders and traders who employ fundamental analysis,
- communication between the traders, which can be thought of as e.g. informal talks between friends in a pub, coaching programs, and/or social media,
- the stock exchange with a clearing mechanism where supply and demand meet to derive market prices.

We will simulate this market with 500 traders for a time span of about 20 years and record the evolution of prices/returns and the proportion of traders who follow each strategy. After that we are going to analyze the data. All of these parameters can of course be adjusted.

Let us start by explaining each of those components:

At its core there are two basic types of stock analysis: *technical analysis* and *fundamental analysis*. Technical analysis in many cases tends to *amplify stock price trends* because many technical traders buy when stocks are rising and sell when they fall. Fundamental analysis on the contrary often shows *mean-reverting behaviour* because its adherents try to buy undervalued stocks and sell overvalued ones (this fascinating interplay is examined in more depth here: The Big Picture: Technical + Fundamental Analysis = Buy-and-Hold).

For the sake of simplicity, we assume that traders using fundamental analysis are able to accurately determine the true fundamental value, which we presume to be constant at 1 in our simulation.

Now, traders are not stuck with one strategy but can adapt. In our simulation traders talk to each other and when they see that one strategy performs exceptionally well, they *switch with a certain probability*.

The interactions of the traders and their (resulting) trading strategies lead to buy and sell orders which need to be cleared on the market. *Excess demand will drive prices up* while *excess supply will drive prices down*.

All of this is quite intuitive and seems to reflect some real characteristics of markets, traders, and their interactions… Now, let’s see how this world unfolds: Let the games begin!
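To make the three components above concrete, here is a heavily simplified base-R sketch of such a market. All parameter values and functional forms here are illustrative choices of my own, not Professor Westerhoff's exact specification (that one is in the full script linked above), and the strategy-switching component is left out for brevity:

```r
set.seed(123)

n_days     <- 5000
fund_value <- 0      # log fundamental value, assumed constant
adj_speed  <- 1      # price adjustment speed of the market maker
react_tech <- 0.05   # reaction strength of technical traders
react_fund <- 0.02   # reaction strength of fundamental traders
w_tech     <- 0.5    # fraction of technical traders (held fixed in this sketch)

log_price <- numeric(n_days)
log_price[1:2] <- 0.1  # start slightly above the fundamental value

for (t in 2:(n_days - 1)) {
  # technical traders extrapolate the last price move (plus noise)
  d_tech <- react_tech * (log_price[t] - log_price[t - 1]) + rnorm(1, sd = 0.01)
  # fundamental traders bet on reversion to the fundamental value (plus noise)
  d_fund <- react_fund * (fund_value - log_price[t]) + rnorm(1, sd = 0.01)
  # the market maker raises the price on excess demand, lowers it on excess supply
  excess_demand <- w_tech * d_tech + (1 - w_tech) * d_fund
  log_price[t + 1] <- log_price[t] + adj_speed * excess_demand
}

returns <- diff(log_price)
plot(log_price, type = "l", xlab = "Day", ylab = "Log price")
```

Even this stripped-down version already wanders bubble-like away from the fundamental value and is pulled back again, which is the basic mechanism behind the plots discussed next.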

The fascinating thing is that this simple market simulation shows many so-called *stylized facts* of real market dynamics. We will explain those by looking at some resulting plots of the simulation.

Our first plot exhibits the evolution of log prices:

The undulating graph takes us on a journey through a simulated time span of about 20 years, replicating the roller coaster of emotions experienced by traders in real markets. One can observe periods of rapid growth, punctuated by sudden crashes. In the first 4 years (up until around the 1000th day mark), there are two major crashes, followed by a period of relative calm, comparable to a phase from the mid-1980s to 2007 called the *Great Moderation*.

But as we can see, this calm period is no guarantee of continued stability, as demonstrated by the boom and subsequent bust around the 3000 to 3325 mark. These patterns mirror the *bubbles* and *crashes* often seen in real stock markets.

This brings us to a key concept in financial markets, known as the *random walk* hypothesis. The autocorrelation plot of returns essentially shows no significant (or only spurious) autocorrelation, similar to real-world stock market returns:

In layman’s terms, this means that the returns from one day to the next are largely unpredictable and show no clear pattern.

Another remarkable aspect shown by our simulation is the so-called *fat tails*. The distribution of returns reveals that there is more probability mass in the center and in the tails of the distribution, and less in the shoulder parts compared with the normal distribution (in red):

This behaviour is indicative of more frequent extreme price changes (either gains or losses), a characteristic often observed in actual financial markets. On the other hand, there are also many more days than expected when not very much happens.
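Fat tails can also be quantified with the kurtosis, which is 3 for the normal distribution and larger for fat-tailed ones. The following base-R comparison is a generic illustration of the concept (not output of the market simulation): returns drawn from a rescaled Student-t distribution have the same variance as normal returns but noticeably fatter tails.

```r
set.seed(1)
n <- 1e5
norm_ret <- rnorm(n)
# Student-t returns, rescaled to unit variance: same spread, fatter tails
fat_ret <- rt(n, df = 10) * sqrt((10 - 2) / 10)

# sample kurtosis: fourth central moment over squared variance
kurt <- function(x) mean((x - mean(x))^4) / (mean((x - mean(x))^2))^2
kurt(norm_ret)  # close to 3, the normal benchmark
kurt(fat_ret)   # clearly above 3: fat tails

# extreme moves (beyond 4 standard deviations) are far more frequent
mean(abs(fat_ret) > 4) > mean(abs(norm_ret) > 4)
```

In real return series, kurtosis values well above 3 are the rule rather than the exception, which is exactly what the simulated returns reproduce.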

Let’s have a look at the evolution of returns:

First, all the fluctuations we observe in the simulation are, by definition, *excess volatility*. We call it “excess” because we have assumed a constant fundamental value of 1 for our assets. Yet, prices fluctuate around this constant value due to the differing trading strategies of our agents. This mirrors real markets, where prices fluctuate due to various factors despite the fundamental value of assets not having changed much.

Another thing can be discerned from this plot, there are periods of relative calm, alternating with periods of large price swings. This phenomenon is called *volatility clustering*. In our simulation, the autocorrelation of absolute returns shows significant autocorrelation even up to the 100th day:

This means that periods of high volatility (large price changes) tend to be followed by further large price changes, regardless of their direction, and calm periods by more calm ones, thereby forming distinct volatility regimes. This is again a fascinating pattern observable in real markets (and one that could potentially be exploited, see Building and Backtesting a Volatility-based Trading Strategy with ChatGPT).
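The same signature can be reproduced in a few lines with a toy GARCH(1,1) process, a standard econometric model used here purely as a stand-alone illustration of volatility clustering (it is not part of the agent-based simulation, and the parameter values are illustrative):

```r
set.seed(42)
n <- 5000
omega <- 0.05; alpha <- 0.15; beta <- 0.8  # toy GARCH(1,1) parameters

ret  <- numeric(n)
sig2 <- numeric(n)
sig2[1] <- omega / (1 - alpha - beta)  # start at the unconditional variance

for (t in 2:n) {
  # today's variance depends on yesterday's squared return and variance
  sig2[t] <- omega + alpha * ret[t - 1]^2 + beta * sig2[t - 1]
  ret[t]  <- sqrt(sig2[t]) * rnorm(1)
}

# raw returns look like white noise...
acf_raw <- acf(ret, lag.max = 50, plot = FALSE)$acf
# ...but absolute returns are persistently autocorrelated
acf_abs <- acf(abs(ret), lag.max = 50, plot = FALSE)$acf
```

Plotting `acf(abs(ret))` shows the same slowly decaying autocorrelation of absolute returns that our market simulation produces.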

In summary, it’s fascinating that our simple multi-agent model can capture several key stylized facts of real market dynamics: the random walk, bubbles and crashes, fat tails, excess volatility, and volatility clustering. Despite our assumptions’ simplicity, our model demonstrates the complex behaviors seen in the real world, highlighting the power of agent-based models in understanding and potentially predicting market behavior. It’s a testament to the old saying that “more is different”: simple rules at the micro-level can lead to complex, emergent phenomena at the macro-level.

We look forward to doing cool stuff with our little model, e.g. backtesting different trading strategies in future posts. Please also share your experiments with the model in the comments below and stay tuned!

**DISCLAIMER**

*This post is written on an “as is” basis for educational purposes only and comes without any warranty. The findings and interpretations are exclusively those of the author and are not endorsed by or affiliated with any third party.*

In particular, this post provides no investment advice! No responsibility is taken whatsoever if you lose money.

*(If you make any money though I would be happy if you would buy me a coffee… that is not too much to ask, is it?)*

In this post, let us rise into the air to have a good view of the stock market. From this vantage point, seemingly unrelated things all of a sudden become connected and patterns hidden by all the buzz and noise start to appear!

If you want to understand the big picture of technical and fundamental analysis, its relation to the payoffs of certain option strategies, and what this has to do with buy-and-hold, read on!

The financial markets are a playground for diverse strategies, each with its own unique perspective and methodology. Among these, technical and fundamental analysis stand out as two distinct yet profoundly interrelated approaches. By digging deeper, we find that these two techniques, though appearing different on the surface, are two sides of the same coin, creating a fascinating interplay that leads to a classic investment strategy: buy-and-hold.

Parts of this post were written with the help of ChatGPT-4, concept and ideas are my own. The payoff diagrams were created with R, you can find the code here: Financial X-Rays: Dissect any Price Series with a simple Payoff Diagram.

First, let’s uncover the essence of these two approaches. On one side, we have *technical analysis*, which involves studying price movements and market trends. Technical traders rely on charts and various statistical measures to determine points of entry and exit in the market. They believe in the power of momentum, buying when prices rise and selling when they fall, under the premise that what goes up will continue to rise, and vice versa. This behaviour tends to make stock price movements in either direction more extreme.

On the other side is *fundamental analysis*, which seeks to evaluate securities by measuring the intrinsic value of a company. This is done by examining related economic, financial, and other qualitative and quantitative factors. When a stock’s price falls, fundamental traders see an opportunity: they believe the stock is undervalued, and that its price will revert to its fundamental value in the long run. This mean-reverting philosophy creates a stabilizing effect on stock prices.

The aforementioned approaches might seem diametrically opposed, but they are not. Instead, they offer two contrasting lenses through which to view the market. Let’s use an options trading analogy to illustrate this.

Technical traders can be likened to being *long a call option*. When you buy a call option, you have the right, but not the obligation, to buy an asset at a predetermined price within a specified period. It’s a bullish strategy, expecting the stock’s price to rise, similar to the momentum-driven philosophy of technical traders. When the stock falls below a certain point technical traders tend to sell illustrated by the flat line on the left, so that they won’t lose any more money, no matter how far the stock falls.

In contrast, fundamental traders can be thought of as being *short a put option*. When you look at the payoff diagram you see a cap on the right where you stop earning money from a certain point on, no matter how far the stock rises, which boils down to the same effect as selling the stock at that point. This resonates with the belief of fundamental traders that an undervalued stock will return to its fundamental value where they will tend to sell it.

When we combine these two perspectives, an intriguing scenario unfolds.

The combination of long call (representing the technical analysis approach) and short put (symbolizing the fundamental analysis approach) results in being *long the stock* (something also known as *put-call parity* in option lingo). This is a situation where the investor owns the stock and will hold it no matter how its price develops — a strategy commonly known as *buy-and-hold*.
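This identity is easy to verify numerically. The following base-R sketch computes the payoffs at expiry, with the strike set to the assumed fundamental value of 1 (premiums are ignored for simplicity):

```r
S <- seq(0, 2, by = 0.01)       # possible stock prices at expiry
K <- 1                          # strike = assumed fundamental value

long_call <- pmax(S - K, 0)     # technical trading profile
short_put <- -pmax(K - S, 0)    # fundamental trading profile
portfolio <- long_call + short_put

# put-call parity: long call + short put = long the stock (bought at K)
all.equal(portfolio, S - K)     # TRUE

matplot(S, cbind(long_call, short_put, portfolio), type = "l", lty = 1,
        col = c("blue", "red", "black"), xlab = "Stock price at expiry",
        ylab = "Payoff")
```

The combined payoff is a straight line through the strike: exactly the payoff of simply holding the stock.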

This takes us to an interesting conclusion: technical and fundamental analysis can be seen as two integral parts of a holistic investment approach. While technical analysis represents the momentum-driven, trend-following aspect (long call), fundamental analysis brings in the mean-reversion, value-investing perspective (short put). Together, they form the age-old, tried-and-true strategy: buy-and-hold.

So, combining those two *active investing* approaches is equivalent to *passive investing*! This makes sense when you think about it: when you combine all possible investment strategies, you must end up with the market as a whole.

In the next post, we will use these insights to build a full-fledged multi-agent simulation of the stock market with R, so stay tuned!

This addendum explores the implications of viewing technical trading as replicating a long call option and fundamental trading as its converse. In technical trading, akin to a long call, investors might sell slightly below and buy a bit above the strike price, incurring minor losses during price fluctuations near the strike. This boils down to the old “Cut your losses and let your profits run”. Conversely, fundamental traders tend to systematically gain in these scenarios, basically by, counterintuitively, doing the opposite: “Cut your profits and let your losses run”!

Two key outcomes arise from this analogy:

- **Skewness of Returns**: Technical trading is likely to yield positively skewed returns, reflecting gains when the market significantly rises but limited losses otherwise. In contrast, fundamental trading might exhibit negatively skewed returns, benefiting from regular small gains but risking occasional large downturns.
- **Option Premium Analogy**: The concept aligns with the mechanics of option premiums. Selling options involves receiving a premium, akin to the small gains of fundamental trading, as compensation for assuming risk. Similarly, buying options, mirrored in technical trading, entails paying a premium (like an insurance premium), in this case, the slight losses during price fluctuations.

One last point: the expected return of each trader in this simplified market with no interest is zero, but the distribution of those returns differs between the two. While the technical trader can make huge gains when the market rises, these are eaten up by small losses while the price wiggles around the fair value (= strike). The situation is just the opposite for fundamental traders, who constantly earn a small “premium” but bear the risk of a huge selloff. These dynamics lead to the respective skews in the returns above.
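These opposite skews can be checked with a toy simulation: draw terminal prices fluctuating around the fair value of 1 and compute the sample skewness of both payoff profiles. This is a sketch with assumed lognormal prices and no premiums, not the market model itself:

```r
set.seed(7)
K <- 1
# toy lognormal terminal prices fluctuating around the fair value of 1
S_T <- exp(rnorm(1e5, mean = 0, sd = 0.2))

call_pnl <- pmax(S_T - K, 0)    # technical-style payoff (long call)
put_pnl  <- -pmax(K - S_T, 0)   # fundamental-style payoff (short put)

# sample skewness: third central moment over cubed standard deviation
skew <- function(x) mean((x - mean(x))^3) / sd(x)^3
skew(call_pnl)  # positive: many zeros, occasional big gains
skew(put_pnl)   # negative: many zeros, occasional big losses
```

The long-call profile shows strongly positive skew and the short-put profile the mirror-image negative skew, just as argued above.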

The problem with a negatively skewed trading strategy is that in the event of a huge selloff many traders will lose their nerve and realize the loss. This is one of the reasons so many day traders (about 80%) lose money in the long run, some of them stopping trading for good after such a catastrophic loss (that is not to say that you should hold your investments no matter what! The real world is considerably more complex). Another reason could be margin calls, where the bank liquidates their position and thereby locks in the loss.

In the hedge fund world (oftentimes hugely) negatively skewed trading strategies are sometimes called “picking up nickels in front of a steam roller”, for obvious reasons.

Overall, this perspective integrates characteristics of options theory with trading strategies, highlighting risk-reward dynamics and the subtle costs of market entry and exit strategies.

**DISCLAIMER**

*This post is written on an “as is” basis for educational purposes only and comes without any warranty. The findings and interpretations are exclusively those of the author and are not endorsed by or affiliated with any third party.*

In particular, this post provides no investment advice! No responsibility is taken whatsoever if you lose money.

*(If you make any money though I would be happy if you would buy me a coffee… that is not too much to ask, is it?)*

**UPDATE June 6, 2023**

I created a video for this post (in German):

**UPDATE August 15, 2023**

The follow-up post is now online: Can a Simple Multi-Agent Model Replicate Complex Stock Market Behaviour?

R is a powerful programming language and environment for statistical computing and graphics. In this post, we will provide a quick introduction to R using the famous iris dataset.

We will cover loading data, exploring the dataset, basic data manipulation, and plotting. By the end, you should have a good understanding of how to get started with R, so read on!

This is the introduction that I wished I had back when I started analyzing data! Disclosure: part of the post and code were written with the assistance of ChatGPT-4, the concept and ideas herein are my own.

You can also watch the video for this post (in German):

First, download and install R from the Comprehensive R Archive Network (CRAN): https://cran.r-project.org/

Next, download and install RStudio, an integrated development environment (IDE) for R: https://posit.co/download/rstudio-desktop/

Ok, and now see for yourself how easy it is to analyze data with R! We will use the equivalent of “Hello World” for data science, the famous iris dataset. Of course, you can also easily load your own data into R: just click on “Import Dataset” in the “Environment” tab in the upper right window of RStudio and follow the steps from there!

The iris dataset is a classic and widely used dataset in the field of data science and machine learning. The dataset consists of 150 samples from three species of iris flowers: Iris setosa, Iris versicolor, and Iris virginica. Each sample contains four features, which are the lengths and widths of the sepals and petals (in centimeters). The dataset has become a popular choice for testing and demonstrating various data analysis and machine learning techniques due to its simplicity and well-defined structure. The iris dataset comes pre-loaded in R, so no need to import any external files.

To get an overview of the dataset, use the `head()` and `summary()` functions:

# Display the first 6 rows of the dataset
head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

# Get summary statistics of the dataset
summary(iris)
##   Sepal.Length    Sepal.Width     Petal.Length    Petal.Width
##  Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100
##  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300
##  Median :5.800   Median :3.000   Median :4.350   Median :1.300
##  Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
##  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
##  Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
##        Species
##  setosa    :50
##  versicolor:50
##  virginica :50

The `summary()` function in R provides a quick overview of the main statistical measures for numerical data. Here are short descriptions of each measure:

- Minimum (Min.): The smallest value in the dataset.
- 1st Quartile (1st Qu.): The value that separates the lowest 25% of the data from the remaining 75%; also known as the 25th percentile.
- Median (2nd Qu.): The middle value that separates the lower and upper halves of the data; also known as the 50th percentile.
- Mean: The arithmetic average of the data values, calculated by adding up all the values and dividing by the total number of values.
- 3rd Quartile (3rd Qu.): The value that separates the lowest 75% of the data from the highest 25%; also known as the 75th percentile.
- Maximum (Max.): The largest value in the dataset.

These statistical measures give you a quick snapshot of the central tendency, dispersion, and overall distribution of your numerical data.

**a) Subsetting the dataset**

To select specific columns in the dataset, use the `$` operator or the `[]` brackets:

# Select the Petal.Width column
iris$Petal.Width
##   [1] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 0.2 0.2 0.1 0.1 0.2 0.4 0.4 0.3
##  [19] 0.3 0.3 0.2 0.4 0.2 0.5 0.2 0.2 0.4 0.2 0.2 0.2 0.2 0.4 0.1 0.2 0.2 0.2
##  [37] 0.2 0.1 0.2 0.2 0.3 0.3 0.2 0.6 0.4 0.3 0.2 0.2 0.2 0.2 1.4 1.5 1.5 1.3
##  [55] 1.5 1.3 1.6 1.0 1.3 1.4 1.0 1.5 1.0 1.4 1.3 1.4 1.5 1.0 1.5 1.1 1.8 1.3
##  [73] 1.5 1.2 1.3 1.4 1.4 1.7 1.5 1.0 1.1 1.0 1.2 1.6 1.5 1.6 1.5 1.3 1.3 1.3
##  [91] 1.2 1.4 1.2 1.0 1.3 1.2 1.3 1.3 1.1 1.3 2.5 1.9 2.1 1.8 2.2 2.1 1.7 1.8
## [109] 1.8 2.5 2.0 1.9 2.1 2.0 2.4 2.3 1.8 2.2 2.3 1.5 2.3 2.0 2.0 1.8 2.1 1.8
## [127] 1.8 1.8 2.1 1.6 1.9 2.0 2.2 1.5 1.4 2.3 2.4 1.8 1.8 2.1 2.4 2.3 1.9 2.3
## [145] 2.5 2.3 1.9 2.0 2.3 1.8

# Select the first three columns (the first part before the comma is for
# selecting the rows, the second is for the columns - if left free,
# nothing is filtered)
iris[ , 1:3]
##     Sepal.Length Sepal.Width Petal.Length
## 1            5.1         3.5          1.4
## 2            4.9         3.0          1.4
## 3            4.7         3.2          1.3
## 4            4.6         3.1          1.5
## 5            5.0         3.6          1.4
## 6            5.4         3.9          1.7
## ... (the remaining 144 rows are omitted here for brevity)

**b) Filtering the dataset**

To filter the dataset based on a condition, use the `subset()` function:

# Select rows where Species is "setosa"
subset(iris, Species == "setosa")
##    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1           5.1         3.5          1.4         0.2  setosa
## 2           4.9         3.0          1.4         0.2  setosa
## 3           4.7         3.2          1.3         0.2  setosa
## 4           4.6         3.1          1.5         0.2  setosa
## 5           5.0         3.6          1.4         0.2  setosa
## 6           5.4         3.9          1.7         0.4  setosa
## ... (the remaining 44 setosa rows are omitted here for brevity)

**c) Sorting the dataset**

To sort the dataset by a specific column, use the `order()` function. We can combine that with selecting only certain columns:

# Sort the dataset by Petal.Width in ascending order
# and show columns Petal.Width and Species
iris[order(iris$Petal.Width), c("Petal.Width", "Species")]
##     Petal.Width    Species
## 10          0.1     setosa
## 13          0.1     setosa
## 14          0.1     setosa
## 33          0.1     setosa
## 38          0.1     setosa
## 1           0.2     setosa
## ... (further setosa rows omitted)
## 24          0.5     setosa
## 44          0.6     setosa
## 58          1.0 versicolor
## 61          1.0 versicolor
## ... (further versicolor and virginica rows omitted)
## 101         2.5  virginica
## 110         2.5  virginica
## 145         2.5  virginica

As can be seen, petal width is pretty good at separating the different species. We will corroborate this with some basic plotting.
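Before plotting, the separation can also be checked with a quick numeric summary. The following sketch (base R only, my own addition for illustration) computes the median and range of petal width per species:

```r
# Median and range of Petal.Width within each species
aggregate(Petal.Width ~ Species, data = iris,
          FUN = function(x) c(median = median(x), min = min(x), max = max(x)))
# setosa stays below 0.7, while versicolor (1.0-1.8) and
# virginica (1.4-2.5) overlap only in a narrow band
```

The medians (0.2, 1.3, and 2.0) are far apart relative to the within-species spread, which is exactly what the plots below will show.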

R has built-in plotting functions for creating simple visualizations. Here are a few examples:

**a) Histogram**

A histogram is a graphical representation of the distribution of a dataset, where data is divided into a set of intervals or bins. The data is represented as vertical bars, with the height of each bar corresponding to the number of data points that fall within a particular bin. Histograms are used to visualize the underlying frequency distribution of a continuous variable, allowing one to identify patterns such as skewness, central tendency, and dispersion.

# Create a histogram of Petal.Width
hist(iris$Petal.Width,
     main = "Histogram of Petal Width",
     xlab = "Petal Width", ylab = "Frequency",
     col = "lightblue", border = "black")

**b) Box plot**

A box plot, also known as a box-and-whisker plot, is a graphical representation of the distribution and spread of a dataset. It displays five key statistics: the minimum, first quartile (Q1), median (Q2), third quartile (Q3), and maximum. The “box” represents the interquartile range (IQR), which contains the middle 50% of the data, while the “whiskers” extend from the box to the minimum and maximum values. Outliers, if present, are typically represented as individual points outside the whiskers.

# Create a box plot of Petal.Width by Species
boxplot(Petal.Width ~ Species, data = iris,
        main = "Box plot of Petal Width by Species",
        xlab = "Species", ylab = "Petal Width")

Here, we can very clearly see that petal width indeed separates the different species well. In the following plot, another variable, petal length, is included and the three species are colour-coded.

**c) Scatter plot**

A scatter plot is a graphical representation of the relationship between two variables, where each data point is represented by a dot on a two-dimensional plane. The horizontal axis (x-axis) represents one variable, while the vertical axis (y-axis) represents the other variable. By analyzing the distribution and pattern of the dots, one can determine the correlation, trends, or outliers between the two variables.

# Create a scatter plot of Petal.Length vs Petal.Width and colour-code Species
plot(iris$Petal.Length, iris$Petal.Width,
     main = "Scatter plot of Petal Length vs Petal Width",
     xlab = "Petal Length", ylab = "Petal Width",
     col = iris$Species)

And, by the way, you can easily use those plots in other applications (like Word or PowerPoint) by clicking on “Export” in RStudio and then on “Save as Image…” or “Copy to Clipboard…”.

In just 10 minutes, you’ve learned the basics of R using the iris dataset. We covered loading data, data manipulation, and basic plotting. As you continue to explore R, you will discover its vast capabilities and potential for analyzing and visualizing complex data.

To continue on your coding adventure, the following posts are good starting points:

- Learning R: The Ultimate Introduction (incl. Machine Learning!)
- One Rule (OneR) Machine Learning Classification in under One Minute

If you want to dive deeper into data science, I created the following learning path: Learning Path for “Data Science with R” – Part I

Take care, and happy data sleuthing!

This post presents a real highlight: We will build and backtest a quantitative trading strategy in R with the help of OpenAI’s ChatGPT-4! If you want to get a glimpse into the future of trading system development, read on!

On this blog, I already provided a template for building your own trading system (see Backtest Trading Strategies Like a Real Quant), but back in the day we did everything by hand. Not anymore: AI to the rescue! Also on this blog, we experimented with ChatGPT coding very early on (see ChatGPT can Create Datasets, Program in R… and when it makes an Error it can Fix that too!).

So now, I will present to you a full session with ChatGPT-4, including the used prompts, in which we build and backtest a trading strategy together, so buckle up!

You can also watch the video for this post (in German):

I tried different approaches and different prompts. As always with Large Language Models (LLMs) like ChatGPT, you get the best results when you are as specific as possible. I also found that it is best to give it a seed idea to start from; otherwise you will most often end up with standard stuff like moving average crossovers. Because I am a big fan of what I call risk-based investing, I started with the following prompt and had this exchange with the AI (the outputs of the code after running it are copied directly into the respective sections; both packages can be found on `CRAN` if you want to run the code yourself):

You are a renowned quantitative finance investor. You want to create a new trading system. The idea is to have a long/flat strategy on the S&P 500 based on the volatility as a stress and risk indicator. Do everything step by step and explain your reasoning behind it. Be extra careful to avoid data snooping bias, survivorship bias, look ahead bias and overfitting. Use R for backtesting, but don’t use the tidyverse. Load the necessary data with the `quantmod` package. The calculation of the performance and risk metrics should be done with the `PerformanceAnalytics` package. Give me the R code in one code window. I will give you the results of the tests and we will refine it together iteratively.

To create a long/flat trading system on the S&P 500 based on volatility, we’ll follow these steps:

- Load necessary libraries and data
- Calculate volatility as a risk indicator
- Develop the trading strategy
- Backtest the strategy
- Evaluate performance and risk metrics

Here is the R code to accomplish these tasks:

# Step 1: Load necessary libraries and data
library(quantmod)
library(PerformanceAnalytics)

start_date <- as.Date("2000-01-01")
end_date <- as.Date("2021-12-31")
symbol <- "^GSPC" # S&P 500 symbol
getSymbols(symbol, src = "yahoo", from = start_date, to = end_date)
price_data <- Ad(get(symbol))

# Step 2: Calculate volatility as a risk indicator
lookback_period <- 20
volatility <- runSD(ROC(price_data, n = 1, type = "discrete"), n = lookback_period) * sqrt(252)

# Step 3: Develop the trading strategy
threshold <- 0.15
signal <- ifelse(volatility < threshold, 1, 0)
signal <- lag(signal, 1) # To avoid look-ahead bias
signal[is.na(signal)] <- 0

# Step 4: Backtest the strategy
returns <- ROC(price_data, n = 1, type = "discrete") * signal
strategy_returns <- na.omit(returns)

# Step 5: Evaluate performance and risk metrics
charts.PerformanceSummary(strategy_returns, main = "Long/Flat Strategy based on Volatility")
table.AnnualizedReturns(strategy_returns)
maxDrawdown(strategy_returns)

Please run this R code, provide the results of the backtest, and we will refine the strategy iteratively based on the results.

It says "> price_data <- Ad(get(symbol))
Error in get(symbol) : object '^GSPC' not found"

Apologies for the confusion. It seems the variable name is causing issues. Let’s change the way we reference the data. Here’s the modified R code:

# Step 1: Load necessary libraries and data
library(quantmod)
library(PerformanceAnalytics)

start_date <- as.Date("2000-01-01")
end_date <- as.Date("2021-12-31")
symbol <- "^GSPC" # S&P 500 symbol
getSymbols(symbol, src = "yahoo", from = start_date, to = end_date)
price_data <- Ad(get(symbol))

# Step 2: Calculate volatility as a risk indicator
lookback_period <- 20
volatility <- runSD(ROC(price_data, n = 1, type = "discrete"), n = lookback_period) * sqrt(252)

# Step 3: Develop the trading strategy
threshold <- 0.15
signal <- ifelse(volatility < threshold, 1, 0)
signal <- lag(signal, 1) # To avoid look-ahead bias
signal[is.na(signal)] <- 0

# Step 4: Backtest the strategy
returns <- ROC(price_data, n = 1, type = "discrete") * signal
strategy_returns <- na.omit(returns)

# Step 5: Evaluate performance and risk metrics
charts.PerformanceSummary(strategy_returns, main = "Long/Flat Strategy based on Volatility")

table.AnnualizedReturns(strategy_returns)
##                           GSPC.Adjusted
## Annualized Return                0.0334
## Annualized Std Dev               0.0875
## Annualized Sharpe (Rf=0%)        0.3821
maxDrawdown(strategy_returns)
## [1] 0.2954578

Please run this updated R code, and provide the results of the backtest. We will refine the strategy iteratively based on the results.

Please modify the code so that one can compare the strategy with the underlying benchmark

Sure, I have modified the code to include the benchmark (S&P 500) performance for comparison:

# Step 1: Load necessary libraries and data
library(quantmod)
library(PerformanceAnalytics)

start_date <- as.Date("2000-01-01")
end_date <- as.Date("2021-12-31")
symbol <- "^GSPC" # S&P 500 symbol
sp500_data <- getSymbols(symbol, src = "yahoo", from = start_date, to = end_date, auto.assign = FALSE)
price_data <- Ad(sp500_data)

# Step 2: Calculate volatility as a risk indicator
lookback_period <- 20
volatility <- runSD(ROC(price_data, n = 1, type = "discrete"), n = lookback_period) * sqrt(252)

# Step 3: Develop the trading strategy
threshold <- 0.15
signal <- ifelse(volatility < threshold, 1, 0)
signal <- lag(signal, 1) # To avoid look-ahead bias
signal[is.na(signal)] <- 0

# Step 4: Backtest the strategy
returns <- ROC(price_data, n = 1, type = "discrete") * signal
strategy_returns <- na.omit(returns)

# Calculate benchmark returns
benchmark_returns <- ROC(price_data, n = 1, type = "discrete")
benchmark_returns <- na.omit(benchmark_returns)

# Step 5: Evaluate performance and risk metrics
comparison <- cbind(strategy_returns, benchmark_returns)
colnames(comparison) <- c("Strategy", "Benchmark")
charts.PerformanceSummary(comparison, main = "Long/Flat Strategy vs S&P 500 Benchmark")

table.AnnualizedReturns(comparison)
##                           Strategy Benchmark
## Annualized Return           0.0334    0.0556
## Annualized Std Dev          0.0875    0.1964
## Annualized Sharpe (Rf=0%)   0.3821    0.2833
maxDrawdown(comparison)
##                 Strategy Benchmark
## Worst Drawdown 0.2954578 0.5677539

This code will plot the performance of both the long/flat strategy based on volatility and the S&P 500 benchmark, as well as display the annualized returns and maximum drawdown for each. Please run this updated R code and provide the results of the backtest. We will refine the strategy iteratively based on the results.

Here are the metrics:

table.AnnualizedReturns(comparison)
##                           Strategy Benchmark
## Annualized Return           0.0334    0.0556
## Annualized Std Dev          0.0875    0.1964
## Annualized Sharpe (Rf=0%)   0.3821    0.2833
maxDrawdown(comparison)
##                 Strategy Benchmark
## Worst Drawdown 0.2954578 0.5677539

Based on the metrics provided, the long/flat strategy has a lower annualized return (3.34%) compared to the S&P 500 benchmark (5.56%). However, the long/flat strategy has a lower annualized standard deviation (8.75%) than the benchmark (19.64%), indicating that it is less volatile. The Sharpe ratio (with risk-free rate = 0%) is higher for the long/flat strategy (0.3821) than the benchmark (0.2833), suggesting that the strategy has better risk-adjusted returns. Additionally, the maximum drawdown of the long/flat strategy (29.55%) is significantly lower than that of the benchmark (56.78%).

Depending on your investment objectives and risk tolerance, the long/flat strategy might be an attractive option due to its lower volatility and better risk-adjusted returns, despite having a lower annualized return compared to the S&P 500 benchmark.

If you would like to further refine the strategy, you may consider adjusting the lookback period, the volatility threshold, or incorporating additional indicators to improve the strategy’s performance. Please let me know if you’d like to explore further improvements or adjustments to the strategy.

How did you arrive at the lookback period (20) and the volatility threshold (0.15)?

The lookback period of 20 and the volatility threshold of 0.15 were arbitrary choices for the initial version of the strategy. In practice, these parameters should be optimized to achieve better performance while ensuring the strategy remains robust and doesn’t overfit the data.

One approach to optimizing these parameters is to perform a parameter sweep, in which you systematically vary the lookback period and the volatility threshold within reasonable ranges and evaluate the strategy’s performance for each combination. This can help you identify the parameter combination that yields the best performance.

However, it’s important to avoid overfitting, which occurs when the strategy is tailored too closely to the historical data, leading to poor performance when applied to unseen data. To mitigate overfitting, you can use techniques like out-of-sample testing and cross-validation.
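To make the reply’s suggestion concrete, here is a minimal sketch of such a parameter sweep (my own aside, not part of the chat). It runs on synthetic daily returns instead of downloaded S&P 500 data so that it works offline; the helper functions and the Sharpe-style score are illustrative assumptions, not the exact procedure from the strategy above:

```r
set.seed(42)
returns <- rnorm(2500, mean = 0.0003, sd = 0.01)  # synthetic daily returns (illustrative)

# Rolling annualized volatility over a given lookback (simple base-R version)
roll_vol <- function(r, n) {
  v <- rep(NA_real_, length(r))
  for (i in n:length(r)) v[i] <- sd(r[(i - n + 1):i]) * sqrt(252)
  v
}

# Backtest one (lookback, threshold) combination:
# long when yesterday's volatility was below the threshold, flat otherwise
backtest <- function(r, lookback, threshold) {
  vol <- roll_vol(r, lookback)
  signal <- c(0, ifelse(vol < threshold, 1, 0)[-length(r)])  # lag by one day
  signal[is.na(signal)] <- 0
  if (all(signal == 0)) return(NA_real_)  # never in the market
  strat <- r * signal
  mean(strat) / sd(strat) * sqrt(252)  # annualized Sharpe-style score
}

# Sweep both parameters over reasonable ranges and rank the combinations
grid <- expand.grid(lookback = c(10, 20, 40), threshold = c(0.10, 0.15, 0.20))
grid$sharpe <- mapply(backtest, lookback = grid$lookback, threshold = grid$threshold,
                      MoreArgs = list(r = returns))
grid[order(-grid$sharpe), ]
```

In a real application you would, as the reply notes, run the sweep on an in-sample window only and confirm the chosen pair on held-out data to guard against overfitting.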

What could be the economic reason that this strategy outperforms the benchmark on a risk-adjusted basis?

The economic rationale behind the long/flat strategy outperforming the benchmark on a risk-adjusted basis can be attributed to the strategy’s ability to reduce exposure to market risk during periods of high volatility. High volatility is often associated with increased uncertainty, stress, and risk in financial markets. By going flat (i.e., not holding any position) during these periods, the strategy effectively manages risk and limits potential losses.

A few key points provide insights into why this approach works:

- Market inefficiencies: Financial markets are not always perfectly efficient. During periods of high volatility, market participants may react emotionally, leading to overreactions or panic selling. This can create price distortions and increase the risk of holding equities. By reducing exposure during high volatility, the long/flat strategy can avoid some of these potential losses.
- Risk aversion: When volatility is high, investors tend to become more risk-averse and prioritize preserving capital over seeking returns. This risk-averse behavior can lead to a preference for safer assets, causing a decline in equity prices. By going flat during these periods, the strategy aligns with the risk-averse behavior of market participants.
- Market turbulence: High volatility often indicates market turbulence or significant events (e.g., geopolitical risks, economic crises, or natural disasters) that can negatively impact the overall market or specific industries. By limiting exposure during periods of high volatility, the long/flat strategy helps mitigate the potential negative impact of such events.
- Diversification benefits: A long/flat strategy can provide diversification benefits to an investment portfolio. The strategy may exhibit low correlation with other assets or strategies, thereby enhancing the overall risk-adjusted performance of the portfolio.

By managing risk and reducing exposure during high-volatility periods, the long/flat strategy seeks to deliver better risk-adjusted returns compared to the benchmark. However, it’s important to keep in mind that past performance does not guarantee future results, and the strategy’s effectiveness may vary across different market conditions and time periods.

I could have gone on to further refine the strategy, optimize the parameters while avoiding overfitting, add more parameters, and so on, but this post is for illustrative purposes only. What I did do, though, was test the strategy out-of-sample on the year 2022 and Q1 2023. This is what I got:

table.AnnualizedReturns(comparison)
##                           Strategy Benchmark
## Annualized Return          -0.0145   -0.1280
## Annualized Std Dev          0.0261    0.2294
## Annualized Sharpe (Rf=0%)  -0.5574   -0.5579
maxDrawdown(comparison)
##                  Strategy Benchmark
## Worst Drawdown 0.03350335  0.254251

I think the chart and the metrics speak for themselves! Quite impressive, don’t you think?

In the near future, it is conceivable that you will be able to connect ChatGPT to R and the internet and everything will be done autonomously. There are already projects under way (e.g. “Auto-GPT”) that build the necessary infrastructure, so it is only a matter of time before fully automated trading AIs are active in the markets (if they aren’t already), constantly refining their strategies!

Please let me know your thoughts in the comments below!

**DISCLAIMER**

*This post is written on an “as is” basis for educational purposes only, can contain errors, and comes without any warranty. The findings and interpretations are exclusively those of the author and are not endorsed by or affiliated with any third party.*

In particular, this post provides no investment advice! No responsibility is taken whatsoever if you lose money.

*(If you make any money though I would be happy if you would buy me a coffee… that is not too much to ask, is it? 😉 )*