Researching “Safe” Withdrawal Rates? Double-check your standard errors.
TL;DR. Your historical data gives you (best-case) an effective sample size of roughly one, invalidating most of your inferences.
Looking for the statistics? We’ll get there, but first a little background.
What’s a Safe Withdrawal Rate and why does it matter?
It’s easy to overlook, but we are fortunate to live in a time of unprecedented global prosperity. For the past century or so it has become possible, in some countries even typical, for people not to work until the day they die, but to “retire” from the workforce and sustain themselves via a combination of employer pensions, social security, and their own savings.
This combination of employer pension, social security, and own savings is often referred to as the “three-legged stool” of retirement planning. The three-legged stool is common for the boomer generation (at least in the U.S.), and was even more so for their parents. For many people in many places, however, employer pensions are now a rarity and state-sponsored social security payments look increasingly meager, so it often feels like we are sitting on a two- or even one-legged stool. (Perhaps prompting some of us to realize the alternative meanings of the word “stool”.)
Beyond the fact that this leaves many people with a (much) smaller source of retirement funds, it also presents them with an extremely difficult problem that most of us are ill-prepared to handle. Unlike pensions and social security, which arrive every month just in time to pay credit card bills, groceries, and mortgages, our savings are just one big pile of… money. And the problem people are faced with is: “How much of my savings should I spend this week/month/year?” Academics refer to this class of problem as “optimal control problems” and without many simplifying assumptions they can be difficult if not impossible to solve.
However, for tens if not hundreds of millions of people, this is not just an academic question. Spend too much today and you may end up “spending” your last years in poverty. Spend too little and you may unnecessarily deprive yourself of some of the pleasures of life. To help people with this problem, a large and growing literature has sprung up around “Safe Withdrawal Rates”, i.e., rules of thumb for answering “How much of my savings should I spend this week/month/year?” One of the most popular of these at the moment is the “4% Rule”, which roughly states that you should be okay if you spend 4% of your initial savings each year, adjusted for inflation.
How safe is a “Safe Withdrawal Rate”? Enter: The “Failure Rates”
Much of the research around Safe Withdrawal Rates, including the often-cited “Trinity Study”, involves calculating the “Failure Rates” (among pessimistic readers/authors) or, equivalently, the “Success Rates” (among optimistic readers/authors) associated with various withdrawal rates. The Failure Rate is usually defined mathematically as the probability, conditional on using some withdrawal rate/rule, that a retiree exhausts their savings before dying; the Success Rate is simply one minus the Failure Rate.
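To make the definition concrete, here is a minimal Python sketch of how a historical Failure Rate is typically computed from overlapping payout periods. Everything in it is an illustrative assumption rather than any particular study's method: the function names, the fixed withdrawal taken at the start of each year as a fraction of the initial balance, the use of inflation-adjusted returns, and the made-up return series.

```python
def portfolio_survives(annual_returns, withdrawal_rate):
    """Return True if the portfolio funds a withdrawal in every year.

    Illustrative assumptions: starting balance of 1.0, a fixed withdrawal of
    withdrawal_rate * initial balance at the start of each year, and returns
    that are already adjusted for inflation.
    """
    balance = 1.0
    for r in annual_returns:
        balance -= withdrawal_rate      # spend first
        if balance < 0:
            return False                # savings exhausted
        balance *= 1.0 + r              # then earn the year's return
    return True


def failure_rate(history, withdrawal_rate, payout_years):
    """Share of overlapping payout_years-long windows in which money runs out."""
    windows = [history[i:i + payout_years]
               for i in range(len(history) - payout_years + 1)]
    if not windows:
        raise ValueError("history is shorter than payout_years")
    failures = sum(not portfolio_survives(w, withdrawal_rate) for w in windows)
    return failures / len(windows)


# Made-up real returns, NOT historical data.
toy_history = [0.10, -0.20, 0.05, 0.00, 0.15, -0.10, 0.08, 0.03]
print(failure_rate(toy_history, withdrawal_rate=0.30, payout_years=4))
```

Note that neighboring windows share all but one of their years, so the number of windows overstates how many independent observations went into the estimate.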
50 Years of Historical Data Should Be Enough, Right? Wrong.
I’ve spent most of my life dealing with statistics (along with “lies” and “damned lies”), so I’ve been b̶r̶a̶i̶n̶w̶a̶s̶h̶e̶d conditioned to think about everything in terms of “standard errors”. Basically this is just the +/- “range of uncertainty” that surrounds every estimate from “I’ll be there in 15 minutes.” to “No, I don’t have the coronavirus”.
My first criticism of effectively all of the research on Safe Withdrawal Rates I have come across thus far [COUNTER-EXAMPLES WELCOME!!!] is that there are no estimates of the accuracy (i.e., the standard errors) of the success/failure rates. For example, Table 2 of the Trinity Study reports that the success rate is 48% when using a payout period of 30 years, a portfolio of 100% bonds, and a Withdrawal Rate of 5%. But if we want to take that in-sample 48% and apply it out-of-sample, we need to recognize that it is only an estimate, and IMNSHO any researcher who takes themselves seriously should provide an estimate of the standard errors (i.e., the range of uncertainty) for any estimate they provide. So rather than reporting just 48%, they should report (say) 48% +/- 10%, where this range corresponds to some specified “confidence interval”.
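As a rough illustration of what such a range could look like, the sketch below applies the textbook binomial standard error to the 48% figure. The sample sizes are hypothetical and are not taken from the study; the function name is also made up. The formula is only valid if the retirement periods counted in n are independent of each other.

```python
import math


def success_rate_standard_error(p_hat, n_independent):
    """Binomial standard error of an estimated success rate.

    Only valid if the n_independent retirement periods really are independent;
    overlapping periods share most of their years, so they are not.
    """
    return math.sqrt(p_hat * (1.0 - p_hat) / n_independent)


p_hat = 0.48  # the success rate quoted above
for n in (40, 10, 2, 1):  # hypothetical effective sample sizes
    se = success_rate_standard_error(p_hat, n)
    print(f"n = {n:2d}: {p_hat:.0%} +/- {1.96 * se:.0%} (approx. 95% interval)")
```

The normal approximation behind the 1.96 multiplier is crude for very small n, but the direction is clear: the fewer independent observations you really have, the less the point estimate tells you.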
My second criticism of literally all of the research on Safe Withdrawal Rates I have come across thus far [again, COUNTER-EXAMPLES WELCOME!!!] is that they are based on (mostly) unstated assumptions which (again, IMNSHO) are simply naïve: an assumption that returns are, to varying degrees, independently and identically distributed over time, and free from survivorship/observation bias. To understand why this is dangerous, I first need to give you two examples.
First example: A Year of the SPX
(Lots of data doesn’t necessarily mean lots of relevant information)
First, suppose that I give you a year (a whole year!) of minute-by-minute historical data on the S&P 500 Index. That’s six and a half hours per day, for 253 days: almost 100,000 data points! That’s a lot of data, right? (Indeed, this is even more rows of data than could fit in an Excel 2003 spreadsheet.) So what’s my expected return for the following year? There are many possible answers to this question, including:
- High-School Stats Student Answer:
exp(average-of-log-of-one-minute-return * 253 days * 6.5 hours/day * 60 minutes/hour) - 1
- Finance Student Answer:
the same as this year
- Bayesian Finance Answer:
Following my first criticism, we can also report the standard errors for each of these answers:
- High-School Stats Student Answer:
standard deviation of log-of-one-minute-return *
sqrt(253 days * 6.5 hours/day * 60 minutes/hour)
(i.e., the standard error of the average, scaled up by the number of minutes in the year)
- Finance Student Answer:
Not defined (or infinity, if you prefer)
- Bayesian Finance Answer:
Depends on my prior (i.e., whatever I want it to be)
First, note that all of these answers are potentially correct; they just involve various unstated assumptions. What’s perhaps “wrong” with the High-School Stats Student Answer? It implicitly assumes that the data generating process over a one-minute investment horizon is the same as the data generating process over a one-year investment horizon (i.e., that the returns are i.i.d.). Our hypothetical Finance Student doesn’t think this is a reasonable assumption, perhaps aided by the fact that I’ve made the example rather egregious/obvious by looking at minutes rather than, say, months. (Yes, I’m looking at you, Gene Fama and Ken French.) A potential issue with the “Finance Student Answer” becomes apparent if I reveal to you that I’ve given you the returns for 2008 (i.e., that your in-sample may not be as representative of your out-of-sample as you think).
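For the curious, here is a rough Python sketch of the High-School Stats Student calculation. The minute-by-minute data is simulated (i.i.d. normal log returns with made-up drift and volatility), not real S&P 500 data, and the function name annualized_estimate is hypothetical. It scales the standard error of the one-minute average by the number of minutes in a year, which under i.i.d. is the same as the one-minute standard deviation times the square root of that number.

```python
import math
import random
import statistics

MINUTES_PER_YEAR = 253 * int(6.5 * 60)  # 253 days * 390 minutes = 98,670


def annualized_estimate(minute_log_returns, horizon=MINUTES_PER_YEAR):
    """Scale one-minute log returns to a longer horizon, assuming i.i.d.

    Returns (expected simple return, standard error of the horizon log return).
    """
    mean = statistics.fmean(minute_log_returns)
    stdev = statistics.stdev(minute_log_returns)
    se_of_mean = stdev / math.sqrt(len(minute_log_returns))
    expected_return = math.exp(mean * horizon) - 1.0
    se_horizon_log_return = se_of_mean * horizon
    return expected_return, se_horizon_log_return


# Simulated data with made-up parameters: roughly 6% drift and 16% volatility
# per year, spread evenly over one-minute intervals.
rng = random.Random(0)
mu = 0.06 / MINUTES_PER_YEAR
sigma = 0.16 / math.sqrt(MINUTES_PER_YEAR)
sample = [rng.gauss(mu, sigma) for _ in range(MINUTES_PER_YEAR)]

expected, se = annualized_estimate(sample)
print(f"expected return: {expected:.1%}, standard error (log return): {se:.1%}")
```

With these made-up parameters the standard error lands close to the assumed 16% annual volatility. Almost 100,000 data points, and the uncertainty about next year’s return is still about as large as a single year’s swing.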
Second example: Birthday cake
(Care is needed when extrapolating in-sample to out-of-sample)
Suppose that I’m conducting research on cake-satisfaction among five-year-olds, and that I have a large birthday cake, from which I’m magically able to sample with replacement. With some deft spatula usage, I’m able to generate Monte Carlo bootstrapped birthday cakes to offer to various test groups of five-year-olds. Just to be careful, I also create some overlapping samples of subsets of the cake (sampling without replacement). Based on my research, I am able to compute all sorts of statistics regarding how satisfied my test groups are with the various bootstrapped/sampled cakes.
The problem here is that I am only sampling/resampling from a single birthday cake, and I could draw totally different inferences depending on whether the cake was chocolate-flavored or truffle-flavored. Just like in the S&P 500 example above, I should consider the fact that the sample I have may not be fully representative of all cakes/years.
What on earth does this have to do with investing, let alone Safe Withdrawal Rates? The historical returns we have observed over the past 30, 50, 100 years are like the birthday cake. We only have the one cake. We don’t know what the other cakes might be like. Well, we do a little bit. Most analyses focus on U.S. data. Let’s qualitatively consider how other countries fared in the 20th century: investors in France, Germany, Italy, Japan, Iran, Afghanistan, China, Korea, Russia, indeed most of the world’s population would have lost (at some point) most if not all of their savings. U.S.-based investors have been extraordinarily lucky for effectively all of the modern era of investing, and it is naïve to think that this will be repeated. Not only do we only have a single observation to work with, that single observation is likely not particularly informative.
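The one-cake problem can be sketched in a few lines of Python. This is a toy simulation under loudly hypothetical assumptions: each “country” gets its own long-run mean return drawn from a made-up distribution, returns are normal with made-up volatility, and all names are invented for illustration. It compares the uncertainty a bootstrap reports when it resamples a single history with the spread you actually see across many histories.

```python
import random
import statistics

rng = random.Random(42)
YEARS = 100


def simulate_history(true_mean, volatility=0.18, years=YEARS):
    """One 'cake': a century of hypothetical annual returns for one country."""
    return [rng.gauss(true_mean, volatility) for _ in range(years)]


def bootstrap_means(history, draws=1000):
    """Average return of many resamples (with replacement) of one history."""
    n = len(history)
    return [statistics.fmean(rng.choices(history, k=n)) for _ in range(draws)]


# Hypothetical assumption: each country's long-run mean return is itself random.
country_means = [rng.gauss(0.03, 0.04) for _ in range(200)]
histories = [simulate_history(m) for m in country_means]

one_cake = histories[0]
within_cake_spread = statistics.stdev(bootstrap_means(one_cake))
across_cakes_spread = statistics.stdev([statistics.fmean(h) for h in histories])

print(f"bootstrap spread from one history:   {within_cake_spread:.3f}")
print(f"spread of averages across histories: {across_cakes_spread:.3f}")
```

In this toy setup the bootstrap can only see the year-to-year noise inside the one history it was given. It has no way of knowing how different the other cakes are, so it reports less uncertainty than there really is.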
So… WTF to do now?
First, if you’re trying to figure out a Safe Withdrawal Rate for yourself, don’t panic. Much of the research out there is informative. It just needs to be taken with a grain of salt. There is no certainty, there are no perfect rules/answers, and applying a good Withdrawal Rule is a lot more like driving a car (constantly looking around, maybe making adjustments) than it is following a (birthday) cake recipe.
Second, I highly recommend reading (global) history. A long-time favorite of mine is “The Pessimist’s Guide to History” by the Flexners, but there are many great sources.
Third, if you’re among the crowd of folks doing research on Safe Withdrawal Rates, I would urge you to please take a more global/Bayesian approach to your probabilities, and to acknowledge more explicitly that the returns of the past century offer few effective observations and simultaneously suffer from serious survivorship bias (particularly in U.S. data).
Disagree? Agree? Think I’m overlooking something? Please let me know in the comments section below!
Note: Many thanks to a number of friends who read drafts of this article and generously provided very helpful feedback.