Essay 4
The Forecaster You Already Are
How to get better at the predictions you're already making all day.
You wake up, look out the window, and predict the weather. You guess what mood your partner is in from the noise they made when they got up. You estimate, without thinking, how long it will take to get to the bathroom, then how long it will take to get from there to the kitchen. You read a half-line of a message from your colleague and forecast how the rest of the workday is going to go. Before you’ve had any coffee, before you’ve done anything that feels like thinking, you have made a few dozen small predictions, half of which you’ve already partly forgotten, almost all of which are revising themselves quietly as new information comes in.
A team of researchers (the psychologists Martin Seligman and Roy Baumeister, the philosopher Peter Railton, the neuroscientist Chandra Sripada) argued in a 2016 book that this is not a side activity of the human mind. It is the mind. They renamed our species, half-jokingly, Homo prospectus, the future-oriented animal. The brain, in their account, is a prediction machine that runs all the time, mostly under the surface of awareness, and the conscious mind is something like a status panel that reads out a small fraction of what’s already been forecast.
This is not how popular writing usually talks about prediction. It tends to focus on what experts do when forecasting big public events: elections, recessions, geopolitical crises. Those are interesting questions, and we’ll come back to them. But they are not, for most people, the useful questions. The useful questions are about the predicting you’re doing right now, often without noticing, about your own life. About whether you’ll be glad you took the job. About whether the renovation will really cost what you’ve budgeted. About whether you’ll like the new city. About whether your friend is upset with you or just tired.
It turns out there is a real literature on these. It is more honest, in places, than the geopolitical kind. It is also a lot more humbling, because the systematic errors it documents are ones that we all make every day, mostly without realizing it.
The single most replicated finding about forecasting your own life
Two researchers, Tim Wilson and Dan Gilbert, spent most of the 1990s and 2000s on a deceptively simple question: how good are people at predicting how they will feel about future events?
The framing matters. They weren’t asking whether people could predict the events themselves, whether you’d get the promotion or the team would win or the marriage would last. They were asking the next-step question: assuming the event happens, how strong and how long-lasting will your emotional reaction be?
The answer turned out to be that almost everybody, across almost every kind of event, is wrong in the same direction. They called the error the impact bias, and what it says is this: people systematically overestimate how strongly they will feel about future events, and how long the feeling will last. Both positive and negative.
The promotion, when it actually comes, doesn’t make you as durably happy as you imagined it would. The breakup doesn’t make you as durably miserable. The new car you wanted for two years gives you about a month of glow and then becomes background. The diagnosis that terrified you, after you adjust to it, takes up a smaller and smaller fraction of your waking attention, sometimes much faster than you thought it would. Across dozens of studies of lottery winners, accident survivors, election losers, tenure denials, sports fans, students starting and ending relationships, the same pattern shows up: the events matter less, and for less time, than the people involved imagined they would in advance.
Wilson and Gilbert have a name for what’s doing the work here. They call it the psychological immune system: a set of largely unconscious processes by which the mind recasts, reframes, normalizes, and finds the upside in difficult events, and by which it also quietly absorbs and habituates to the rewarding ones. The system is reliable, fast, and almost completely invisible from inside. And here’s the punchline: we forget that it exists when we forecast. We imagine our future feelings as if they will happen to a defenseless version of us, without the immune system kicking in. They called this oversight immune neglect.
What this means in practice is small and large at once. Most of the big life decisions that feel high-stakes (taking the job, leaving the city, ending the relationship, starting the new business) are being made with a hot, vivid, emotionally amplified forecast of what life will be like after. That forecast is, on average, wrong in a specific direction: it overweights the event itself and underweights everything else about your life that will continue to be true after the event happens. The job will be a job. The new city will become normal. The new relationship will be a relationship.
One small, surprising finding from this body of research is worth holding onto, because it has a practical fix. People are bad at simulating their own future feelings; they are much less bad at predicting how they would feel if they could just ask someone currently in the situation. The technical name for this is surrogation: ask someone who’s already living the thing you’re considering, not for their overall verdict on it but for how they actually feel day to day, and weight their report over your own imagination. The catch is that we don’t like doing this. We like our own forecast. The imagined version is more vivid, more personal, more ours. The surrogate’s report is more accurate.
“How sure are you?” as a daily habit
Around the same time the affective-forecasting research was building, a former professional poker player named Annie Duke was writing about a different angle on the same problem. Duke had spent two decades at high-stakes poker tables, where every decision is a bet on a probabilistic outcome and the outcome is visible within minutes. After she left poker she started writing about what the rest of us could learn from how poker players think, and her book Thinking in Bets turned into one of the few popular treatments of decision-making that doesn’t oversimplify the math.
The core move she advocates is simple to state, and harder to actually do: attach a probability to your claims.
Most people, most of the time, talk about the future in categories. Probably. Likely. No way. I doubt it. I’m pretty sure. These categories sound precise but, when you measure them, turn out to have wildly different meanings to different people, and even to the same person on different days. Probably might mean 60% to you and 85% to me. Two intelligent colleagues can agree that an outcome is likely while privately meaning numbers twenty-five points apart. When the outcome is decided, neither of them really knows whether they were right.
The bet reframe is to make yourself say a number. I think there’s a 70% chance this candidate works out. I think there’s a 30% chance we’ll hit the launch deadline. I’d say 85% chance this dish is fine to eat by tomorrow. The number is uncomfortable; it commits you to something, and that’s the whole point. Likely you can walk back. 70% you cannot.
The discomfort produces two real benefits. The first is that, over time, you can find out whether the number was any good. If you said 70% to fifteen things and ten of them happened, your 70% is well-calibrated; if only four of them happened, it isn’t. This kind of feedback is the only path to actually getting better at forecasting, and it requires you to commit to a number you can be wrong about. The second benefit, and Duke makes a lot of this one, is that the number protects you from a specific recurring mistake she calls resulting: judging a decision purely on how it turned out, rather than on whether it was a sensible bet at the time you made it. You can lose money on a poker hand where you played it perfectly because the cards just didn’t come, and you can win on a hand you played badly. Resulting is what happens when you confuse the two. Bad decisions sometimes work out; good decisions sometimes don’t. Knowing which is which requires knowing what the bet actually was.
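To see what that bookkeeping looks like in practice, here is a minimal sketch in Python; the log format and function name are mine, purely for illustration. Record each forecast as the probability you said plus whether the thing happened, then group by stated probability and compare it to the actual hit rate.

```python
from collections import defaultdict

def calibration_report(forecasts):
    """Compare each stated probability to how often those events happened.

    `forecasts` is a list of (stated_probability, happened) pairs,
    e.g. (0.70, True) for a "70% chance" call that came true.
    """
    buckets = defaultdict(list)
    for p, happened in forecasts:
        buckets[round(p, 1)].append(happened)  # bin to the nearest 10%
    for p in sorted(buckets):
        hits = buckets[p]
        print(f"said {p:.0%} -> happened {sum(hits) / len(hits):.0%} ({len(hits)} forecasts)")

# Duke's example: fifteen things you called at 70%, ten of which happened.
calibration_report([(0.70, True)] * 10 + [(0.70, False)] * 5)
# said 70% -> happened 67% (15 forecasts): close to well-calibrated
```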
This is also, by the way, the move that should be in your back pocket every time the popular discourse turns to “well, X happened, so Y must have been the right call.” X may have happened despite Y, not because of Y. Or because of luck. The question is whether the probability of X was raised by Y at the time of the decision, which requires knowing what you thought the probability was at the time.
A small numeracy trick that almost no one teaches
The third piece of the everyday-forecasting puzzle is older than the others and has been quietly transforming medical communication, courtrooms, and increasingly public-health messaging. It has barely made the leap into general writing on judgment.
A German psychologist named Gerd Gigerenzer, who has spent his career on the bounded rationality tradition (heuristics as adaptive tools, not bugs), discovered something in the 1990s that he could not stop talking about: people get much better at probability problems when the problem is restated in a different but equivalent form.
Here is the classic example, slightly cleaned up. Suppose a test for a rare disease is 99% accurate in both directions: a sick person tests positive 99% of the time, and a healthy person tests negative 99% of the time. The disease affects about 1 in 1,000 people. If a randomly selected person tests positive, what’s the chance they actually have the disease?
Almost everyone, including most doctors in repeated studies, says something close to 99%. The right answer is closer to 9%.
Now restate the same problem in natural frequencies: out of every 1,000 people, about 1 actually has the disease, and they will test positive. Of the remaining 999 healthy people, the 99%-accurate test will incorrectly flag about 10 as positive. So out of all the positives you see, roughly 11, only 1 actually has the disease. About 1 in 11. About 9%.
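Both presentations are, of course, the same computation. If you want to check the arithmetic yourself, here is the whole thing in a few lines of Python, done both ways:

```python
# Numbers from the example above.
prevalence = 1 / 1000    # 1 in 1,000 people has the disease
sensitivity = 0.99       # a sick person tests positive 99% of the time
specificity = 0.99       # a healthy person tests negative 99% of the time

# In probability form (Bayes' rule):
true_positive = sensitivity * prevalence                  # 0.00099
false_positive = (1 - specificity) * (1 - prevalence)     # 0.00999
print(true_positive / (true_positive + false_positive))   # ~0.09, about 9%

# In natural-frequency form, per 1,000 people:
sick = 1000 * prevalence                             # 1 person
flagged_healthy = (1000 - sick) * (1 - specificity)  # ~10 healthy false positives
print(sick * sensitivity / (sick * sensitivity + flagged_healthy))  # same ~9%
```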
Same math. Different presentation. In repeated studies, simply switching from percentages-and-probabilities to natural-frequencies form has improved correct answers (for laypeople, for doctors, for federal judges) by as much as 60 percentage points.
The lesson Gigerenzer draws from this is profoundly democratic, and worth taking seriously. People aren’t bad at probability. The format we usually present probability in is bad for the human mind. When you give the same information in a form that matches how the brain naturally tracks frequencies (out of every N, M did this), the brain handles it correctly. This is the opposite of the popular “humans are irrational and broken” story you sometimes hear. It’s: humans are reasonable, if you give them information in a form they can use.
The practical version, for everyday life: any time someone tells you a percentage that matters, whether it’s a medical risk, a project’s likelihood, or an investment’s expected return, convert it in your head to out of every hundred (or thousand). A 23% risk becomes 23 out of every 100 people like me. A 1% chance of recurrence becomes 1 in 100. The numbers stay the same. Your brain becomes better at them.
The brain doing all this work has been described, in the past fifteen years of cognitive neuroscience, as a prediction machine. The phrase belongs to a research tradition called predictive processing, associated with Anil Seth, Lisa Feldman Barrett, Andy Clark, Karl Friston, and others. The strong version of the claim — that essentially all of perception, action, and even emotion is the brain running a continuous internal model and updating it against incoming sensory data — is contested. The weaker version is increasingly mainstream: the brain is doing more forecasting, more continuously, than common-sense psychology assumes. What you experience as seeing the room is mostly your brain’s best guess about what’s in the room, corrected by sensory data when the guess is wrong. What you experience as feeling sad is, in part, your brain’s prediction about the state of your body, made plausible by context.
If even a soft version of this is true, then everything else in this essay isn’t a separate cognitive activity called “forecasting.” It’s the same machine, running its standard operation, taking different inputs. The biases, the heuristics, the impact bias, the resulting habit, the probability-format mismatch: all of them are different consequences of one machine doing one thing under different conditions. This is a useful frame to keep in the back of your head, but I would not try to do too much with it; the technical machinery is its own subject, and the popularization can quickly slide into things that the field does not actually claim. The frame says: prediction is the cognitive default, not a special activity. Everything else in the bias literature sits on top of that.
Try it on yourself
The point of this whole literature, when it lands, is that calibration is something you can build through practice with feedback, applied to predictions about your own life, not predictions about world events. Most of us have never given ourselves the feedback. We’ve just lived through the predictions and forgotten them.
The tool below has two modes. The first is a quick in-session round: a small set of questions with verifiable historical answers, where you assign a probability to each and the tool computes your Brier score, the same scoring method used in the academic forecasting tournaments. The second is a personal log mode that lets you record predictions about your own coming week (Will I finish this task by Friday? Will I enjoy the dinner on Saturday, on a 1–10 scale? What time will I actually leave the house tomorrow?) and compare what you predicted to what happened.
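The Brier score itself is nothing exotic: it is the mean squared error between the probabilities you stated and what actually happened, where 0 is perfect and lower is better. A minimal sketch, using the same (probability, outcome) log format as the calibration sketch above. Note this is the common binary form of the score; the tournament literature uses a variant that sums over answer options, which exactly doubles the numbers for yes/no questions.

```python
def brier_score(forecasts):
    """Mean squared error between stated probabilities and outcomes.
    0.0 is perfect; always saying 50% earns 0.25; lower is better.
    `forecasts` is a list of (stated_probability, happened) pairs."""
    return sum((p - float(happened)) ** 2 for p, happened in forecasts) / len(forecasts)

print(brier_score([(0.70, True), (0.90, True), (0.20, False)]))  # ~0.047
```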
Neither mode is going to make you a superforecaster. The point is to make your own miscalibration visible, since most of us have never seen our own, and to give you something to practice against.
The benchmark, briefly
For context on what well-calibrated everyday forecasting looks like, the most rigorous large-scale data come from a research program Philip Tetlock has run for almost forty years. The first phase, his Expert Political Judgment study, ran from 1984 to 2004 and tracked the predictions of about three hundred professional pundits over two decades. The headline finding from that phase, repeated so often it has become a kind of folklore, is that the average expert performed roughly at the level of a chimpanzee with darts. This is, as it sounds, a slight overstatement; what Tetlock actually found was that experts were marginally better than chance and considerably worse than basic statistical baselines, with a small subgroup who did meaningfully better.
The second phase of his work, the Good Judgment Project, ran from 2011 to 2015 under sponsorship from the U.S. Intelligence Advanced Research Projects Activity, and identified that small subgroup empirically through a four-year forecasting tournament. They came to be called superforecasters. Their Brier scores, roughly 0.10 to 0.15, were dramatically better than the typical expert’s 0.30+ and very close to the theoretical ceiling on the questions involved. They weren’t smarter on average; they were unusually disciplined about a small set of habits, including the ones that have shown up throughout this essay: starting with the historical base rate; breaking complex questions into smaller ones; using granular probabilities, not categories; updating in small increments as evidence comes in; and a disposition toward treating their own opinions as hypotheses to be revised rather than positions to defend.
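The last two habits, granular probabilities and small-increment updates, have a clean mechanical form if you want one: hold your belief as odds, multiply by a likelihood ratio for each new piece of evidence, and convert back to a probability. A sketch with made-up numbers, purely to show the shape of the move:

```python
def update(prob, likelihood_ratio):
    """One Bayesian update in odds form. The likelihood ratio is how many
    times more likely the evidence is if the claim is true than if false."""
    odds = (prob / (1 - prob)) * likelihood_ratio
    return odds / (1 + odds)

# Made-up example: start at a 30% base rate that a project ships on time,
# then fold in two mildly encouraging pieces of evidence.
p = 0.30
p = update(p, 1.5)   # a key dependency landed early -> ~39%
p = update(p, 1.3)   # scope review came back clean -> ~46%
print(f"{p:.0%}")    # 46%
```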
You are unlikely to outperform a superforecaster on geopolitics. The geopolitical questions are obscured environments, in the language of the first essay, where decades of structured practice barely pay off. But you almost certainly have clearer, more important calibration work to do in the areas of your own life that you have actual feedback on. Will you really like the new neighborhood? Will the project really ship on time? Will you really enjoy the trip? Will you really stick with the new habit? These are the questions where the calibration trainer above earns its keep. The superforecaster research is the benchmark. The work is yours.
One essay left. So far I’ve been talking about your own judgment: how it works, where it doesn’t, what helps, and how to look at it honestly. The last piece of the picture is what happens when these same patterns get scaled into machines, trained on our data, deployed back to us, and made into the silent partners of an increasing fraction of consequential decisions. It is not a story about robots becoming smarter than us. It is a story about our own habits being handed back to us, at scale, faster than we can think them through.