1-Page Summary

Superforecasting is the result of decades of research on “superforecasters”: people who can predict future events with accuracy far better than chance. Superforecasters are intelligent, but more importantly, they’re open-minded, deeply curious, and adept at sidestepping their own cognitive biases. Not everyone is cut out to be a superforecaster, but by studying the way superforecasters make predictions, anyone can improve their ability to predict the future.

Superforecasting has two authors: Dan Gardner, a journalist and author of three books on the science of prediction; and Philip Tetlock, a psychologist and pioneering forecasting researcher. Tetlock led two major, research-focused forecasting projects: the Expert Political Judgment study and the Good Judgment Project, which he co-founded.

Part 1: Forecasting Basics

When the authors use the term “forecasting,” they’re referring to formal predictions expressed with numerical probabilities. To appreciate the value of forecasting, we have to frame it the right way. Tetlock learned this the hard way when the data from his pioneering research found that the majority of expert forecasts were no more accurate than chance (which the popular press misinterpreted to mean “forecasting is pointless”). Additionally, the predictions these experts made within their fields of expertise were less accurate than predictions they made outside their fields. In other words, intelligent analysts who invested time and effort into researching the issues were no more able to predict future events than if they’d guessed randomly.

(Shortform note: Why are experts seemingly so inaccurate, even within their own fields? Economics researcher Bryan Caplan points out one possible explanation of this core finding: Tetlock purposefully asked the experts challenging questions about their fields. Caplan surmises that when faced with these questions, experts become overconfident in their predictions, hence why they’re incorrect more often. He argues that if Tetlock had asked questions to which there are already well-established answers within the experts’ field (which Tetlock deliberately didn’t do), the experts’ prediction accuracy would have been higher. Caplan concludes that while forecasters admittedly need to stop being so overconfident in response to challenging questions, people should equally stop claiming that experts are useless at prediction: They can be accurate in the right circumstances.)

Superforecasters

The authors argue that contrary to the media’s representation of Tetlock’s research, these results don’t mean that there is no value in forecasting. What Tetlock’s team discovered was that certain kinds of forecasters could make certain kinds of predictions with an accuracy much higher than chance. These forecasters, whom Tetlock calls “superforecasters,” apply a specific methodology to come up with their predictions, and they only make predictions a year into the future or less. Any further out, and accuracy rates drop dramatically. But when superforecasters apply their skills to short-term questions, they’re remarkably accurate. (Shortform note: We’ll be discussing superforecasters, their methods, and their traits extensively in Part 2.)

Measuring Forecasts

Forecasting accurately is incredibly difficult, but determining whether a forecast is accurate in the first place presents difficulties of its own. According to the authors, a forecast judged by standards other than those the forecaster intended may be deemed a failure even when it isn’t.

For example, in 2007, Steve Ballmer, then-CEO of Microsoft, claimed that there was “no chance” that Apple’s iPhone would get “any significant market share.” In hindsight, this prediction looks spectacularly wrong, but the wording is too vague to truly judge. What did he mean by “significant”? And was he referring to the US market or the global market?

According to the authors, these questions matter because the answers lead us to very different conclusions. Judged against the US smartphone market (where the iPhone commands 42% of the market share), Ballmer is laughably wrong. But in the global mobile phone market (not just smartphones), that number falls to 6%—far from significant. (Shortform note: In 2009, Ballmer admitted to seriously underestimating the iPhone, in effect contradicting the authors: Even Ballmer thinks that the prediction was bad after all.)

Judging the “Worst Tech Predictions” of All Time

Hero Labs, a technology company, compiled a list of 22 of the “worst tech predictions of all time,” including Ballmer’s infamous quip. However, unlike Ballmer’s forecast, most of the other predictions on the list are specific enough to judge.

The History of Superforecasting

Tetlock first discovered that some forecasters are more accurate than others thanks to a decades-long study called “Expert Political Judgment” (EPJ). He split the forecasters in that study into two groups based on their performance: one that did no better (and sometimes much worse) than chance, and another that did slightly better.

Tetlock named the first group “Hedgehogs” and the second group “Foxes,” based on Isaiah Berlin’s classic philosophy essay entitled “The Hedgehog and the Fox” (the title comes from a line from an ancient Greek poem: “The fox knows many things but the hedgehog knows one big thing”). (Shortform note: “Knowing many things” is similar to the psychological concept of active open-mindedness. Actively open-minded thinkers are willing to consider other people’s ideas and opinions instead of clinging to their own point of view. Tetlock found that superforecasters tend to be more actively open-minded than most people.)

According to the authors, forecasters in the “hedgehog” group are passionately ideological thinkers who see the world through the lens of a Big Idea. They organize new information to fit their Big Idea, and they ignore any information that doesn’t fit that paradigm. The “Big Ideas” themselves vary widely from liberal to conservative and everything in between. Foxes, on the other hand, are “eclectic experts” who have a wide range of analytical tools at their disposal rather than a single Big Idea. The authors argue that this allows foxes to be more flexible, changing their approach based on the particular problem. Foxes vastly outperformed hedgehogs in the EPJ.

Foxes Make Better Forecasters, Hedgehogs Make Better CEOs

A fox mentality is not always preferable to a hedgehog mentality. In fact, in some fields, hedgehogs have a distinct advantage. For example, while researching the book Good to Great, author Jim Collins and his research team interviewed leaders of companies that vastly outperform other companies in their respective industries. Collins found that the CEOs of great companies were overwhelmingly hedgehogs, not foxes. In fact, the hedgehog mentality was so crucial to success that these companies tended to organize their entire business models around the hedgehog’s one Big Idea.

Based on what we’ve learned so far about foxes and hedgehogs, this might be somewhat counterintuitive. If foxes make better forecasters, and CEOs need to be able to predict the outcomes of their business decisions, shouldn’t CEOs strive to be foxes? Collins argues that a fox mentality considers too many variables, which ultimately distracts leaders from pursuing the company’s core mission (or, as Collins calls it, their “Hedgehog Concept”). Instead, strong business leaders should distill their strategy into the answers to three questions: What can I do better than anyone else? What is my financial engine? And what am I most passionate about? The combined answers to these questions form the core of the Hedgehog Concept, and pursuing that central concept—without getting distracted by outside factors—is what takes a company from good to great.

Part 2: Traits of Superforecasters

The authors argue that what makes superforecasters truly “super” isn’t how smart they are—it’s the way they use their intelligence to approach a problem. Let’s examine the various ways they do this in detail.

Trait 1: They Avoid Cognitive Biases

According to the authors, the reason superforecasters make such accurate predictions is that they’re adept at avoiding cognitive biases. We’re all prone to certain cognitive biases that stem from unconscious thinking, or what psychologists like Tetlock often describe as “System 1” thinking. These biases skew our judgment, often without us even noticing. Superforecasters constantly monitor and question their System 1 assumptions. (Daniel Kahneman describes the two-system model of thinking in Thinking, Fast and Slow; System 2 governs conscious, deliberate thinking, while System 1 functions automatically and unconsciously.)

(Shortform note: Kahneman argues that System 1 is also prone to making snap predictions. For example, when we see someone with an angry expression, System 1 automatically predicts that person is about to start yelling. Those automatic predictions could be another mental trap for prudent forecasters to avoid.)

Trait 2: They Generate Multiple Perspectives

Superforecasters also rely on aggregated judgment (aggregation is the process of combining data from multiple sources). Tetlock and Gardner argue that aggregation is a powerful tool for forecasters because the aggregated judgment of a group of people is usually more accurate than the judgment of an average member of the group.

Superforecasters use aggregation by pulling from many sources and using many tools to produce an answer, despite being just one person. This skill doesn’t come naturally to most people—we struggle to even recognize that there are other perspectives beyond our own “tip-of-the-nose” view, let alone fully consider those ideas. This is part of what sets superforecasters apart.
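
To make aggregation concrete, here is a minimal sketch in Python (with made-up numbers, not data from the book): averaging several forecasters’ probability estimates yields a combined forecast whose error is at most, and usually below, the error of the average individual forecast.

```python
# Toy illustration of aggregation (all numbers are made up): combine several
# forecasters' probability estimates for the same yes/no question by averaging.
estimates = [0.55, 0.70, 0.40, 0.65, 0.60]   # five individual forecasts
aggregate = sum(estimates) / len(estimates)   # 0.58

# Suppose, hypothetically, the event does happen (outcome = 1). Compare squared
# errors: the aggregate's error is smaller than the average individual error
# whenever the individual estimates disagree with one another.
outcome = 1.0
individual_sq_errors = [(p - outcome) ** 2 for p in estimates]

print(f"Aggregate forecast:           {aggregate:.2f}")
print(f"Average individual sq. error: {sum(individual_sq_errors) / len(estimates):.3f}")
print(f"Aggregate's sq. error:        {(aggregate - outcome) ** 2:.3f}")
```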

How to Generate New Perspectives Like a Superforecaster

To think like a superforecaster, you need to look beyond the tip-of-your-nose view and find new ways of viewing a problem, such as seeking out information from sources you don’t personally agree with or asking how someone with a different background would frame the question.

Trait 3: They Think in Probabilities

According to the authors, superforecasters are probabilistic thinkers. This goes beyond just phrasing their forecasts in terms of probability percentages. In situations where most of us think in terms of black and white, superforecasters think in shades of grey.

Most people’s mental “probability dial” has three distinct settings: yes, no, and maybe. By contrast, probabilistic thinkers have an unlimited number of settings. They’re more likely to answer questions in terms of percentages rather than “yes” or “no.” And this is not just in the realm of forecasting—this is how superforecasters normally think and speak in their everyday lives.

(Shortform note: Superforecasters’ emphasis on probabilistic thinking may help to explain the gender gap among superforecasters, who tend to be male. Young children perform about the same on tests of probabilistic thinking regardless of gender—however, by age 10, boys tend to outperform girls, a pattern that holds true for many other math skills. This could be because, as Sheryl Sandberg argues in Lean In, social norms discourage girls from pursuing math and science.)

Trait 4: They Think From the Outside In

Tetlock and Gardner argue that when superforecasters first encounter a question, they begin by looking at the broader context of that question before accounting for the specifics (in Thinking, Fast and Slow, Daniel Kahneman calls that wider perspective the “outside view”). Compare this to the “inside view,” which describes the particular details of a situation.

For example, imagine someone tells you about their physician friend, Dr. Jones, and asks you to estimate the likelihood that Dr. Jones is a pediatrician. If you start with the inside view, you’ll analyze the specifics of Dr. Jones’s life and personality and make predictions based on what you find. The trouble is, specifics can often lead us to make random and extreme guesses. If we’re told that Dr. Jones loves children and worked at a summer camp for sick children during college, we might say it’s 80% likely that Dr. Jones is a pediatrician. On the other hand, if we’re told that Dr. Jones is a very serious, reserved person and has no plans to become a parent, we might swing to the other extreme and guess 2%.

In contrast, if you start with the outside view, you’ll ignore any details about the specific person. Instead, you’d try to answer the question “What percentage of doctors specialize in pediatrics overall?” This gives you a base rate from which to calibrate your prediction, which is more likely to lead to an accurate forecast than if you begin with a random “inside view”-inspired guess.
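
As a rough sketch of this outside-in process (the base rate and the strength of the evidence below are hypothetical numbers, not figures from the book), you can anchor on the base rate and then nudge it only modestly in light of inside-view details, for example with a simple Bayesian update:

```python
# Hypothetical sketch: start from the outside view (a base rate), then adjust
# modestly for inside-view evidence using Bayes' rule. All numbers are made up.
base_rate = 0.07  # assumed share of doctors who are pediatricians (hypothetical)

# Inside-view evidence: Dr. Jones loves children and worked at a camp for sick
# kids. Assume (hypothetically) this profile is 3x more likely if Dr. Jones is
# a pediatrician than if not.
likelihood_ratio = 3.0

prior_odds = base_rate / (1 - base_rate)
posterior_odds = prior_odds * likelihood_ratio
posterior = posterior_odds / (1 + posterior_odds)

print(f"Outside view alone: {base_rate:.0%}")                # 7%
print(f"After the inside-view adjustment: {posterior:.0%}")  # ~18%, not a swing to 80% or 2%
```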

Master the Outside View With a Premortem

In Thinking, Fast and Slow, Daniel Kahneman advises using a “premortem” analysis to avoid the dangers of inside-out thinking. A premortem analysis is a mental exercise in which you imagine that whatever you’re working on (be it a project or a forecast) has already come to fruition—and was a complete disaster. Your goal is to come up with as many reasons as possible to explain this hypothetical “failure.”

This approach is helpful because, by nature, the inside view makes a situation feel “special” because it predisposes you to focus on what makes the situation unique. That feeling can make it more difficult to notice biases in your answer because you might assume the current situation won’t abide by the usual “rules.” For example, most newlyweds probably don’t expect to ever get divorced, despite the 40-50% divorce rate. That’s because, from the inside, the relationship feels “special” or distinct from the relationships that ended in divorce.

The premortem technique can help you reorient to the outside view because assuming your answer is incorrect will likely force you to recognize that the specifics of this situation aren’t as important as the base rate. For example, if you’re predicting whether a startup will succeed, it’s tempting to take the inside view and make your forecast based on the business model or the founder’s previous business experience. However, if you try a premortem analysis, it will be easy to come up with reasons the company failed given that the failure rate for startups is roughly 90%. That sobering statistic can help remind you that even if the inside view looks like a recipe for success, the odds are stacked so strongly against new businesses that failure is much more likely.

Trait 5: They Have a Growth Mindset

Forecasting involves quite a bit of failure because forecasters are asked to predict the unpredictable. While no one enjoys being wrong, the authors argue that superforecasters are more likely than regular forecasters to see their failures as an opportunity to learn and improve. Educational psychologists call this a “growth mindset.” People with a growth mindset believe that talent and intelligence can be developed through learning and practice.

The idea behind the growth mindset seems intuitive, but in practice, the authors report that most of us gravitate towards a “fixed mindset” instead. The fixed mindset tells us that talent and intelligence are traits we’re born with, so practice can only strengthen the natural abilities that are already there.

Grow Your Own Growth Mindset

Like many other superforecaster skills, a growth mindset isn’t an inborn trait—it can be grown and developed with practice. In Mindset, psychologist Carol Dweck lays out a few concrete tips to help you transition from a fixed mindset to a growth mindset.

Trait 6: They’re Intellectually Humble

According to the authors, superforecasting requires the humility to admit when you don’t know the answer and to acknowledge that bias might cloud your judgment. This is called intellectual humility, which is an acknowledgment of the power of randomness. It involves admitting that some things are impossible to predict or control, regardless of your skill.

Professional poker player Annie Duke describes this as the difference between “humility in the face of the game” and “humility in the face of your opponents.” In other words, Duke’s long record of success indicates that she is an exceptionally talented poker player and is probably more skilled than most of her opponents. But all of Duke’s skill and experience don’t guarantee that she will win every game or that she is even capable of fully understanding every possible intricacy. Like a superforecaster, she can use her skills to beat her opponents but not the game itself.

To Foster Humility, Understand the Role of Luck in Success

Annie Duke’s distinction between “humility in the face of the game” and “humility in the face of your opponents” reflects author Nassim Taleb’s views on luck and success. In Fooled by Randomness, Taleb argues that, while skill is a good predictor of moderate success, luck is a better predictor of wild success.

Similarly, Duke understands that winning at poker requires a certain degree of luck; if she were extremely skilled but terribly unlucky, she’d be able to carve out a decent record, but she certainly wouldn’t be the champion player she is now. Therefore, Duke is able to remain humble because she understands that no matter how well she plays, she’s always one streak of bad luck away from a loss.

Trait 7: They’re Team Players

In forecasting tournaments, superforecasters work in teams to create forecasts. According to the authors, one feature of successful teams is the way they freely share resources with each other. Psychologist Adam Grant calls people who give more than they receive “givers.” He compares them to “matchers” (who give and take in equal measure) and “takers” (who take more than they give). Grant found that givers tend to be more successful than matchers or takers. Tetlock and Gardner argue that successful superforecasting teams tend to be stacked with givers.

Superforecasting Teams Are Optimally Distinct

In his book Give and Take, Adam Grant offers another clue as to what makes superforecasting teams so prone to generosity. Grant argues that we’re more motivated to help people who are part of our own social and identity groups (which can be anything from immediate family to school classmates to fellow football fans). Additionally, the more unique the group is compared to the dominant culture, the more inclined members are to help one another (this is called “optimal distinctiveness”). Participating in forecasting tournaments is a rare hobby, and superforecasters are a unique subgroup of forecasters; that uniqueness might strengthen their group identity and make them even more likely to share resources with one another.

Part 3: Can Forecasting Solve the Most Important Questions?

According to the authors, the field of forecasting faces an important challenge: The questions people really care about and need to answer are typically too big for a forecaster to even attempt. For example, a solid superforecaster can predict the likelihood that China will begin closing any of its hundreds of coal plants (which experts say could help the country meet its environmental goals), but they can’t answer the real question people are asking: “Will we be able to prevent the most devastating effects of climate change?”

This is a valid criticism—luckily, Tetlock and Gardner argue that we can get around it by breaking big questions like “Will things turn out okay?” into a host of smaller questions that superforecasters can answer. This is called Bayesian question clustering. Each small answer contributes one piece of the puzzle, and cumulatively, those answers can approximate an answer to the bigger question.

For example, if we ask enough questions about factors that could contribute to worsening climate change, we know that the more “yes” answers we get to the small questions (for example, whether sea levels will rise by more than one millimeter in the next year, or whether the United States government will invest more money in solar energy), the more likely the answer to the big question is also a “yes.”
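
The authors don’t prescribe a formula for combining a question cluster, but as a loose illustration (the sub-questions, probabilities, and the simple averaging below are all assumptions made for the example), you could track the cluster and treat its overall drift toward “yes” as a rough signal about the big question:

```python
# Loose sketch of question clustering: several small, answerable forecasts that
# each bear on one big question. The questions and probabilities are made up.
cluster = {
    "Sea level rise exceeds 1 mm in the next year": 0.80,
    "The US government increases solar energy investment": 0.60,
    "China closes at least one coal plant this year": 0.55,
    "Global CO2 emissions fall year over year": 0.25,
}

# A crude signal: the average probability across the cluster. This is not a
# forecast of the big question itself, just a directional indicator: the more
# the small answers trend toward "yes," the higher it climbs.
signal = sum(cluster.values()) / len(cluster)
print(f"Cluster signal: {signal:.2f}")

for question, p in cluster.items():
    print(f"  {p:.0%}  {question}")
```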

(Shortform note: This technique may help to answer a common critique of forecasting: that it is an example of the “streetlight effect,” or the equivalent of looking for your lost keys under the streetlight—even if that’s not where you lost them—because that’s where the light is best. This is related to black swan thinking—whatever future events you can predict (metaphorically shine a light on) won’t matter because the only truly important events are, by definition, unpredictable. To see the utility of Bayesian question clustering, we can change the metaphor a bit: If a forecaster searches for multiple puzzle pieces under the streetlight as opposed to a single set of keys, they may find enough pieces to at least see the gist of the whole puzzle—even if half the pieces are still lost in the dark.)

Shortform Introduction

Superforecasting is the result of decades of research on “superforecasters”: people who can predict future events with accuracy far better than chance. Superforecasters are intelligent, but more importantly, they’re open-minded, deeply curious, and adept at sidestepping their own cognitive biases. Not everyone is cut out to be a superforecaster, but by studying the way superforecasters make predictions, anyone can improve their ability to predict the future.

About the Authors

Philip Tetlock is a Canadian-American author and researcher focusing on good judgment, political psychology, and behavioral economics. He currently teaches at the University of Pennsylvania in the Wharton School of Business as well as the departments of psychology and political science.

In 2011, Tetlock and his spouse, psychologist Barbara Mellers, co-founded the Good Judgment Project, a research project involving more than twenty thousand amateur forecasters. The Project was designed to discover just how accurate the best human forecasters can be and what makes some people better forecasters than others. It led to Tetlock discovering a group of “superforecasters” who could outperform professional intelligence analysts; these findings eventually inspired Superforecasting.

Today, the Good Judgment Project hosts training sessions on general forecasting techniques for both individuals and organizations and produces custom forecasts.


Dan Gardner is a Canadian journalist and co-author of Superforecasting. Gardner has authored two other books on the psychology of decision-making and prediction: Risk: The Science and Politics of Fear (2008), which delves into the science of how we make decisions about risky situations, and Future Babble (2011), which explores Tetlock’s earlier research on experts’ inability to accurately predict the future. He is a senior fellow at the University of Ottawa’s Graduate School of Public and International Affairs.


The Book’s Publication

Publisher: Crown Publishing Group, a subsidiary of Penguin Random House

Superforecasting was published in 2015. It is Tetlock’s fourth book (Gardner’s third) and is the most well-known book in both authors’ respective bibliographies.

Superforecasting builds on Tetlock’s previous book, Expert Political Judgment, in which he first described the results of his decades-long study of the same name and answered the question, “Why are political experts so bad at making predictions?” In Superforecasting, Tetlock turns his attention away from experts’ failures and toward the successes of a few average people who can predict global events more accurately than chance.

The Book’s Context

Intellectual Context

The idea for Superforecasting came from Tetlock’s experiences with the Expert Political Judgment and Good Judgment Project forecasting experiments, during which Tetlock and his research team discovered that some people are significantly better at predicting the future than others. Tetlock and Gardner set out to explore exactly what sets this group of “superforecasters” apart from regular forecasters; Superforecasting is the result.

Superforecasting fits in with the tidal wave of research on issues of predictability, uncertainty, and cognitive biases that happened in the 2000s and 2010s. Superforecasting directly references Daniel Kahneman’s Thinking, Fast and Slow; Kahneman’s pioneering research on metacognition and cognitive biases paved the way for Tetlock’s study of the limits of forecasting. Tetlock and Gardner also directly reference Nassim Nicholas Taleb, author of several books on uncertainty, including The Black Swan, Antifragile, and Fooled by Randomness. Taleb is deeply critical of forecasting as an enterprise, and the authors devote an entire section of Superforecasting to addressing his critiques.

At its core, Superforecasting is an exploration of how superforecasters think, with a particular emphasis on the way they avoid knee-jerk assumptions and cognitive biases. This presents a helpful counterpart to books like Dan Ariely’s Predictably Irrational and Malcolm Gladwell’s Blink, both of which explore the risks and benefits of subconscious thinking in more depth.

The Book’s Impact

Superforecasting was a New York Times bestseller and was listed as “One of the Best Books of 2015” by Amazon, Bloomberg, and The Economist. The book’s intellectual impact was strongest within the fields of forecasting and behavioral science. For example, poker champion and author Annie Duke references Superforecasting in her 2018 book, Thinking in Bets, in which she builds on Tetlock’s research on avoiding cognitive biases and creating teams of like-minded people to help make decisions.

Superforecasting also played a small role in a British government scandal. Dominic Cummings, former aide to UK Prime Minister Boris Johnson, told reporters to read the book instead of listening to “political pundits who don’t know what they’re talking about.” He even wrote his own glowing review of the book on his personal blog. Cummings also hired Andrew Sabisky, a superforecaster, as an advisor. However, Sabisky resigned after only a few days on the job after old blog posts surfaced in which he claimed that Black people have lower IQs than white people and that the government should adopt eugenics policies to prevent the growth of the “lower class.” For many people, this scandal was their first encounter with the concept of “superforecasting.”

The Book’s Strengths and Weaknesses

Critical Reception

Superforecasting was generally well-received by critics and readers. People in a wide variety of fields—including real estate, ecology, management, and actuarial science—gave Superforecasting positive reviews and recommended it to anyone in their respective industries who wants to improve their decision-making skills. A New York Times reviewer praised the book for providing practical advice on how to make better forecasts rather than just describing the forecasting process. Some reviewers even tried their hand at making their own forecasts after reading the book.

Critical reviews of Superforecasting focus not so much on the book itself as on the utility of forecasting as a practice. One reviewer argued that, while superforecasters may be able to make accurate predictions about very specific events in the near future, the types of events they predict are not the ones we should be most worried about. Instead, we should focus on the events that are likely to have the biggest impact on society—which are likely to be so rare that they are completely unpredictable. These events are what author Nassim Nicholas Taleb calls “black swans,” which we’ll explore in depth in this guide.

Commentary on the Book’s Approach

While Superforecasting has two official authors, the book is written in Tetlock’s voice as he describes his personal experience conducting decades of research on forecasting tournaments. Tetlock and Gardner accurately represent the criticisms other authors (such as Taleb) have levied against formal forecasting and make a compelling case for the utility of forecasting as an enterprise, especially on a national and global scale. The authors also describe the techniques that superforecasters use in enough detail that readers come away well-equipped to try making their own predictions. (If you’d like to test your forecasting skills after reading, you can take on one of the challenges in the Good Judgment Open, an online, ongoing forecasting tournament.)

Commentary on the Book’s Organization

Superforecasting begins with an overview of Tetlock’s approach to forecasting, including both his optimism about the strengths of “superforecasters” as well as his understanding that, while superforecasters make more accurate predictions than other forecasters, there is still a hard limit to how far any human can see into the future. The authors then discuss the difficulties of measuring a forecaster’s accuracy, how the lack of a solid measurement system has significantly hindered forecasting research, and how Tetlock’s “Expert Political Judgment” and “Good Judgment Project” forecasting tournaments changed the landscape of forecasting research. The bulk of Superforecasting describes the skills and traits of the superforecasters themselves with the goal of advancing the authors’ thesis: that the skills that make superforecasters so “super” aren’t inborn—they can be developed with practice.

The book’s organization serves the authors’ goal of introducing the reader to the world of formal forecasting, holding up superforecasters as a model of the full potential of human forecasting, and arguing for the importance of forecasting tournaments. The authors seem to understand that the kind of formal, geopolitical forecasting that goes on in forecasting tournaments is completely foreign to many readers, so they begin with very general concepts (like the idea that, to a certain degree, the future is knowable) before getting into more specific details of what makes superforecasters so good at what they do.

One downside to the authors’ decision to move from broad concepts to specific details is that certain core ideas are separated into multiple parts of the book, which can be confusing for the reader. For example, the sections on calculating Brier scores and measuring regression to the mean—two ways of measuring forecasters’ performance—are located in two different chapters.

Our Approach in This Guide

In this guide, we’ve moved some sections of the book’s original chapters to other chapters to keep information on major ideas together. For example, the section on cognitive biases from the book’s Chapter 2 is now in Chapter 5, which outlines common traits of superforecasters. That’s because avoiding common biases is an important skill that superforecasters practice.

To begin the guide, we’ll discuss the history and underlying theory of formal forecasting (including the forecasting tournaments that ultimately led to the discovery of superforecasters) to give readers an overview of the field. Then, we’ll discuss the process of identifying “superforecasters” using precise measurements of accuracy. Next, we’ll move into the largest section of the guide: an overview of the traits that make superforecasters so good at predicting the future, such as embracing uncertainty, updating their predictions, and being generous teammates. Finally, we’ll address the major criticisms of forecasting (such as the relative importance of unpredictable “black swan” events) and how those concerns might impact the future of forecasting.

Throughout the guide, we’ll compare Tetlock and Gardner’s approach to ideas from other popular books on forecasting, decision making, behavioral economics, and cognitive psychology, such as Daniel Kahneman’s Thinking, Fast and Slow; Malcolm Gladwell’s Blink; Dan Ariely’s Predictably Irrational; and Nassim Nicholas Taleb’s Fooled by Randomness, The Black Swan, and Antifragile. Taleb is highly critical of forecasting, and we’ll explore these criticisms (and the resulting professional friction between Tetlock and Taleb) in detail.

Part 1: Forecasting Basics | Chapters 1-2: Can You Predict the Future?

In this chapter, we’ll learn about the importance of measuring forecast accuracy, the philosophy that makes forecasting possible, what it means to be a “superforecaster,” and how superforecasters perform compared to computer algorithms.

Lack of Measurement Makes It Difficult to Judge Accuracy of Forecasts

Tetlock and Gardner argue that to make better forecasts, we have to be able to measure accuracy. Predictions about everything from global politics to the weather are not hard to come by. You find them on news channels, in bestselling books, and among friends and family. According to the authors, most of these predictions have one thing in common: After the event, no one thinks to formally measure how accurate they were. This lack of measurement means that you have no sense of how accurate any particular source usually is. Without that baseline, how do you know who to listen to the next time you need to make a decision?

The authors note that given how important accurate predictions are, it’s surprising that we have no standard way of measuring their accuracy. Instead, forecasters in popular media deliver their predictions with so much confidence that we take them at their word, and by the time the events they predict happen (or don’t), the news cycle has moved on. According to Tetlock and Gardner, the loudest voice is often the most convincing one, regardless of how accurate it is.

Popular Forecasters Should Be Held Accountable

Nassim Nicholas Taleb takes this argument one step further in Antifragile. He argues that not only should pundits be measured on the accuracy of their predictions, they should be held personally liable for the consequences of their predictions. For example, in 2003, Thomas Friedman, popular columnist and author of Thank You for Being Late, predicted that if U.S. military forces invaded Iraq, it would have “a positive, transforming effect on the entire Arab world.” Taleb argues that Friedman’s forecasts influenced the decision to invade Iraq, leading to an eight-year war and 288,000 total casualties; however, Friedman himself never faced any consequences for these results of his predictions. In Taleb’s view, Friedman should, at least, be banned from writing further op-eds, since his words have proven to have dangerous consequences.

The Philosophy of Forecasting

To appreciate the value of forecasting, we have to frame it the right way. Tetlock learned this the hard way when the data from his pioneering research found that the majority of expert forecasts were no more accurate than chance (which the popular press misinterpreted to mean “forecasting is pointless”). Additionally, the predictions these experts made within their fields of expertise were less accurate than predictions they made outside their fields. In other words, intelligent analysts who invested time and effort into researching the issues were no more able to predict future events than if they’d guessed randomly.

(Shortform note: Why are experts so inaccurate, even within their own fields? Economics researcher Bryan Caplan points out one possible explanation of this core finding: Tetlock purposefully asked the experts challenging questions about their fields. Caplan surmises that when faced with these questions, experts become overconfident in their predictions, hence why they’re incorrect more often. He argues that if Tetlock had asked questions to which there are already well-established answers within the experts’ field (which Tetlock deliberately didn’t do), the experts’ prediction accuracy would have been higher. Caplan concludes that while forecasters admittedly need to stop being so overconfident in response to challenging questions, people should equally stop claiming that experts are useless at prediction: They can be accurate in the right circumstances.)

Chaos Theory

Tetlock and Gardner argue that this nihilistic perspective has its roots in chaos theory. This theory, first proposed by meteorologist Edward Lorenz, argues that tiny changes in nonlinear systems can escalate into enormous effects. This means that the effects of changes in these systems are hard, if not impossible, to predict.

The authors note that you may have heard this referred to as “the butterfly effect,” a moniker that stems from the title of the original 1972 paper: “Predictability: Does the Flap of a Butterfly’s Wings in Brazil Set Off a Tornado in Texas?” Most of the time, atmospheric conditions would prevent a Brazilian butterfly from setting off a tornado in Texas—but if every other factor lined up just right, there is a tiny chance that it could happen. This possibility is so unlikely and requires such a specific arrangement of variables that it is almost entirely unpredictable, even with the knowledge of every law of the universe.

History Is a Nonlinear System

Chaos theory only affects nonlinear systems, which are systems in which the input does not reliably predict the output—essentially, any system that can be impacted by random, unpredictable events. For example, the equation y = x is linear: an input of 1 always produces an output of 1, an input of 2 always produces 2, and so on, regardless of any other factors. However, the weather is a nonlinear system because the slightest change in temperature could lead to a huge storm developing. The change in input is not proportional to the change in output.
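
A standard classroom demonstration of this kind of sensitivity (not an example from the book) is the logistic map, a simple, fully deterministic nonlinear rule in which two starting values that differ only in the ninth decimal place soon produce completely different trajectories:

```python
# Classic demonstration of sensitivity to initial conditions: the logistic map
# x_next = r * x * (1 - x), which behaves chaotically at r = 4.
def logistic_trajectory(x0, steps, r=4.0):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = logistic_trajectory(0.200000000, 50)
b = logistic_trajectory(0.200000001, 50)  # differs from a's start in the 9th decimal place

for step in (0, 10, 30, 50):
    print(f"step {step:2d}: {a[step]:.6f} vs {b[step]:.6f}")
# The two runs look identical at first, but by roughly step 30 they have
# diverged completely, even though the rule itself contains no randomness.
```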

History itself is a nonlinear system. For example, many people cite Adolf Hitler’s rejection from art school as the butterfly wing that ultimately caused the Holocaust (because if he’d been accepted, he may have been too focused on his art career to pursue politics). Being rejected from art school has a set of predictable results: The person might keep applying to other schools until they’re accepted, go out on their own as an artist without formal schooling, or give up art altogether. At the time, no one predicted that being rejected from art school would lead to someone becoming the most hated figure in history because the input (rejection from art school) was disproportionate to the output (becoming a dictator).

We can further classify history as a particular type of nonlinear system: what author Yuval Noah Harari calls a level two chaotic system. In Sapiens, Harari describes level two systems as those in which prediction can change the outcome of events. For example, if the art school administrators who evaluated Hitler’s application had predicted that rejecting him would lead to a world war, they may have accepted his application instead, thereby potentially nullifying their own prediction. This is distinct from a level one system in which outcomes are not impacted by prediction. For example, the weather is a level one system—if you predict that tomorrow will be cloudy, that doesn’t influence the likelihood that tomorrow will actually be cloudy.

Superforecasters

The authors argue that contrary to the media’s representation of the Good Judgment Project, the Project’s results don’t mean that there is no value in forecasting. What Tetlock’s team discovered was that certain kinds of forecasters could make certain kinds of predictions with an accuracy much higher than chance. These forecasters, whom Tetlock calls “superforecasters,” apply a specific methodology to come up with their predictions, and they only make predictions a year into the future or less. Any further out, and accuracy rates drop dramatically. But when superforecasters apply their skills to short-term questions, they’re remarkably accurate. (Shortform note: We’ll be discussing superforecasters, their methods, and their traits extensively in Part 2.)

Humans vs Computer Algorithms

But, the authors ask, why rely on humans for forecasting at all, instead of an advanced computer algorithm? For situations where a well-validated statistical algorithm exists and has been proven reliable, computer predictions are almost always more accurate than those of human forecasters. The problem is that very few such algorithms exist.

Tetlock and Gardner believe that as technology advances, we may develop more of these algorithms, or refine the ones that already exist. But even then, the judgment of human forecasters will not be obsolete. Artificial intelligence (AI) is mostly immune to human cognitive biases, but it can’t interpret the results of its own predictions and create new meaning. Therefore, the authors argue, the future of forecasting will be a combination of skilled forecasters and powerful algorithms.

Forecasting Algorithms and AI Won’t Be Immune to Bias

Tetlock and Gardner argue that algorithms and AI are mostly immune to bias—however, that’s not always the case. For example, in Biased, social psychologist Jennifer Eberhardt describes how AI has become a major tool in criminal justice settings, where it is used to predict the risk of an arrested person committing another crime. Those risk assessments are then used to set bail and even influence sentencing. However, these measures aren’t objective—an independent analysis found that, regardless of criminal history, the system was 77% more likely to assign a “high risk for violent crime” label to Black people. Furthermore, the system was more likely to mistakenly assign a “low risk” label to white people who did go on to commit additional crimes.

This kind of bias pops up in other artificial intelligence tools—particularly those that involve facial recognition technology. While facial recognition systems are often remarkably accurate at identifying white male faces, they’re significantly less accurate when identifying the faces of women or people of color. When it comes to forecasting, relying on AI alone likely won’t be enough to sidestep biases.

Chapter 3: Measuring Forecasts

Given all the ways our brains can work against us, forecasting accurately is incredibly difficult. But determining whether a forecast is accurate in the first place presents difficulties of its own. According to the authors, a forecast judged by standards other than those the forecaster intended may be deemed a failure even when it isn’t.

For example, in 2007, Steve Ballmer, then-CEO of Microsoft, claimed that there was “no chance” that Apple’s iPhone would get “any significant market share.” In hindsight, this prediction looks spectacularly wrong, but the wording is too vague to truly judge. What did he mean by “significant”? And was he referring to the US market or the global market?

According to the authors, these questions matter because the answers lead us to very different conclusions. Judged against the US smartphone market (where the iPhone commands 42% of the market share), Ballmer is laughably wrong. But in the global mobile phone market (not just smartphones), that number falls to 6%—far from significant. (Shortform note: In 2009, Ballmer admitted to seriously underestimating the iPhone, in effect contradicting the authors: Even Ballmer thinks that the prediction was bad after all.)

Although Ballmer’s infamous iPhone forecast seems clear at first, the authors argue that it’s actually ambiguous. Certain words can be interpreted differently by different people, and forecasts tend to be full of these words (like “significant” and “slight”). This ambiguity makes the prediction complicated to judge.

The authors note that lack of timelines is another common problem in forecasts. If someone says “the world will end tomorrow,” that has a clear end date—tomorrow, if the world has not ended, we can safely say they were wrong. But if someone says “the world will end,” any arguments to the contrary can be met with “just wait and see.” We can’t prove the forecaster wrong.

Judging the “Worst Tech Predictions” of All Time

Hero Labs, a technology company, compiled a list of 22 of the “worst tech predictions of all time,” including Ballmer’s infamous quip. However, unlike Ballmer’s forecast, most of the other predictions on the list are specific enough to judge.

Probabilities Are Useful Estimates, Not Facts

Probability is one of the biggest obstacles to judging the accuracy of forecasts. Calculating the probability of pulling a blue ball out of a bag is fairly easy—even if you don’t know any probability formulas, you can just keep blindly pulling a ball out of the bag, recording its color, then putting it back and repeating the process. After enough trials, it would be easy to say which color ball you’re most likely to draw and about how much more likely you are to draw that color than the other.
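
A quick simulation (with hypothetical bag contents) shows how this draw-record-replace approach homes in on the true proportion:

```python
import random

# Simulate the "keep drawing, recording, and replacing" approach described
# above. The bag's true contents are hypothetical: 7 blue balls and 3 red.
bag = ["blue"] * 7 + ["red"] * 3
random.seed(0)  # fixed seed so the run is reproducible

draws = 10_000
blue_count = sum(1 for _ in range(draws) if random.choice(bag) == "blue")

print(f"Estimated P(blue) after {draws} draws: {blue_count / draws:.3f}")  # close to the true 0.7
```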

However, according to the authors, attaching an accurate number to the probability of a real-world event is almost impossible. To do so, we’d need to be able to rerun history over and over again, accounting for all the different possible outcomes of a given scenario. This means that for most events that forecasters are concerned with, it is impossible to know for sure that there is a specific probability of the event happening. Therefore, any probability attached to an event in a forecast is only the forecaster’s best guess, not an objective fact.

Some Probabilities Are More Accurate Than Others

Assigning accurate numerical probabilities is easier for certain types of forecasts than others. For example, we learned earlier that history is a nonlinear system, which means that there are so many possible variables that could influence the outcome of a given event that it’s impossible to assign an accurate probability.

However, some situations have happened so many times already that it’s possible to track the outcomes and use those numbers to generate relatively accurate probabilities for future outcomes. This is the case for college admissions rates. In Smarter Faster Better, author Charles Duhigg tells the story of a high school student anxiously calculating the odds of getting into college. He chose 12 schools to apply to and researched the admissions rates for each, then added those probabilities together. He realized that, while the odds of being accepted to his top-choice school were fairly low, the odds of being accepted to any school were high. Therefore, while he couldn’t predict which college he’d attend with any certainty, he could at least be fairly certain that he’d be in college somewhere the following year, which helped to ease his anxiety.

In this case, the student can be more confident in the numerical odds he calculated because the admissions rate data he used is based on hundreds of thousands of applicants. The aggregate of all of those decisions creates a more accurate base rate to predict future decisions.
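
To see why applying to many schools made the student’s overall odds so favorable, here is a sketch using hypothetical admission rates (Duhigg’s actual figures aren’t given here) and assuming the schools’ decisions are roughly independent; the standard way to combine the individual rates is the complement rule rather than a simple sum:

```python
# Hypothetical admission rates for 12 schools (not the real figures from
# Smarter Faster Better). Assuming the decisions are roughly independent, the
# chance of at least one acceptance is 1 minus the chance of being rejected
# everywhere.
admission_rates = [0.08, 0.10, 0.12, 0.15, 0.20, 0.25,
                   0.30, 0.35, 0.40, 0.50, 0.60, 0.70]

p_rejected_everywhere = 1.0
for p in admission_rates:
    p_rejected_everywhere *= (1 - p)

p_at_least_one = 1 - p_rejected_everywhere
print(f"Chance of at least one acceptance: {p_at_least_one:.1%}")  # roughly 99%
```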

Estimated Probabilities Still Have Value

While the fact that estimated probabilities are only “best guesses” can be misleading, it doesn’t mean that these probabilities are useless. In fact, the authors argue that using numerical probability estimates in forecasts is critical because miscommunicating the odds in a forecast can have serious, global consequences.

That claim may sound dramatic, but it’s exactly what happened in 1961 when President Kennedy commissioned the Joint Chiefs of Staff to report on his plan to invade Cuba. The final report predicted a “fair chance” of success, and the government went ahead with what became the Bay of Pigs disaster. After the fact, it was clarified that “fair chance” meant three to one odds against success, but President Kennedy interpreted the phrase more positively and acted accordingly. (Shortform note: The vague language of the report may not have been the only reason the government went ahead with such a flawed plan. In Mindset, psychologist Carol Dweck argues that Kennedy’s team may have been so blinded by his charisma as a leader that they accepted his ideas uncritically.)

The authors note that in the aftermath of the failed Bay of Pigs invasion, Sherman Kent, head of forecasting for the CIA, proposed a universal standard for official forecasts that would eliminate ambiguity by assigning numerical probabilities to particular words (so “almost certain” described events to which forecasters assigned 87-99% probability, for example). His proposal was rejected outright by the intelligence community, who felt that expressing probabilities numerically was crude and misleading. They feared readers would fall into the common trap of interpreting numbers to mean something is X percentage likely to happen, not that the forecaster believes that to be the likelihood. As a result, intelligence reports tend to exclude any specific probabilities.

Vague Forecasts Lead to Misguided Decisions

As Tetlock and Gardner argue, the intelligence community’s resistance to using numerical probabilities in forecasts stems from the fear that numbers inspire overconfidence in policymakers. However, research shows that the opposite is true: Using numerical probabilities in forecasts actually makes policymakers more cautious, not less. They’re also more likely to seek out more information before committing to a decision. Furthermore, in the Bay of Pigs example, it was a vague probability that inspired overconfidence in Kennedy and his team—not a specific, numerical one.

Brier Scores Measure Forecaster Accuracy

According to the authors, when a forecast is clear enough to judge conclusively, we can use that prediction to measure how accurate a particular forecaster is in general by measuring the distance between their forecast and whether or not the event actually occurred. Over the course of several forecasts, this process yields a measure called a Brier score. Brier scores range between 0 and 2, where zero is an absolutely perfect forecast and two is a forecast that is wrong in every possible way. Random guessing, over time, produces a score of .5.

The authors explain that there are two components of a Brier score: calibration, or how accurate a forecast is, and resolution, or how confident the prediction was. A forecaster who always predicts probabilities near the level of chance (50%) will be fairly well-calibrated, but the information isn’t helpful—it’s the mathematical equivalent of a shrug. According to the authors, stronger forecasters are accurate outside the range of chance—they’re confident enough to assign much higher or lower odds to a particular event (such as 90% or 10%), despite the increased risk of being wrong. These forecasters will be well-calibrated and have high resolution.
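
Here is a minimal sketch of the two-category Brier score described above, limited to a single yes/no forecast (real tournament scoring averages many forecasts over time and handles questions with more than two outcomes, which this omits):

```python
def brier_score(forecast_prob, event_occurred):
    """Two-category Brier score for a single binary forecast.

    forecast_prob: probability assigned to the event happening (0.0 to 1.0)
    event_occurred: True if the event happened, False otherwise
    Returns a value from 0 (perfect) to 2 (wrong in every possible way).
    """
    outcome = 1.0 if event_occurred else 0.0
    # Sum the squared errors over both categories: "happens" and "doesn't happen."
    return (forecast_prob - outcome) ** 2 + ((1 - forecast_prob) - (1 - outcome)) ** 2

print(brier_score(1.0, True))   # 0.0: fully confident and correct
print(brier_score(1.0, False))  # 2.0: fully confident and wrong
print(brier_score(0.5, True))   # 0.5: the 50% "shrug," regardless of the outcome
```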

The authors caution that a forecaster’s Brier score is only meaningful in the context of the types of forecasts they make. For example, a forecaster who predicts the outcomes of basketball games might have a Brier score of .2, which looks quite impressive. However, basketball game outcomes are relatively easy to predict compared to other sports: Teams score frequently, so a single lucky play rarely decides the game and the stronger team usually wins. Thus, in context, that .2 score isn’t as impressive as it looks.

The authors note that Brier scores also give us a way to compare one forecaster to another—we can say that a forecaster with an overall Brier score of .2 is a more accurate forecaster than someone with a score of .4. But context is important here, too, because Brier scores don’t account for the difficulty of each prediction. For example, it’s notoriously difficult to predict the outcome of baseball games because chance plays such an important role. Therefore, a forecaster who successfully predicts the outcomes of a series of baseball games is more impressive than one who successfully predicts the outcomes of a series of basketball games. Even if the baseball forecaster’s score is slightly higher (and thus less accurate), earning that score in more volatile circumstances is still more impressive than a better score for predicting basketball.

Brier Scores vs Net Brier Points

While Brier scores are an effective way to measure a forecaster’s skill, both individually and compared to other forecasters, they don’t take the number of forecasts made into account. That can skew the results because Brier scores are calculated by averaging a forecaster’s accuracy across every forecast they make for a particular question.

We can illustrate this with an example. Let’s say that every day from Sunday to Friday, Judy predicts the likelihood that it will rain on Saturday (so she makes six forecasts in all). Judy’s friend Jen also predicts the likelihood that it will rain on Saturday, but she only participates on Thursday and Friday (so she makes only two forecasts). Judy’s task was much harder—she made predictions early in the week while Jen waited until the clouds gathered on Thursday and it was very obvious that it would rain that weekend. However, Jen’s Brier score is much better because her accuracy is only divided across two forecasts, not six.

To combat this, some forecasting tournaments use something called “net Brier points” rather than simple Brier scores. Net Brier points are calculated by comparing each forecaster’s score to the median score for all forecasters, then averaging that across every day the question was open, not just the days that particular forecaster participated. With this system, Judy would be rewarded for making earlier, more difficult forecasts and would end up with a better net score than Jen.
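
Tournament scoring rules differ in their details, but the sketch below captures the idea described above. Treating days before a forecaster’s first forecast as zero net points is an assumption made for illustration, not an official rule.

```python
# Rough sketch of "net Brier points" as described above. Assumptions (not the
# official tournament rules): each day a forecaster earns (median daily Brier
# score minus their own daily Brier score), days with no forecast earn zero,
# and the total is averaged over every day the question was open.
def net_brier_points(daily_scores, daily_medians, days_open):
    """daily_scores maps day index -> the forecaster's Brier score that day."""
    total = sum(daily_medians[day] - score for day, score in daily_scores.items())
    return total / days_open  # averaged over ALL days, not just participation days

days_open = 6                                            # Sunday through Friday
daily_medians = [0.50, 0.48, 0.45, 0.40, 0.20, 0.10]     # hypothetical crowd medians

judy = {0: 0.45, 1: 0.42, 2: 0.38, 3: 0.30, 4: 0.15, 5: 0.08}  # forecast all six days
jen = {4: 0.12, 5: 0.05}                                       # only Thursday and Friday

print(f"Judy: {net_brier_points(judy, daily_medians, days_open):+.3f}")  # about +0.058
print(f"Jen:  {net_brier_points(jen, daily_medians, days_open):+.3f}")   # about +0.022
# Judy beats the crowd median on the hard early days as well, so her net score
# is higher than Jen's even though Jen's raw Brier scores look better.
```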

Chapter 4: The History of Superforecasting

Tetlock first discovered that some forecasters are more accurate than others thanks to a decades-long study called “Expert Political Judgment” (EPJ). As the authors describe, the results of the EPJ revealed that overall, the average expert’s predictions were no more accurate than chance. But a closer look at the data revealed two subgroups of forecasters: one that did no better (and sometimes much worse) than chance, and another that did slightly better.

The second group just barely surpassed the rate of chance, but even that slight edge statistically differentiated them from the first group. Tetlock named the first group “Hedgehogs” and the second group “Foxes,” based on Isaiah Berlin’s classic philosophy essay entitled “The Hedgehog and the Fox” (the title comes from a line from an ancient Greek poem: “The fox knows many things but the hedgehog knows one big thing”). (Shortform note: “Knowing many things” is similar to the psychological concept of active open-mindedness. Actively open-minded thinkers are willing to consider other people’s ideas and opinions instead of clinging to their own point of view. As we’ll see in Chapter 9, superforecasters tend to be more actively open-minded than most people.)

On average, these two groups were equally intelligent and experienced. So what distinguishes them?

Hedgehogs

According to the authors, forecasters in the “hedgehog” group are passionately ideological thinkers who see the world through the lens of a Big Idea. They organize new information to fit their Big Idea, and they ignore any information that doesn’t fit that paradigm. The “Big Ideas” themselves vary widely from liberal to conservative and everything in between.

The hedgehog’s preoccupation with one Big Idea biases their predictions. Because hedgehogs are so passionate, they’re more likely to make bold predictions with probabilities closer to 0% or 100% rather than stick close to a safe guess of 50%. They’re convinced that their Big Idea is “right,” and all other ideas are “wrong,” and their forecasts reflect that level of certainty. Unfortunately, Tetlock and Gardner argue that most forecasting requires a more nuanced approach, so hedgehogs’ bold predictions tend to overshoot the mark. In the EPJ project, hedgehogs did worse than foxes in both calibration and resolution. (Shortform note: In an interview, Tetlock argued that hedgehogs are more likely to embrace the intuitive, snap-judgment-based prediction style that Malcolm Gladwell describes in Blink. This is because their theories about the world function as predictive models, and they easily forget that the real world doesn’t always conform to any given model. For example, as Tetlock notes in the interview, if a hedgehog believes that rising global powers always come into conflict with reigning powers, they’re more likely to predict that the United States and China will go to war because that prediction fits their model.)

Foxes

Foxes, on the other hand, are “eclectic experts” who have a wide range of analytical tools at their disposal rather than a single Big Idea. The authors argue that this allows foxes to be more flexible, changing their approach based on the particular problem.

Foxes approach new information with a blank slate, allowing the data to shape their interpretation rather than the other way around. Because they are less clouded by bias, foxes tend to seek out information about a situation from all possible sources, including those they don’t personally agree with. This allows them to consider the problem from all angles, creating a more holistic picture of the situation and reducing the likelihood that they’ll fall back on reflexive biases to inform their predictions.

Foxes Make Better Forecasters, Hedgehogs Make Better CEOs

A fox mentality is not always preferable to a hedgehog mentality. In fact, in some fields, hedgehogs have a distinctive advantage. For example, while researching for the book Good to Great, author Jim Collins and his research team interviewed leaders of companies that vastly outperform other companies in their respective industries. Collins found that the CEOs of great companies were overwhelmingly hedgehogs, not foxes. In fact, the hedgehog mentality was so crucial to success that these companies tended to organize their entire business models around the hedgehog’s one Big Idea.

Based on what we’ve learned so far about foxes and hedgehogs, this might seem counterintuitive. If foxes make better forecasters, and CEOs need to be able to predict the outcomes of their business decisions, shouldn’t CEOs strive to be foxes? Collins argues that a fox mentality considers too many variables, which ultimately distracts leaders from pursuing the company’s core mission (or, as Collins calls it, their “Hedgehog Concept”). Instead, strong business leaders should distill their strategy down to the answers to three questions: What can I do better than anyone else? What drives my economic engine? And what am I most passionate about? The combined answers to these questions form the core of the Hedgehog Concept, and pursuing that central concept—without getting distracted by outside factors—is what takes a company from good to great.

The Need for Reliable Forecasts

Tetlock’s discovery of “foxes” in the EPJ had the potential to revolutionize forecasting, which would be especially useful for the intelligence community (IC). According to the authors, the IC previously resisted efforts to quantify forecasts or train forecasters in new methods. Their forecasters were professionals with impressive resumes and top security clearance—in the eyes of IC officials, interfering in their methods would amount to fixing something that wasn’t broken.

Except that it was broken. An example of this is how the IC handled the 2002 proposal that the United States invade Iraq. Top officials believed the Iraqi government was stockpiling weapons of mass destruction (WMDs), and the IC was tasked with evaluating the evidence. Their report corroborated the Bush administration’s claims: The Iraqi government was producing and storing WMDs. However, when the United States invaded Iraq in 2003, no WMDs were found; the IC had made a mistake.

Tetlock and Gardner argue that the mistake was not in the conclusion but in the level of certainty. All the evidence pointed to the presence of WMDs in Iraq—but none of it conclusively. Somehow, experienced analysts making an incredibly high-stakes decision failed to realize they were jumping to conclusions and that the evidence could be interpreted another way. If they had, they likely still would have come to the same conclusion, but certainly not “beyond a reasonable doubt.” We can’t know for sure, but it’s possible that a degree of reasonable doubt would have been enough to stop Congress from authorizing the invasion.

Did Survivorship Bias Contribute to the Iraq War?

Tetlock and Gardner attribute the IC’s 2003 failure to overconfidence in their own predictions. However, it’s possible that the IC also fell victim to another common cognitive shortcoming: survivorship bias. As Nassim Nicholas Taleb argues in Fooled by Randomness, survivorship bias is the tendency to overvalue available evidence and undervalue missing evidence. In this case, the authors of the October 2002 National Intelligence Estimate (the same report that concluded Iraq must be stockpiling WMDs) also said, “We lack specific information on many key aspects of Iraq’s WMD programs.” While the IC analysts were aware that they didn’t have all the evidence and therefore couldn’t see the whole picture, they implicitly assumed that the missing evidence would not change their ultimate conclusion.

Additionally, when the IC analysts looked at the information they did have, they concluded that Iraq having a strong WMD program was the most plausible explanation to explain the facts. It’s possible that they mistook “absence of evidence” of other viable explanations as “evidence of absence” and therefore concluded that their story must be the correct one.

IARPA

As the authors describe it, the magnitude of the Iraq failure rocked the IC to its core. Clearly, they needed to generate more accurate forecasts—but how? To answer that question, the IC created a research arm called the Intelligence Advanced Research Projects Activity (IARPA). But IARPA quickly hit a snag: Useful research requires data, and they had no way to measure forecaster accuracy or track their methods.

To address this, IARPA officials approached Tetlock and his research partner, Barbara Mellers, for help creating a forecasting tournament that would identify superforecasters and give researchers insight into their methods. Unlike the earlier EPJ tournament, forecasters would make predictions about events months into the future, not years, since forecasting accurately more than a year out is almost impossible. (Shortform note: Tetlock and Gardner assert that the one-year limit on accurate forecasting is scientifically well-established; however, in recent years, official superforecasters have begun predicting events up to 10 years into the future.)

The IARPA tournament was designed for researchers to take advantage of the statistical tools discussed in Chapter 3. Armed with a much larger sample size, they could finally evaluate forecaster accuracy and compare forecasters to each other. (Shortform note: Daniel Kahneman, author of Thinking, Fast and Slow, has since implied that IARPA’s interest in forecasting was sparked, at least in part, by Tetlock’s work in Expert Political Judgment.)

Regression to the Mean

Over the first two years of the tournament, Tetlock identified another crop of “foxes”: forecasters whose scores were better than 98% of the group’s. He dubbed them “superforecasters.” These superforecasters’ skills not only held up from one year to the next; they actually improved. This is a surprising result because of a statistical concept called “regression to the mean” (sometimes called “reversion to the mean”), which is the idea that, with enough trials of a task, outliers will shift toward the mean. In other words, if someone performs extremely well or extremely poorly on a task compared to other people, their scores will probably inch toward the mean score if they try the task again.

The fact that most superforecasters’ scores didn’t regress toward the mean suggests that something was counteracting the expected regression. The authors think this happened because forecasters who did exceptionally well in year one were given the “super” designation and put on teams of other superforecasters for year two. It’s possible that this recognition provided a sense of accomplishment that inspired forecasters to work even harder—in the second year, superforecasters raised their individual scores enough to offset the usual regression.

The authors argue that measuring regression to the mean is important for another reason: It allows us to measure how much of superforecasters’ success is due to luck and how much is due to skill. That’s because skill-based scores regress slowly, but luck-based scores regress quickly. After several years in the tournament, researchers determined that the scores of about 30% of superforecasters regress toward the mean each year, while the other 70% remain “super.” If forecasting accuracy were purely a matter of luck, we’d expect 100% of superforecasters to regress to the mean over time; therefore, the authors conclude that superforecasting involves a fair amount of skill. (Shortform note: Tetlock has argued that, depending on the specific question, forecasting can fall anywhere between chess (where success is all about skill) and roulette (where success is all about luck) on the skill/luck continuum.)
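
To make the luck-versus-skill point concrete, here is a minimal simulation sketch (not from the book; the scoring model, the 2% cutoff, and the noise parameters are illustrative assumptions). It compares a field of purely luck-driven scorers with a field of mostly skill-driven scorers and checks how many year-one stars are still stars in year two:

```python
import random

random.seed(42)

N = 10_000        # hypothetical number of forecasters
TOP_SHARE = 0.02  # "super" = top 2% of scores in a given year

def year_score(skill, luck_weight):
    """Toy score: a stable skill component plus fresh random noise.
    luck_weight = 1.0 means the score is pure luck."""
    return (1 - luck_weight) * skill + luck_weight * random.gauss(0, 1)

def share_still_super(luck_weight):
    skills = [random.gauss(0, 1) for _ in range(N)]
    year1 = [year_score(s, luck_weight) for s in skills]
    year2 = [year_score(s, luck_weight) for s in skills]
    cut1 = sorted(year1, reverse=True)[int(N * TOP_SHARE)]
    cut2 = sorted(year2, reverse=True)[int(N * TOP_SHARE)]
    stars = [i for i in range(N) if year1[i] >= cut1]
    return sum(1 for i in stars if year2[i] >= cut2) / len(stars)

print(f"pure luck:    {share_still_super(1.0):.0%} stay in the top 2%")  # ~2%, i.e., chance level
print(f"mostly skill: {share_still_super(0.3):.0%} stay in the top 2%")  # far more than chance
```

In the pure-luck world, nearly all of the year-one stars fall back to the pack; the more the score depends on a stable skill, the larger the share of year-one stars who stay on top, which is the pattern the authors describe.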

Calling Forecasters “Super” Improves Their Performance

The fact that superforecasters perform better after being given the “super” label is an example of another form of cognitive bias: the Pygmalion effect, which is the idea that having high expectations for someone can actually improve their performance. You may have noticed this yourself—when a teacher or boss expresses their belief in your skill, you might feel extra motivated to live up to that impression and work harder than you would otherwise. On top of that, their belief in you might cause your teacher or boss to provide you with more support and resources. The combination of these factors ultimately leads to better performance.

However, high expectations don’t always lead to improved performance. In fact, as psychologist Carol Dweck explains in Mindset, praising someone for their ability can often cause them to perform worse on subsequent tasks. The fact that superforecasters improved after being labeled “super” may actually have to do with the fact that superforecasters typically have a growth mindset, which we’ll explore in detail in Chapter 8.

Exercise: Are You a Hedgehog or a Fox?

By nature, we all lean more toward either fox or hedgehog thinking. This exercise will help you reflect on the way you think and identify biases that might be holding you back.

Part 2: Superforecaster Traits | Chapter 5: Superforecasters Think Like Foxes

The authors concede that it would be reasonable to assume that superforecasters are just a group of geniuses gifted at birth with the power to see the future. Reasonable, but wrong. What makes superforecasters truly “super” is the way they use their intelligence to approach a problem. In the next few chapters, we’ll explore the specific mental tools that make superforecasters so accurate.

They Avoid Cognitive Biases

According to the authors, the reason superforecasters make such accurate predictions is that they’re adept at avoiding cognitive biases. We’re all prone to certain cognitive biases that stem from unconscious thinking, or what psychologists like Tetlock often describe as “System 1” thinking. These biases skew our judgment, often without us even noticing. Superforecasters constantly monitor and question their System 1 assumptions. (Daniel Kahneman describes the two-system model of thinking in Thinking, Fast and Slow; System 2 governs conscious, deliberate thinking, while System 1 functions automatically and unconsciously.)

(Shortform note: Kahneman argues that System 1 is also prone to making snap predictions. For example, when we see someone with an angry expression, System 1 automatically predicts that person is about to start yelling. Those automatic predictions could be another mental trap for prudent forecasters to avoid.)

Availability Heuristic

The availability heuristic is a bias identified and named by researchers Daniel Kahneman and Amos Tversky: the automatic tendency to make snap judgments based on whatever memorable experiences come to mind most easily. The authors argue that this is an adaptive evolutionary trait. For example, if hearing the snap of a twig on the savannah brings to mind a memory of a lion pouncing on its prey, you automatically conclude that a lion must be the source of the sound and that you are in danger. If you’ve never witnessed or heard of a lion attack, you’ll interpret the sound differently (and not realize the danger you’re in). The availability heuristic plays out unconsciously, in fractions of a second.

In the context of formal forecasting, the availability heuristic might come into play if a typical forecaster is asked to predict an event they have some previous mental association with. For example, if a forecaster who lived in New York City during the 9/11 terrorist attacks was asked to predict the likelihood of a terrorist hijacking an airplane, they may unknowingly predict a much higher likelihood because the question calls up visceral memories of 9/11.

(Shortform note: This is similar to the concept of frugality that author Malcolm Gladwell describes in Blink. Gladwell argues that the unconscious mind is “frugal”—that is, it automatically lasers in on the most significant details of a situation and ignores everything else. In the lion example, the unconscious mind automatically connects the dots between the sound of a twig snapping and the most significant possible meaning (a lion approaching). It temporarily ignores all other possible explanations that wouldn’t have life-or-death consequences.)

Confirmation Bias

Before making predictions, we often search for evidence. According to Tetlock and Gardner, the problem here is our tendency to latch onto the first possible explanation we think of, then seek out only evidence that supports that belief (and ignore evidence that contradicts it). This is confirmation bias, and it’s a dangerous trap for forecasters. (Shortform note: Confirmation bias is closely related to “motivated reasoning.”)

Cherry-picking evidence can quickly lead to drawing conclusions that are completely off base. For example, if a typical forecaster instinctively assumes an event will happen, they may seek out evidence that supports that prediction and downplay evidence that suggests the event is unlikely to happen.

Confirmation Bias Has Dangerous Consequences—Like Medical Misdiagnosis

Confirmation bias (like all biases) is particularly dangerous in high-stakes fields like medicine. For example, medical students are often taught the aphorism, “When you hear hoofbeats, think horses, not zebras”; in other words, the most common explanation is probably the right one. However, when doctors are expecting a medical “horse” (such as the common cold), confirmation bias makes them more likely to ignore evidence that the “horse” is actually a “zebra” (such as an autoimmune disorder). Lacking an accurate diagnosis can prevent people from accessing helpful (or even life-saving) treatments.

“Tip-of-Your-Nose” Perspective

Each of us sees the world from the “tip of our nose”—our own perspective. According to the authors, this view is highly subjective and informed by our own unique experiences. The things in your environment that automatically stand out to you as significant will be completely different from the things that automatically stand out to someone else. Forecasters can fall prey to this bias when they over-rely on information from their own field. For example, if a typical forecaster who works as an economist was asked to predict the likelihood of a certain country’s economy collapsing, they might put too much emphasis on the country’s economic history and ignore the impact of social and political factors.

The Limits of System 2

The tricky thing about all of these biases is that, according to Kahneman, we’re much less likely to catch our own biased thinking when System 2 is already taxed. That means that if you’re already thinking hard about something else (or you’re tired from thinking hard earlier in the day), System 1 is likely to take over because you’ve already exhausted your System 2 resources. That presents a challenge for superforecasting, which requires intense cognitive effort and can quickly deplete System 2. In turn, this challenge presents a paradox: It’s important for superforecasters to avoid cognitive biases, but the act of superforecasting itself might make them more prone to cognitive biases if they do too much at a time.

They Generate Multiple Perspectives

Superforecasters also rely on aggregated judgment (aggregation is the process of combining data from multiple sources). Tetlock and Gardner argue that aggregation is a powerful tool for forecasters because the aggregated judgment of a group of people is usually more accurate than the judgment of an average member of the group. This phenomenon is often called “the wisdom of crowds,” named for the 2004 book that popularized the idea.

Superforecasters take advantage of the wisdom of crowds by pulling from many sources and using many tools to produce an aggregate answer, despite being just one person. This skill doesn’t come naturally to most people—we struggle to even recognize that there are other perspectives beyond our own “tip-of-the-nose” view, let alone fully consider those ideas. This is part of what sets superforecasters apart.
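
As a toy illustration of why aggregation helps (a sketch of the general principle, not the authors’ method; all the numbers are invented), imagine a crowd of forecasters whose individual probability estimates each miss the true probability by some random error. Averaging the estimates tends to cancel those individual errors out:

```python
import random

random.seed(0)

TRUE_PROB = 0.30   # hypothetical "correct" probability of the event
CROWD_SIZE = 25    # hypothetical number of forecasters

# Each forecaster's estimate is the truth plus personal noise, clamped to [0, 1].
estimates = [min(1.0, max(0.0, TRUE_PROB + random.gauss(0, 0.15)))
             for _ in range(CROWD_SIZE)]

crowd_average = sum(estimates) / len(estimates)

avg_individual_error = sum(abs(e - TRUE_PROB) for e in estimates) / len(estimates)
crowd_error = abs(crowd_average - TRUE_PROB)

print(f"average individual error:       {avg_individual_error:.3f}")
print(f"error of the averaged forecast: {crowd_error:.3f}")  # typically much smaller
```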

How to Generate New Perspectives Like a Superforecaster

To think like a superforecaster, you need to look beyond the tip-of-your-nose view and find new ways of viewing a problem. Here are some tips to get you in the right mindset:

They Use Fermi Estimation

According to the authors, another mental trick that sets superforecasters apart is their ability to break down a complex problem, even without having all the information. One form of this technique was popularized by Enrico Fermi, a Nobel Prize-winning Italian American physicist who also worked on the Manhattan Project. Fermi’s approach was to break down seemingly impossible questions into smaller and smaller questions. The idea is that eventually, you’ll be able to separate questions that are truly unknown from questions for which you can at least make an educated guess. Tetlock and Gardner believe that, added together, a handful of educated guesses can get you remarkably close to the correct answer to the original question.

Example Fermi Problem

It’s much easier to understand this process with an example, such as “How many oil changes are performed in one day in the U.S.A.?” Unless you happen to work at Jiffy Lube, you probably have no idea how to even begin to answer this question. At best, you might take a wild guess. But Fermi’s approach is more methodical.

To answer this question, we need to determine what information we’re missing. What would we need to know to figure this out? The number of oil changes probably depends on the number of cars in the country and how often those cars need an oil change. So, to figure out how many oil changes happen in a given day, we need to know:

  1. The number of cars in the U.S.
  2. How frequently cars need an oil change

If we happened to know the answers to either of those questions, great! But if we don’t, we can break them down even further into questions that we can answer. Let’s do that now.

To figure out the number of cars in the U.S., we need to know:

  1. The population of the U.S.
  2. The percent of the population that owns a car

To figure out how frequently cars need an oil change, we need to know:

  1. How many miles cars can go between oil changes
  2. How many miles the average car is driven in a year

We may still need to wildly guess the answers to some of these questions. However, because those wild guesses are only for smaller parts of the overall question, we’re still likely to be more accurate than if we’d just thrown out an answer to the original question.

Solving the Problem

To solve this problem, we’ll need to come up with some estimates for each of those questions. (Shortform note: Fermi questions are commonly used in job interviews. To get the most out of this exercise, assume you’re in the middle of an interview and can’t look up the real answers, so you’ll need to rely on your best guess.)

  1. First, we need to know the population of the U.S. You might have a rough idea of this answer—it’s about 320 million people.
  2. Next, we need to know how many of those people own a car. This question isn’t as clear cut, so we can break it down further. Let’s assume that roughly 75% of the population can drive, which gives us 240 million drivers.
    1. However, not everyone who can drive owns their own car. Let’s assume there’s an average of two drivers in a household, and one car per household. That’s 120 million cars.
  3. If you have a car, you might remember that you’re supposed to change the oil after about 3,000 miles of driving. However, people are busy, and oil changes might not always be a priority. Let’s assume that the average car owner gets their oil changed after about 3,500 miles of driving.
  4. Now, we need to know how many miles the average car is driven in a year. This is a tough one, so let’s take a wild guess of 10,000 miles per car, per year.

If each car owner gets an oil change after 3,500 miles of driving and drives 10,000 miles per year, they get about three oil changes per year. (Shortform note: In Fermi estimation, there’s no need to stress over the exact math. Feel free to round numbers up or down to make the math easier since the end goal is a very rough guess, not an exact answer.)

Three oil changes per year for 120 million cars is 360 million oil changes per year, total. However, the question asked about the number of oil changes in one day, not one year, so we need to divide this further. We know there are 365 days in a year, but most auto shops aren’t open seven days a week, so let’s bring that down to 320 days per year in which oil changes are being performed. To get our final answer, we divide 360 million oil changes per year by 320 days, which equals about 1.1 million oil changes per day in the U.S.
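
Here is the same back-of-the-envelope arithmetic as a short script (just the chapter’s example wired together; every input below is one of the rough guesses above, not real data):

```python
# Rough Fermi-style inputs from the walkthrough above (all guesses, not data).
us_population = 320_000_000
share_who_can_drive = 0.75
cars_per_driver = 0.5            # ~2 drivers per household, 1 car per household
miles_between_oil_changes = 3_500
miles_driven_per_year = 10_000
oil_change_days_per_year = 320   # shops aren't open every day of the year

drivers = us_population * share_who_can_drive
cars = drivers * cars_per_driver
changes_per_car_per_year = miles_driven_per_year / miles_between_oil_changes
changes_per_year = cars * changes_per_car_per_year
changes_per_day = changes_per_year / oil_change_days_per_year

# The walkthrough rounds 2.86 changes per car up to 3; either way the
# answer comes out to roughly 1.1 million changes per day.
print(f"{changes_per_day:,.0f} oil changes per day")
```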

What’s the real answer? Industry data shows that about 450 million oil changes happen per year in the United States. If we keep our estimate of 320 working days per year for oil changes, that’s 1.4 million oil changes per day, which is incredibly close to our rough estimate!

Tips for Improving Fermi Estimations

As we’ve previously noted, Fermi problems are often used during job interviews to evaluate a candidate’s critical thinking skills. For instance, Google often asks interviewees subject-specific Fermi questions, such as “How many photos are taken on Android phones per day?” If you want to practice this technique, here are some tips to improve your estimation skills:

They Seek Out Intellectual Challenges

Tetlock and Gardner argue that breaking down questions the way Fermi did, synthesizing outside and inside views, and aggregating different perspectives may not require innate talent, but they are incredibly mentally demanding, and superforecasters do this work for no profit. That tells us that there is something fundamentally different about superforecasters’ brains—not that they can do all that mental work, but that they want to.

According to the authors, people who volunteer for forecasting tournaments are the type who enjoy mental challenges. They’re more likely than others to spend their spare time doing crossword puzzles or reading dense books. In psychology, this is called having a high “need for cognition,” or the internal motivation to seek out cognitive challenges.

High Need for Cognition Is Related to Bias

Superforecasters are “super” largely because they are so skilled at avoiding cognitive biases. However, superforecasters are also almost universally high in need for cognition (NFC), which can make them more prone to certain kinds of bias. For example:

How are superforecasters so good at avoiding biases when their high NFC should make them even more biased? It’s possible that working in teams helps superforecasters overcome their natural biases. We’ll learn more about how superforecasting teams help each other avoid bias in Chapter 9.

Exercise: Answer the Ball-and-Bat Problem

We all use both System 1 and System 2 every day. This exercise highlights the way you draw on each of these in different ways.

Exercise: Generate New Perspectives

Thinking like a superforecaster requires coming up with new perspectives. Let’s practice creating multiple perspectives on a single issue.

Chapter 6: Superforecasters Think in Probabilities

Despite all the emphasis on technique over talent, the fact that superforecasters work with such minute numerical details might lead you to believe that superforecasters are secretly an elite class of mathletes. According to the authors, you wouldn’t be completely wrong—superforecasters are almost universally skilled with numbers. The surprising part is that forecasting very rarely requires higher-level math skills.

(Shortform note: As an example, one superforecaster argued that he and his teammates never explicitly use Bayes’ rule (a mathematical equation for updating predictions) in their forecasts. However, Tetlock countered that while they may not actually whip out the equation for any particular problem, superforecasters are numerate enough to understand the principles of Bayes’ theorem better than most people. We’ll learn more about superforecasters’ Bayesian thinking in Chapter 7.)

Probabilistic Thinking

So what does superforecasters’ superior numeracy have to do with their success? According to the authors, superforecasters are probabilistic thinkers. This goes beyond just phrasing their forecasts in terms of probability percentages. Essentially, in situations where most of us think in terms of black and white, superforecasters think in shades of grey.

Most people’s mental “probability dial” has three distinct settings: yes, no, and maybe. By contrast, probabilistic thinkers have an unlimited number of settings. They’re more likely to answer questions in terms of percentages rather than “yes” or “no.” And this is not just in the realm of forecasting—this is how superforecasters normally think and speak in their everyday lives.

(Shortform note: Superforecasters’ emphasis on probabilistic thinking may help to explain the gender gap among superforecasters, who tend to be male. Young children perform about the same on tests of probabilistic thinking regardless of gender—however, by age 10, boys tend to outperform girls, a pattern that holds true for many other math skills. This could be because, as Sheryl Sandberg argues in Lean In, social norms discourage girls from pursuing math and science.)

The Two- (or Three-) Setting Dial

According to Tetlock and Gardner, there is a good reason that most of us are not natural probabilistic thinkers. For most of human history, the three-setting dial was reduced even further to two settings (“yes” or “no”). For our ancestors, this was an advantage. Early humans lived in a world where predators were a constant threat—but our brains and bodies aren’t designed for perpetual vigilance, and stress wears us down over time. Snap judgments became an evolutionary life hack: While the probabilistic thinkers were fretting over the likelihood that a strange noise came from a predator, the concrete thinkers had already landed on an answer and responded accordingly.

(Shortform note: As Daniel Kahneman describes in Thinking, Fast and Slow, this evolutionary mechanism also partly explains why we have such trouble understanding randomness. Our ancestors saved time and mental energy by making snap judgments about whether two things are related—like whether a sound is related to a predator’s presence, or whether seeing predators out hunting is related to the fact that it just rained. Noticing connections (like, “the lions are more active after it rains”) allows us to predict and avoid future threats (in this case, by not going out after a rainy day lest we encounter a hungry lion). Assuming a causal pattern keeps us safe, even if those patterns are really the result of random chance.)

But what about “maybe”? For life-or-death decisions, “maybe” is not particularly helpful. The authors believe that most of us use “maybe” as a last resort, only when the odds are roughly even and the stakes are low. The uncertainty that comes with a “maybe” answer is intuitively unsettling, possibly because we have evolved to associate uncertainty with danger. We settle for “maybe” only when we’re forced to—usually for probabilities roughly between 40% and 60%. Anything higher is “yes,” anything lower is “no.”

Two- and three-setting mental dials helped our species survive into the modern era, and they are still helpful when snap decisions are necessary. But thankfully, most of the judgments we make on a daily basis are not immediate, life-or-death decisions. For everything else, the most accurate answer is “maybe.” This is where probabilistic thinkers shine—they see “maybe” not as an answer in itself but as an infinite range of possible answers.

Learn to Embrace “Maybe”

Thinking probabilistically can help you make better decisions. However, it’s often easier said than done because this method of thinking requires accepting uncertainty, which is new territory for many of us.

In Thinking in Bets, former professional poker player Annie Duke lays out a strategy for getting comfortable with uncertainty. Duke recommends thinking of each decision as a bet with real money on the line. This trick makes the consequences of being wrong feel more concrete, which can help motivate you to double-check your logic and think about all the possible sources of uncertainty.

Placing an imaginary bet can be helpful even when we already have a numerical probability prediction. For example, imagine you check the weather forecast and see that there’s an 80% chance of rain. Under normal circumstances, you might interpret that as “it will definitely rain” because 80% is higher than chance. However, if you ask yourself, “Would I bet $500 that it will rain?”, you may feel a lot less confident in that prediction; as a result, you’re more likely to grapple with the uncertainty of the forecast instead of just assuming that an 80% chance is a definite “yes.”

The Hunt for bin Laden

The authors describe a historical example of the tension between probabilistic and concrete thinking. In 2011, the intelligence community identified a compound in Pakistan that they suspected housed Osama bin Laden, the terrorist responsible for the 9/11 attacks. There was enough evidence to suggest that bin Laden was in the compound—but not enough for analysts to be completely certain. Leaders from several IC agencies gathered to debate the evidence and come up with a forecast that would ultimately be presented to President Barack Obama, who would decide whether to authorize a raid on the compound.

Although different accounts of that historic conversation vary on the details, the authors report that multiple IC analysts presented their personal confidence levels to the president based on the data available. These confidence levels ranged from 30% to 95%. Averaged together, the group was around 70% certain that the man in that compound was bin Laden. President Obama responded, “This is fifty-fifty. Look guys, this is the flip of the coin.”

Given the context, it’s unlikely that the president meant there was a literal 50% likelihood of bin Laden being in the compound. Instead, the authors believe he used “fifty-fifty” as a synonym for “maybe.” This illustrates the fundamental difference between probabilistic and concrete thinking: While the analysts were adjusting percentages, the president focused on accepting the inherent uncertainty of the situation and moving on. His three-setting dial was set to “maybe,” and without enough evidence to move it to “yes” or “no,” that dial was no longer useful to him.

Barack Obama Explains His Decision-Making Process

In his 2020 memoir, A Promised Land, former president Barack Obama gave his own account of the fateful decision to authorize the raid. Tetlock and Gardner were right to conclude that Obama’s “fifty-fifty” comment was his way of acknowledging the inherent uncertainty of the situation—he knew that, despite the intelligence community’s diligence, they could never be 100% certain that the tall man in the compound really was bin Laden.

As a result, Obama switched his focus from the odds of success to the possible consequences of failure. Then-vice president Joe Biden was strongly opposed to the raid; he’d been in Washington in 1980 and witnessed the intense fallout from Operation Eagle Claw, the failed attempt to rescue American hostages being held in the U.S. embassy in Tehran. Eight service members were killed in the attempt, and President Carter’s political career never recovered. On top of that, the bin Laden raid was to be conducted without the knowledge or consent of the Pakistani government; if it failed, it would be a major diplomatic disaster.

Ultimately, Obama weighed these risks against the risk of not authorizing the raid: namely, squandering the best lead the intelligence community had had on bin Laden’s whereabouts in a decade. He decided that, despite the uncertain odds, the chance to take out the criminal responsible for the 9/11 terrorist attacks was worth the risks. The mission was successful: On May 1st, 2011, a Navy SEAL team raided the compound and killed Osama bin Laden.

Exercise: Embrace Probabilistic Thinking

Most of us don’t naturally think like superforecasters. This exercise is a chance to examine your own thinking style and practice embracing uncertainty.

Chapter 7: Superforecasters Start With a Base Rate, Then Update

In this chapter, we’ll discuss another technique that the authors argue superforecasters use: the “outside-in” approach to forecasting. We’ll also discuss how superforecasters update their predictions based on new information they encounter after making their initial forecast. In IARPA-style tournaments, forecasting questions typically remain open anywhere from a few weeks to a few months, and forecasters can adjust their forecasts as often as they like during that window. The authors argue that these adjustments are crucial since a forecast that doesn’t take all the available data into account will likely be less accurate than an up-to-date forecast. (Shortform note: In Smarter Faster Better, author Charles Duhigg interviews poker master Annie Duke, who argues that updating her beliefs about her opponents (instead of sticking to her initial assumptions) helps her avoid prejudice, which in turn makes her less likely to underestimate her opponents.)

They Think From the Outside In

Tetlock and Gardner argue that when superforecasters first encounter a question, they begin by looking at the wide perspective of that question before accounting for the specifics (in Thinking, Fast and Slow, Daniel Kahneman calls that wider perspective the “outside view”). Compare this to the “inside view,” which describes the particular details of a situation.

By nature, our storytelling minds gravitate toward the inside view. Statistics are dry and abstract—digging into the nitty-gritty details of someone’s personality is much more exciting. But that natural tendency can quickly lead us astray. If we’re told that Dr. Jones loves children and worked at a summer camp for sick children during college, we might say it’s 80% likely that Dr. Jones is a pediatrician. On the other hand, if we’re told that Dr. Jones is a very serious, reserved person and has no plans to become a parent, we might swing to the other extreme and guess 2%.

According to the authors, the problem with this practice is that we have no way of knowing how extreme those answers are. For that, we need a base rate: a sense of how common it is for doctors to specialize in pediatrics in general. In reality, only about 6.5% of doctors specialize in pediatrics. A guess of 2% is far closer to that base rate than a guess of 80%, which means the 80% guess is far more likely to be wrong.

But why does it matter which view we start with? If we’re going to adjust our initial outside-view guess based on information from the inside view, wouldn’t the reverse give the same answer? The authors argue not, due to a psychological concept called anchoring. The number we start with has a powerful hold on us, and we tend to under-adjust in the face of new information.

For example, in the pediatrician question, if we start with an inside view guess of 80%, then move to an outside view of 6.5%, anchoring means we’d bump our original number down, but only slightly—maybe to 50%. In this case, our anchor came out of thin air and significantly skewed the final results. But if we start with an anchor of 6.5% (outside view), then move to an inside view and see that Dr. Jones doesn’t seem like the type to specialize in pediatrics, we might adjust that number down to 3%; this outside-in guess is far more likely to be accurate.
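
A toy model makes it easier to see why the starting anchor matters so much (this is purely a hypothetical sketch; the under-adjustment fraction is an assumption for illustration, not a figure from the book). If each new piece of information only moves us part of the way from our current anchor toward the value it suggests, the order in which we consider the views changes the final number:

```python
def adjust(anchor, suggested, fraction=0.4):
    """Under-adjustment: move only part of the way (fraction < 1)
    from the current anchor toward what the new information suggests."""
    return anchor + fraction * (suggested - anchor)

BASE_RATE = 6.5       # outside view: % of doctors who are pediatricians
GUT_FEEL_HIGH = 80.0  # inside view: impression from the personal details
GUT_FEEL_LOW = 2.0    # inside view pointing the other way

# Inside-out: anchor on the gut feel, then adjust toward the base rate.
inside_out = adjust(anchor=GUT_FEEL_HIGH, suggested=BASE_RATE)
# Outside-in: anchor on the base rate, then adjust toward the inside view.
outside_in = adjust(anchor=BASE_RATE, suggested=GUT_FEEL_LOW)

print(f"inside-out estimate: {inside_out:.1f}%")  # stays far above the base rate
print(f"outside-in estimate: {outside_in:.1f}%")  # stays close to the base rate
```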

(Shortform note: Anchoring is an incredibly powerful form of unconscious bias. In Predictably Irrational, author Dan Ariely describes how we can unknowingly anchor to a number as irrelevant as our social security number. In one experiment, researchers asked people to write down the last two numbers of their social security number before bidding on items in a silent auction. Participants with higher digits in their social security numbers were willing to bid significantly higher for each item than participants with lower digits in their social security numbers.)

Master the Outside View With a Premortem

In Thinking, Fast and Slow, Daniel Kahneman advises using a “premortem” analysis to avoid the dangers of inside-out thinking. A premortem analysis is a mental exercise in which you imagine that whatever you’re working on (be it a project or a forecast) has already come to fruition—and was a complete disaster. Your goal is to come up with as many reasons as possible to explain this hypothetical “failure.”

This approach is helpful because, by nature, the inside view makes a situation feel “special”—it predisposes you to focus on what makes the situation unique. That feeling can make it more difficult to notice biases in your prediction because you might assume the current situation won’t abide by the usual “rules.” For example, most newlyweds probably don’t expect to ever get divorced, despite the 40-50% divorce rate. That’s because, from the inside, the relationship feels “special” or distinct from the relationships that ended in divorce.

The premortem technique can help you reorient to the outside view because assuming your answer is incorrect will likely force you to recognize that the specifics of this situation aren’t as important as the base rate. For example, if you’re predicting whether a startup will succeed, it’s tempting to take the inside view and make your forecast based on the business model or the founder’s previous business experience. However, if you try a premortem analysis, it will be easy to come up with reasons the company failed given that the failure rate for startups is roughly 90%. That sobering statistic can help remind you that even if the inside view looks like a recipe for success, the odds are stacked so strongly against new businesses that failure is much more likely.

Underreaction

Underreacting to new information is often the result of a cognitive bias sneaking into the equation. Even superforecasters are not immune to this, despite their natural self-awareness and ability to suspend judgment until they have the facts. In particular, attribute substitution (or “bait and switch”) often slips under the radar, particularly for questions about an individual person’s decisions. In that situation, we can’t help but mentally insert ourselves, and the question changes from “What will this person do in this situation?” to “What would I do in this situation?”

For example, imagine you’re predicting whether your boss will hire a certain job candidate. If you met the candidate and thought they were a terrible fit for the position, you might automatically assume your boss will share that impression and turn them down for the job. Even if you hear your boss speaking highly of that person after interviewing them, you may be so stuck on your own prediction that you underreact to the new information (that your boss was impressed by the candidate) and fail to update your prediction.

Stereotypes Are a Form of Attribute Substitution

When making predictions about individual people, we may also unconsciously rely on stereotypes, which are another form of attribute substitution. For example, imagine an American school administrator is trying to predict which of two students in their school will perform better on a standardized math test. If the administrator holds an unconscious stereotype that boys are better at math, they may automatically assume that the male student will outperform the female student because they’ve substituted the easy question, “Is this student male?” for the harder question, “Is this student good at math?”

Now, imagine the administrator gets a new piece of information: The female student is an exchange student from Estonia, a country that vastly outperforms the United States in measures of math education. Depending on the strength of the administrator’s unconscious stereotype, they may underreact to this new piece of information and still predict that the male student will do better on the test.

Overreaction

At the other end of the spectrum of belief updating is overreaction. According to the authors, the most difficult part of updating forecasts is deciding what new information is relevant and what is just noise. In some situations, it’s easy to tell the difference. For example, imagine you predicted that the U.S. president would veto a controversial new law. The next day, you learn that the president is an avid baker. Would you change your forecast?

In theory, the president’s taste in hobbies seems irrelevant to the question, so it makes sense to ignore it and keep your forecast the same. But in practice, we’re far less logical. Tetlock and Gardner argue that the more information we have about someone, the less confidently we can predict anything about them, even when that information is completely irrelevant. This is called the dilution effect, where adding new information dilutes the perceived importance of every piece of information. Irrelevant information makes us less confident in our beliefs because it rounds out the situation and makes it harder to categorize. In the example above, you might overreact to the information about the president’s baking hobby by updating your forecast because the new information makes it harder to decisively categorize the situation.

(Shortform note: There are several ways to use the dilution effect to your advantage. For example, if a family member holds sexist beliefs about women in power, you might tell them stories about your female boss’s likes and dislikes, hobbies, and quirks to round out their mental image of her and possibly lessen the strength of their sexist beliefs. On the other hand, if you want to persuade someone to do something, stick to one or two strong justifications for it—if you provide too many, you could dilute the strength of your argument, even if your reasoning is sound.)

Striking a Balance

According to the authors, superforecasters make a lot of forecasts. A superforecaster may update a single prediction as many as 77 times in the three-month open window of a question. The most accurate superforecasters update more often and in smaller increments than their peers, who are more likely to under- or over-react to new information. But how do they know how much to update by? While superforecasters don’t often use actual math in their forecasts, the authors argue that the Bayesian belief-updating equation is a helpful way to think about how much a piece of new information should affect your original forecast. In mathematical terms, the theorem looks like this:

P(H|D)/P(-H|D) = P(D|H)/P(D|-H) • P(H)/P(-H)

Posterior odds = Likelihood ratio • Prior odds

In plain English, Bayes’ theorem tells us that the new belief (“posterior odds”) should be the product of the old belief (“prior odds”) and the weight of new information (“likelihood ratio”). (Shortform note: In this case, “prior odds” are typically the “outside view” base rate that we discussed above.) While superforecasters rarely use this theorem to do actual calculations, the principle of adjusting predictions based on the weight of new information is key to their success.
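
For readers who want to see the odds-form update in action, here is a minimal sketch that applies the equation above (the specific probabilities are invented purely for illustration):

```python
def bayes_update(prior_prob, p_evidence_if_true, p_evidence_if_false):
    """Odds-form Bayes: posterior odds = likelihood ratio * prior odds."""
    prior_odds = prior_prob / (1 - prior_prob)
    likelihood_ratio = p_evidence_if_true / p_evidence_if_false
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1 + posterior_odds)  # convert odds back to a probability

# Hypothetical example: start from a 6.5% base rate that Dr. Jones is a
# pediatrician, then learn a detail that is three times as likely to be true
# of pediatricians as of other doctors.
posterior = bayes_update(prior_prob=0.065,
                         p_evidence_if_true=0.6,
                         p_evidence_if_false=0.2)
print(f"updated probability: {posterior:.1%}")  # roughly 17%
```

The new information shifts the estimate upward, but the low base rate keeps the final number far from certainty, which is exactly the discipline the authors attribute to superforecasters.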

Is it Possible to Be a Bayesian Bigot?

We’ve seen how cognitive biases like attribute substitution can trick us into relying on stereotypes to make decisions. While applying Bayes’ theorem helps forecasters avoid those cognitive biases, it doesn’t fully eliminate the risk of playing into stereotypes. That’s because Bayes’ theorem requires starting with a statistical base rate—which can reflect uncomfortable inequalities.

For example, Black Americans are nearly six times more likely than white Americans to be incarcerated at some point in their lifetime. This is what Tetlock called a “forbidden base rate” in his earlier research on moral responses to Bayesian decision-making (forbidden base rates are “forbidden” because they could be offensive to some groups and/or because they could be used to perpetuate harmful stereotypes). Therefore, if a forecaster used Bayes’ theorem to predict the likelihood that a given person had a criminal record, the forecaster would use that base rate to assign a higher probability if the person in question were Black. The forecaster, then, would be what Tetlock calls a “Bayesian bigot.”

This presents a conundrum: If Bayesian methods are the most rational way to make predictions, what does it mean when they lead us to socially biased conclusions? To answer that, we need to examine the base rate itself. For example, if we assume that Black people are incarcerated at higher rates because they commit more crimes, then it would appear that Bayesian thinking provides a rational justification for the stereotype that Black people are prone to crime.

However, there is another way to interpret the same base rate. In Biased, Dr. Jennifer Eberhardt describes how Black people who have been arrested are more likely to be held in jail until their trial than white people because of racial bias in the cash bail system. Being held in pre-trial detention increases the likelihood of accepting a plea bargain, which results in an official criminal record. With this explanation, we can accept the Bayesian conclusion—“A given Black person is statistically more likely to have a criminal record than a given white person”—without seeing it as evidence for the stereotype linking Black people and crime.

Chapter 8: The Power of a Growth Mindset

By now, we’ve learned that superforecasters are smart, numerate, well-informed, and actively open-minded. All of these traits make them fantastic forecasters, but according to the authors, what sets superforecasters apart from other forecasters more than anything else is their persistent commitment to self-improvement, which we’ll discuss in this chapter.

Growth Mindset vs. Fixed Mindset

Forecasting involves quite a bit of failure because forecasters are asked to predict the unpredictable. While no one enjoys being wrong, the authors argue that superforecasters are more likely than regular forecasters to see their failures as an opportunity to learn and improve. Educational psychologists call this a “growth mindset.” People with a growth mindset believe that talent and intelligence can be developed through learning and practice.

The idea behind the growth mindset seems intuitive, but in practice, the authors report that most of us gravitate towards a “fixed mindset” instead. The fixed mindset tells us that talent and intelligence are traits we’re born with, so practice can only strengthen the natural abilities that are already there.

Superforecasters embrace the growth mindset and aren’t discouraged by failure. Tetlock’s research found that this commitment to constant personal growth was three times more important than any other factor in superforecasters’ success.

Grow Your Own Growth Mindset

Like many other superforecaster skills, a growth mindset isn’t an inborn trait—it can be grown and developed with practice. In Mindset, psychologist Carol Dweck lays out a few concrete tips to help you transition from a fixed mindset to a growth mindset.

Grit

According to the authors, one thing people with a growth mindset tend to have in common is grit, a psychological term for steadfast perseverance toward a goal even in the face of failure. People with grit are persistent. They keep trying, failing, and trying again long after most people would have given up.

Grit is a helpful quality in any field, but the authors believe it is absolutely essential for superforecasters because, as we’ve already noted, forecasting inevitably involves failure—even failing at predictions that took months to prepare and refine. Superforecasters keep trying anyway.

Beyond just continuing to give their best effort, the authors believe that grit distinguishes the way superforecasters make their predictions in the first place. When faced with difficult questions and obsolete data, talented forecasters have been known to reach out to international agencies directly for updated numbers. Superforecasters aren’t afraid to put in the work when it comes to getting the information they need.

How to Develop Grit

Psychologist Angela Duckworth is the leading name in grit research. In her book, Grit, Duckworth breaks down the concept of grit into four necessary components: interest, practice, purpose, and hope. Here’s how to strengthen each of these areas.

Exercise: Develop a Growth Mindset

A growth mindset doesn’t always come naturally, but the more you practice, the easier it gets. This exercise will get you started with pushing past frustration.

Chapters 9-10: Forecasting Teams and Forecaster Leaders

Tetlock and Gardner note that in forecasting tournaments, superforecasters work in teams to create forecasts. The way these teams operate can make or break their predictions. The authors argue that these outcomes are more than the sum of each team member—efficient forecasting teams can reach a level of accuracy that no individual forecaster can reach on their own. (Shortform note: Superforecasters are such natural team players that grouping them into teams reduces the impact of bias on their forecasts even more than specific anti-bias training does.)

In this chapter, we’ll discuss the factors that make a good forecasting team, as well as the attributes that superforecasters have that make them team players.

The Dangers of Groupthink

Tetlock and Gardner argue that one of the biggest challenges for a successful team is to avoid groupthink, or the phenomenon in which well-bonded teams gradually lose sight of critical thinking. When groupthink takes hold, group members no longer challenge each other’s assumptions. Instead, they unconsciously develop a shared worldview, and group members take this unanimity as a sign that the group’s conclusions must be correct.

The authors argue that groupthink usually creeps in unintentionally and can derail even the most well-intentioned teams. Teams with a strong belief in a shared mission might reinforce each other’s beliefs that support the mission and collectively ignore contradictory evidence or opinions, which can skew their forecasts.

(Shortform note: Superforecasting teams might be able to avoid groupthink in part because their collaboration happens online rather than in person. In Quiet, author Susan Cain argues that teams that collaborate digitally are less likely to develop groupthink and more likely to produce quality ideas. This may be because it’s harder for one dynamic personality to steamroll the whole group while interacting online than it would be during an in-person meeting.)

What Makes a Good Team?

Avoiding groupthink is an important part of building a successful team. According to the authors, this sets the group up for success by ensuring that each member of the group forms their own judgments independently and has an opportunity to present their thoughts to the team. This culture of openly sharing ideas is crucial for a successful superforecasting team.

To promote the free exchange of ideas, the authors believe that groups need to foster an environment that is “psychologically safe,” where all team members feel comfortable voicing opinions, challenging each other respectfully, and admitting when they don’t know the answer. For teams with leadership hierarchies, a psychologically safe environment is one in which everyone, regardless of status, feels comfortable offering constructive criticism to higher-ups without fear of repercussions. For example, in a psychologically safe business environment, a rookie employee could comfortably challenge the boss’s idea without worrying about being fired, and a boss could comfortably say “I was wrong” without worrying about losing the respect of her employees.

(Shortform note: Most of the research on psychological safety focuses on hierarchical environments with traditional boss/employee or leader/follower structures. However, superforecasting teams are non-hierarchical because there is no designated leader—every team member is on equal footing. It’s not clear how that lack of hierarchy might change how superforecasting teams create or experience psychological safety.)

Active Open-Mindedness

Tetlock and Gardner argue that creating a psychologically safe environment requires active open-mindedness (AOM), or the deliberate pursuit of new perspectives that challenge your worldview.

Individual superforecasters tend to have above-average AOM scores. However, research shows that throwing a few high-AOM superforecasters onto a team together doesn’t necessarily create a high-AOM team. The authors argue that a team’s AOM score reflects how team members engage with one another when they disagree. A random group of open-minded thinkers won’t be a successful team unless the group as a whole can work well together.

(Shortform note: In Principles: Life and Work, author and successful investor Ray Dalio advises cultivating open-mindedness by assuming you have certain blind spots that others don’t have. In that view, one way for a team to raise its collective AOM score is to have each member share their ideas aloud so that the group can help identify any blind spots and suggest ways to strengthen each other’s ideas.)

How to Create Psychological Safety

So how can group leaders create a sense of psychological safety? The authors advise starting by getting comfortable with frank, respectful discussion rather than dancing around an issue. It’s not enough to say you welcome critical feedback: Every member of the group needs to proactively ask for it from their teammates and genuinely thank those who point out flaws in their ideas. Additionally, group leaders should wait to share their opinions until the rest of the team has chimed in. That way, group members who might be hesitant to contradict the higher-ups will feel free to speak openly.

Other Ways to Create Psychological Safety

As mentioned above, most of the research on increasing psychological safety assumes the group has a traditional hierarchy. In addition to soliciting critical feedback, groups without designated leaders (like superforecasting teams) may need to rely on some non-traditional strategies to promote psychological safety.

In Trillion Dollar Coach, authors Eric Schmidt, Jonathan Rosenberg, and Alan Eagle argue that one way to promote psychological safety is to increase the diversity of the team. This sends a clear signal to every team member that their differences are valued, which ultimately increases their confidence to voice their disagreement with other team members’ ideas.

Another way for non-hierarchical teams to create psychological safety is to prioritize group bonding. In The Culture Code, author Daniel Coyle argues that laughter is a major indicator of psychological safety. Coyle believes that creating a culture of fun within the group (through structured events, casual conversations, or both) promotes bonding and increases psychological safety.

Givers and Takers

According to the authors, another feature of successful teams is the way they freely share resources with each other. Psychologist Adam Grant calls people who give more than they receive “givers.” He compares them to “matchers” (who give and take in equal measure) and “takers” (who take more than they give). Tetlock and Gardner argue that successful superforecasting teams tend to be stacked with givers. This creates a culture of generosity, which contributes to the feeling of psychological safety that is critical for success.

The authors believe that this culture of generosity is especially critical for superforecasting teams because they have no designated leader. Everyone has an equal share of responsibility for the team’s performance. In dysfunctional teams, the lack of structure might result in one person dominating the discussion or being saddled with most of the work. But superforecasting teams avoid these traps surprisingly well because they’re composed of equally dedicated volunteers. Everyone on the team eagerly does their part because they’re natural givers who are passionate about forecasting.

Superforecasting Teams Are Optimally Distinct

In his book Give and Take, Adam Grant offers another clue as to what makes superforecasting teams so prone to generosity. Grant argues that we’re more motivated to help people who are part of our own social and identity groups (which can be anything from immediate family to school classmates to fellow football fans). Additionally, the more unique the group is compared to the dominant culture, the more inclined members are to help one another (this is called “optimal distinctiveness”). Participating in forecasting tournaments is a rare hobby, and superforecasters are a unique subgroup of forecasters; that uniqueness might strengthen their group identity and make them even more likely to share resources with one another.

Can Good Forecasters Be Good Leaders?

We’ve learned that forecasters are often good team players—but are they good leaders? According to the authors, leaders need to be confident, but superforecasting requires the humility to admit when you don’t know the answer and to acknowledge that bias might cloud your judgment. How can one person balance both attributes and be both a good forecaster and a good leader? The key is intellectual humility: acknowledging the power of randomness and accepting that some things are impossible to predict or control, regardless of your skill.

Professional poker player Annie Duke describes this as the difference between “humility in the face of the game” and “humility in the face of your opponents.” Duke’s long record of success indicates that she is an exceptionally talented poker player and is probably more skilled than most of her opponents. But for all of her skill and experience, she won’t automatically win every game, nor can she fully understand every possible intricacy of the game. Her skills allow her to beat her opponents but not the game itself.

To Foster Humility, Understand the Role of Luck in Success

Annie Duke’s distinction between “humility in the face of the game” and “humility in the face of your opponents” reflects author Nassim Taleb’s views on luck and success. In Fooled by Randomness, Taleb argues that, while skill is a good predictor of moderate success, luck is a better predictor of wild success (this is especially true in industries that are intrinsically tied to luck, such as investing). Based on Taleb’s ideas, we should remain humble even when we succeed because wild success (the kind that breeds fame and fortune) isn’t evidence of extraordinary skill—it’s evidence that we got really lucky.

Similarly, Duke understands that winning at poker requires a certain degree of luck; if she were extremely skilled but terribly unlucky, she’d be able to carve out a decent record, but she certainly wouldn’t be the champion player she is now. Therefore, Duke is able to remain humble because she understands that no matter how well she plays, she’s always one streak of bad luck away from a loss.

Exercise: Identify Psychological Safety

When everyone feels safe enough to speak their mind, the whole group benefits. Learn how you can apply this to your own life.

Chapter 11: Does Superforecasting Really Matter?

Superforecasting is an impressive skill, and certain superforecaster traits can turn good leaders into great ones. But Tetlock and Gardner caution that human brains are not built for objective analysis. As we’ve noted, even the best superforecasters are not immune to cognitive bias. So what is the point of developing superforecasting abilities if we are always one unchecked assumption away from being completely wrong? In this chapter, we’ll explore that question.

The Superforecaster vs. the Black Swan

According to the authors, the value of any kind of forecasting is predicated on the assumption that it’s possible to predict meaningful future events in the first place. This is not a universally accepted idea.

One strong critic of superforecasting is author and former Wall Street trader Nassim Taleb, who argues that the only truly important events in the course of history have been completely unpredictable. He calls these “black swan events.” (Shortform note: In The Black Swan, Taleb describes another characteristic of black swan events that Tetlock and Gardner don’t address: the fact that humans struggle to accept the inherent unpredictability of black swans. In the aftermath of a black swan event, people often try to argue that the event was, in fact, predictable, despite the fact that no one successfully predicted it.)

If Taleb is right, there is no point to developing forecasting skills, because the only events that matter are the ones that cannot possibly be predicted.

So is forecasting a fool’s errand? For that to be true, the authors argue, we need to accept both of Taleb’s conclusions: that black swan events are literally impossible to predict, and that only black swan events change the course of history. Let’s explore both of these conclusions in detail.

Are “Black Swans” Totally Unpredictable?

If the term “black swan” only describes truly unpredictable events, then the authors believe there have been very few actual black swans in history. The most commonly cited black swan example is the 9/11 terrorist attacks. But even 9/11 was not completely impossible to predict—similar attacks had been thwarted in the past, and the intelligence community was actively examining the threat.

In practice, that evidence means that while it would have been extremely difficult to predict the exact date and time of the 9/11 attacks, it was entirely possible to predict a terrorist attack in which a plane was turned into a flying bomb; in fact, several people made exactly that prediction before the event.

(Shortform note: Taleb would disagree that 9/11 was at all predictable. In fact, in The Black Swan, he argues that the 9/11 attacks weren’t just unpredictable—they happened because they were unpredictable (because if we’d been able to predict the attacks, we’d have prevented them happening in the first place). His concept of “successfully predicting” black swans seems to rely on predicting them exactly, including the specific time, place, and people involved.)

But what if we take “black swans” to mean, in Taleb’s words, “highly improbable consequential events,” rather than events that can’t possibly be predicted? In that case, Taleb’s logic becomes easier to swallow. The authors argue that black swans, even defined more loosely, are still incredibly rare by definition, and it could take hundreds or even thousands of years to generate the amount of data that would allow us to calibrate how accurately we can predict them. In that sense, Taleb is correct—trying to predict black swan events is pointless.

Black Swan Predictions Get Lost in the Noise

We’ve seen that many people predicted an attack similar to the events of September 11, 2001. In the face of that evidence, we could argue that the issue is not a pure inability to predict these events—the issue is that even if a few people do predict a black swan, there’s no real reason to listen to those particular people over the millions of others predicting things that will never happen. Data scientists call this extraneous information “noise,” and it obscures the “signal” we need to focus on. In Fooled by Randomness, Taleb describes how noise makes it nearly impossible to identify in advance which signals will end up being relevant. In other words, the dots are there, but we can only connect them after the fact, once we know how things turned out.

Are “Black Swans” the Only Events That Matter?

The second part of Taleb’s logic is that only black swan events change the course of history. This claim is much easier to dispute, as we have evidence that non-black swan events have changed the world—the gradual development of technology and slow growth of the global economy have had enormous consequences. Improved technology has led to advances in medicine, hygiene, infrastructure, and so on—all undeniably important developments with no single black swan cause.

COVID-19: The “White Swan” That Changed the World

The COVID-19 pandemic blew a massive hole in Taleb’s theory that black swans are the single most important events in human history. While some people have described the pandemic as a black swan, Taleb argues that the COVID-19 pandemic is an archetypal white swan because so many people warned of the exponential risks of a pandemic in an age of global connectivity—including Taleb himself in a January 2020 paper on the topic.

However, while COVID-19 may have been a white swan, there is no doubt that both the pandemic and its social, political, and economic fallout have fundamentally changed many aspects of daily life for people all over the globe. Taleb conceded this point in a 2021 interview in which he predicted how the pandemic would permanently impact certain industries (such as real estate) and how countries with unsuccessful pandemic responses would struggle in the future. In other words, the pandemic is unimpeachable evidence that predictable events can be even more impactful than black swans.

The Consequences Are Predictable, Even if the Event Isn’t

Tetlock and Gardner believe that it’s also important to consider where we draw the boundaries of an event itself. For example, we don’t talk about the 2008 financial crisis as a black swan event just because it was unpredictable—unpredictable things happen all the time without earth-shattering ramifications. But the 2008 financial crisis marks a turning point in world history because it launched a worldwide economic recession. In other words, what makes black swans so important is not just the event itself but the consequences.

This presents a challenge to Taleb’s logic. While the details of the 2008 recession may have been unpredictable, the consequences were not. Superforecasters have made accurate forecasts on questions like the likelihood that a government will bail out certain institutions or how much unemployment rates will rise during a recession. If black swans are important in part because of their consequences, and superforecasters can predict some of those consequences, then forecasting itself is a useful enterprise.

Becoming “Antifragile” Requires Anticipating Consequences

Tetlock and Gardner present this emphasis on consequences as part of their rebuttal to Taleb’s criticism of formal forecasting. As a reminder, Taleb thinks forecasting is pointless because the only events worth predicting are, by his definition, unpredictable. As we’ve just seen, Tetlock and Gardner disagree: The consequences of important events are possible to predict even if the events themselves are not, and those consequences matter too.

Taleb would most likely agree that the consequences of black swan events are a big part of what makes them so important. His 2012 book, Antifragile, primarily focuses on the consequences of black swan events.

However, unlike Tetlock, Taleb isn’t all that concerned with predicting the likelihood of any given consequence. Instead, his goal is to be “antifragile,” which means to position himself in such a way that he’s protected from the consequences of negative black swans while still poised to benefit from positive black swan outcomes (for example, by making a fortune investing in Apple or Google when they first started out and weren’t guaranteed successes). In Taleb’s eyes, trying to predict which consequences are likely to occur is futile because prediction is so tricky to get right, and the most influential events are unpredictable anyway. It’s therefore a better use of time and energy to focus on strengthening our position to the extent that we can weather (or exploit) any and every consequence.

How Do We Prepare for the Unpredictable?

Taleb, Kahneman, and Tetlock all agree that trying to predict events more than a few years in the future is pointless because too much can change in that amount of time. But ignoring potential future events is often not an option: Even if those events are unlikely, their possible consequences can be too severe to dismiss.

For example, there is a network of underground bunkers beneath the White House designed to keep the American president, the president’s immediate family, and key cabinet members safe in an emergency (such as a terrorist attack). The most recent addition to this network cost at least $86 million, an enormously high price tag for something that, so far, hasn't been used. However, losing these key members of government in a terrorist attack would be catastrophic, so the preparation is arguably justified.

All of this preparation depends on context. Why isn’t there an $86 million bunker beneath each state governor’s residence? Because the probability of disaster must be high enough to justify the cost of preparation (and the White House is a likelier target for an international terrorist attack than a governor’s mansion).
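This trade-off is, at heart, an expected-value calculation. The sketch below (which uses entirely hypothetical probabilities and dollar figures, not numbers from the book) shows one simple way to frame the decision: Prepare when the probability of disaster multiplied by the loss it would cause exceeds the cost of preparing.

```python
# A rough sketch of the expected-value logic behind preparation decisions.
# All probabilities and dollar amounts below are hypothetical illustrations.

def worth_preparing(p_disaster: float, loss_if_unprepared: float,
                    preparation_cost: float) -> bool:
    """Prepare if the expected loss avoided exceeds the cost of preparing."""
    return p_disaster * loss_if_unprepared > preparation_cost

# White House: small probability, but a catastrophic loss -> preparation justified.
print(worth_preparing(p_disaster=0.001,
                      loss_if_unprepared=500_000_000_000,
                      preparation_cost=86_000_000))  # True

# Governor's mansion: far smaller probability and smaller loss -> not justified.
print(worth_preparing(p_disaster=0.00001,
                      loss_if_unprepared=50_000_000_000,
                      preparation_cost=86_000_000))  # False
```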

(Shortform note: In situations where there is a small risk of big disaster, people often choose to ignore the situation entirely instead of preparing for the worst. For example, Tetlock believes that although many people predicted the COVID-19 pandemic, governments still weren’t adequately prepared for a pandemic because it’s “too easy to tune out chronic low-probability risks.”)

Exercise: Weigh the Impact of Black Swan Events

Let’s test the theory that unpredictable events make a bigger impact than gradual change.

Chapter 12: The Future of Forecasting

It’s clear by now that forecasting is important, even if there is debate about the relative importance of predictable events. The authors believe that forecasting tournaments are particularly important because they provide opportunities for superforecasters to sharpen their skills and for researchers to test theories about what makes some forecasters more accurate than others.

But forecasting is not always about accuracy. In reality, forecasters (especially those in the public eye) may have other goals for their forecasts. If the goal is to provoke a person or group or to draw attention to a cause, being right is an afterthought. (Shortform note: We can see this in the case of doomsday predictions. Many of the people who predicted the end of the world on a particular date were religious leaders who were more concerned with attracting followers to their cause than with the accuracy of their predictions.)

Big and Little Questions

According to the authors, the field of forecasting is facing another challenge in addition to concerns about accuracy: Namely, the idea that the questions people really care about and need to answer are typically too big for a forecaster to even attempt. For example, a solid superforecaster can predict the likelihood that China will begin closing any of its hundreds of coal plants (which experts say could help the country meet its environmental goals), but they can’t answer the real question people are asking: “Will we be able to prevent the most devastating effects of climate change?” Even the best superforecaster can’t answer that.

This is a valid criticism—luckily, Tetlock and Gardner argue that there is a way around it. Like Fermi, we can break big questions like “Will things turn out okay?” into a host of smaller questions that superforecasters can answer. This approach is called Bayesian question clustering: Each small answer contributes a piece of the overall picture, and cumulatively, those pieces can approximate an answer to the bigger question.

For example, if we ask enough questions about factors that could contribute to worsening climate change, we know that the more “yes” answers we get to the small questions (for example, whether sea levels will rise by more than one millimeter in the next year, or whether the United States government will invest more money in solar energy), the more likely the big question is also a “yes.”
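To make this concrete, here is a minimal sketch of how “yes” answers to small questions might be aggregated into an estimate for the big question, using simple Bayesian updating. The prior and the likelihood ratios below are invented for illustration; they aren’t values from the book or from any real forecasting tournament.

```python
# A minimal sketch of Bayesian question clustering with invented numbers.
# Each answered small question carries a hypothetical likelihood ratio: how much
# more likely a "yes" answer to it is if the big question's answer is also "yes".

def update_probability(prior_prob, likelihood_ratios):
    """Convert the prior to odds, apply each small question's likelihood ratio,
    then convert back to a probability."""
    odds = prior_prob / (1 - prior_prob)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)

# Start at 50% on the big question (e.g., "Will climate change worsen significantly?").
# Three small questions came back "yes" (sea levels rose by more than a millimeter,
# coal plants stayed open, and so on), each nudging the big answer toward "yes".
posterior = update_probability(0.5, [1.5, 2.0, 1.2])
print(f"Estimated probability for the big question: {posterior:.2f}")  # ~0.78
```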

(Shortform note: This technique may help to answer a common critique of forecasting: that it is an example of the “streetlight effect,” or the equivalent of looking for your lost keys under the streetlight—even if that’s not where you lost them—because that’s where the light is best. This is related to black swan thinking—whatever future events you can predict (metaphorically shine a light on) won’t matter because the only truly important events are, by definition, unpredictable. To see the utility of Bayesian question clustering, we can change the metaphor a bit: If a forecaster searches for multiple puzzle pieces under the streetlight as opposed to a single set of keys, they may find enough pieces to at least see the gist of the whole puzzle—even if half the pieces are still lost in the dark.)

Asking Good Questions

To get an accurate picture of the big question, the authors argue that we need to ask relevant small questions that avoid bias and cover every angle of the big question. The best questions are the ones we look back on after the event and curse ourselves for not asking sooner.

By definition, these kinds of questions are difficult to think up in advance. They are often driven by a central big idea and so don’t come naturally to fox-minded superforecasters. This is where the hedgehogs we encountered in Chapter 3 excel. The vague, ideologically-driven predictions made by hedgehog-style pundits aren’t solid forecasts, but they do raise excellent questions.

In other words, superforecasters are excellent at finding answers to questions but not at choosing the right questions to answer in the first place. For that, they need the complementary strengths of the hedgehogs. A symbiotic working relationship between foxes and hedgehogs would maximize the skills of both groups.

To Ask Good Questions, We Need Diverse Hedgehogs

In Superforecasting, Tetlock and Gardner describe the importance of hedgehog thinkers generating questions for foxes to solve. Elsewhere, Tetlock describes a related concern: the need for hedgehogs with diverse Big Ideas all contributing to the same conversation.

In 2014, Tetlock and a group of coauthors published a paper on the dangers of political homogeneity in the social sciences. The authors argue that the social sciences are less politically diverse than ever (with up to 66% of social science professors in the United States identifying as liberal and only 5-8% identifying as conservative). In their view, this lack of diversity presents a serious challenge to the validity of the field in part because it influences the questions that researchers think to ask in the first place (for example, liberal researchers typically don’t study the possibility of stereotype accuracy because rejecting racial and gender stereotypes is a liberal value).

By the same logic, it’s possible for those asking forecasting questions to fall into the same ideological traps. In order to generate the full range of useful forecasting questions, we need hedgehog thinkers from across the ideological spectrum, including liberals and conservatives as well as libertarians and moderates.

Forecasting and Evidence-Based Policy

Forecasting can play an important role in almost any field, but Tetlock and Gardner believe its biggest potential lies in evidence-based policymaking. In the current political climate, policy debates often result in both sides becoming further entrenched in their positions, no matter how much evidence is stacked against them. Ultimately, debates like these go nowhere.

For more productive conversations, the authors recommend adversarial collaboration, in which people on both sides of an argument agree to focus on a shared goal and to make specific, testable forecasts about the best way to reach that goal. This also requires both parties to agree to follow the evidence, even if it means abandoning their original position. Adversarial collaboration requires an assumption of good faith on both sides. This is a lot to ask of people accustomed to fierce debate, especially because it usually results in a blended answer where neither side is completely “right” or “wrong.”

Tetlock and Taleb’s Failed Adversarial Collaboration

Adversarial collaboration isn’t easy, even for experienced academics like Tetlock. In 2013, he and Nassim Taleb attempted a form of adversarial collaboration in which they co-wrote a paper on the difference between predicting events with a yes/no outcome versus predicting events with many possible outcomes. The resulting paper was never published in an academic journal (Tetlock says it was “stillborn”), and the experience of working together only worsened their professional relationship. In the following years, Taleb published his criticisms of Tetlock as a coauthor and a scientist and christened him “Phil the Rat.” He argues that the experience wasn’t actually “collaboration” of any kind because he wrote most of the paper himself. Tetlock, for his part, tweeted that he was “ashamed” to have worked with Taleb at all.

Why did Tetlock and Taleb’s attempt at adversarial collaboration fail? It’s hard to say for sure, but it’s possible the two authors overestimated how much their beliefs overlapped. According to Daniel Kahneman, when two authors who agree on very little attempt adversarial collaboration, it’s best to write in two voices (and possibly even have an arbiter present) so that neither author feels pressured to accept the other’s interpretations of their joint research.