The Envelope, Probability Theory, and the Black Swan
The Envelope
Test pilots have a term that describes the known operating
characteristics of an airplane: “The Envelope”. The Envelope is the range of known, safe,
operating parameters, such as top speed, stall speed, angle of attack, maximum
bank, etc. When the test pilot tries to
extend the range of operating parameters, he is operating “outside of the
Envelope”, and the airplane may perform in unexpected or dangerous ways. There are abrupt discontinuities in the
physics of flight, and test pilots’ reputation for courage is
well-deserved.
The concept of “The Envelope” applies to our total
experience, as individuals, as businesses, as nations, or as a species. When we are within the envelope of our
experience, events unfold more or less as we encountered them before. When we are outside the envelope of our
experience, our theories about probability break down. We are in a place where established rules do
not apply, and where our experience is irrelevant, or worse, misleading. We are in the realm of the Black Swan. We will return to the Black Swan, but first
we need to step back and consider the history of probability.
---------
Probability Theory
Our lives are ruled by probability. Will the stock market rise or fall? Will tomorrow be sunny or rainy? Will I get a raise at work? Will the Dodgers win the pennant? Will I have an automobile accident? Will I win the lottery? Will I be late to the airport? Will she get pregnant? Will I die of cancer? From the mundane to the life-shattering,
every day is filled with uncertainty about events, and throughout our lives, we
develop ways to understand those probabilities and to anticipate the results.
A Priori Knowledge and Discrete Outcomes
Gamblers were the first to take an interest in probability
theory, for obvious reasons. In the 16th
and 17th centuries, gamblers and mathematicians (often one and
the same) began to develop the theory of probability in the simplest cases. They learned how to calculate the odds for
any event with a set of known a priori
parameters and discrete outcomes. A priori (“from earlier”) means
that we inspected the dice before the throw, or counted the cards before the
deal. Discrete outcomes are critical; when we throw dice we do not expect the outcome to be eight and three-quarters, or the king of spades. If we consider all of the possible discrete outcomes, we
can calculate precisely the odds of winning or losing (assuming that the dealer
isn’t cheating).
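As a minimal sketch of this kind of calculation, the example below enumerates every discrete outcome for two fair six-sided dice and derives exact odds for each total. The function name and the choice of two dice are illustrative assumptions, not something taken from the historical record.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def two_dice_odds():
    """Return the exact probability of each total for two fair six-sided dice."""
    # A priori knowledge: each die has faces 1..6, all equally likely.
    faces = range(1, 7)
    # Count how many of the 36 equally likely outcomes produce each total.
    totals = Counter(a + b for a, b in product(faces, repeat=2))
    n_outcomes = 6 * 6
    return {t: Fraction(c, n_outcomes) for t, c in sorted(totals.items())}

odds = two_dice_odds()
for total, p in odds.items():
    print(f"{total:2d}: {p} ({float(p):.3f})")
```

Because the outcomes are discrete and known in advance, the probabilities are exact fractions that sum to one, and a total such as eight and three-quarters simply never appears in the table.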
Non-A Priori Cases
But what about situations without a priori knowledge?
In the eighteenth century, mathematicians considered the
problem of a bag filled with an unknown number of white balls and black
balls. We are not given the opportunity
to inspect the bag, or count the balls.
The probability of drawing either white or black can only be discovered
through experience. As our experience
grows, we gain confidence about the ratio of black to white balls in the bag,
although we can never be completely certain of the total probability until
we have drawn the final ball from the bag.
Modern sampling theory can quantify the degree of confidence in the
probability as a function of how many balls we have drawn (i.e. the extent of
our experience).
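To illustrate how confidence grows with experience, here is a rough sketch using the Wilson score interval, one common choice among several. It assumes each ball is returned to the bag after drawing (or that the bag is much larger than the sample), which is a simplification of the problem as stated; the function name and the sample counts are hypothetical.

```python
import math

def wilson_interval(black_drawn, draws, z=1.96):
    """Approximate 95% Wilson score interval for the fraction of black balls."""
    if draws == 0:
        return (0.0, 1.0)  # no experience yet: anything is possible
    p_hat = black_drawn / draws
    denom = 1 + z**2 / draws
    center = (p_hat + z**2 / (2 * draws)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / draws + z**2 / (4 * draws**2)) / denom
    return (max(0.0, center - half), min(1.0, center + half))

# Same observed ratio (30% black) at three levels of experience.
for black, draws in [(3, 10), (30, 100), (300, 1000)]:
    low, high = wilson_interval(black, draws)
    print(f"{draws:5d} draws: between {low:.3f} and {high:.3f}")
```

The observed ratio is identical in all three cases, but the interval narrows as the number of draws increases.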
--------
Bayes’ Theorem
Bayes’ Theorem, developed in the eighteenth century, applies
to probability problems without knowledge
of a priori conditions. Bayes’
theorem is used to combine a subjective estimate of probability with previous experience
about the actual rate of occurrence.
Suppose a bird watcher claims to have seen a rare species of
duck. Previous experience shows that in
nature, the common bird is observed 99% of the time; the rare bird 1% of the
time. And suppose our experience also
shows that amateur bird watchers accurately identify birds 90% of the time, and
mis-identify birds 10% of the time.
Bayes’ theorem gives the probability that the bird watcher
actually saw the rare species as only about 8%. (Bayesian calculator: http://psych.fullerton.edu/mbirnbaum/bayes/BayesCalc.htm)
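The arithmetic behind that figure can be sketched in a few lines. This assumes the bird watcher is equally accurate for both species (90% either way); the function name is illustrative.

```python
def posterior_rare(prior_rare, accuracy):
    """P(bird really is rare | watcher reports rare), by Bayes' theorem.

    Assumes the same identification accuracy for rare and common birds.
    """
    # Rare bird present and correctly identified.
    true_report = accuracy * prior_rare
    # Common bird present but mistaken for the rare one.
    false_report = (1 - accuracy) * (1 - prior_rare)
    return true_report / (true_report + false_report)

p = posterior_rare(prior_rare=0.01, accuracy=0.90)
print(f"{p:.3f}")  # 0.009 / (0.009 + 0.099) = 1/12
```

The false reports (10% of the 99% common birds) swamp the true reports (90% of the 1% rare birds), which is why the result is so low.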
Consider the case of the Ivory-Billed Woodpecker, which is
generally regarded as extinct since the last confirmed sighting in 1944. In 2004, scientists from Cornell University
claimed a sighting of the bird, creating a swarm of media interest. Here is a blurry 2-second video which
documents their sighting:
Is this the long-missing Ivory-Billed Woodpecker, or the
similar, but relatively common Pileated Woodpecker? What does Bayes’ Theorem say?
The actual population of the Ivory-Billed (assuming it
exists) must be very low, compared to the Pileated Woodpecker. Let’s assume there may be a total population
of 10 Ivory-Billed Woodpeckers, and about 10,000 Pileated Woodpeckers. Further, let’s assume that our scientists,
paddling a canoe through a swamp, can identify a flying bird correctly 85% of
the time. Then we can do the math: The probability that the observed
bird was actually the rare Ivory-Billed Woodpecker is only about 6 out of 1000.
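For readers who want to check the math, here is a small sketch working directly from the assumed population counts. It treats a sighting as a random draw from the combined population and assumes the 85% accuracy applies equally to both species; the function name is hypothetical.

```python
def posterior_from_counts(rare_count, common_count, accuracy):
    """P(bird is the rare species | observers report the rare species)."""
    # Expected weight of correct "rare" reports.
    true_reports = accuracy * rare_count
    # Expected weight of common birds misreported as rare.
    false_reports = (1 - accuracy) * common_count
    return true_reports / (true_reports + false_reports)

p_ivory = posterior_from_counts(rare_count=10, common_count=10_000, accuracy=0.85)
print(f"{p_ivory:.4f}")  # 8.5 / (8.5 + 1500)
```

Under these assumptions the result is roughly 5.6 chances in 1000.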
This result brings to mind a rule of scientific analysis
from Cornell’s most famous scientist, Carl Sagan: “Extraordinary claims require
extraordinary evidence.” Bayes’
Theorem is the mathematical expression of that principle.
One application of Bayes’ Theorem is to calculate the true
probability of an event, using subjective guesses about an event, and the
actual rate of occurrence. The method
is essentially a “force-fit” of subjective guesses to the actual rate of
occurrence, using the historical accuracy of previous guesses.
We can extend the Bayesian method to subjective estimates of
success used by petroleum geologists in prospect appraisal.
In oil exploration, geologists assign a probability of commercial success to each prospect. Over time, the cumulative experience of success and failure provides a means to review the accuracy of the predictions, provided that the predictions were made using the same methodology.
The following chart shows 74 exploration prospects drilled between 1996 and 2000. The probability of success is the vertical axis, and prospects are shown in rank order, according to geological chance of success. Successes are color coded in red, and failures in blue. The chart demonstrates that geologists' estimates have merit; successful prospects generally occur on the left side of the chart. But let's take a closer look. Are the estimates quantitatively correct? Can the estimates be improved with the Bayesian technique?
The second chart shows the prospect portfolio, roughly in thirds according to risk. The upper third (highest chance of success) slightly outperformed the estimates, with 57% actual success, compared to 48% estimated success. The middle group moderately underperformed the estimates, with 17% actual success compared to 28% in estimates. And the bottom third substantially underperformed the estimates, with only 7% success, compared to 18% predicted success.
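The comparison by thirds can be sketched as follows. The twelve prospects below are invented purely for illustration (the original 74-prospect data set is not reproduced here), and the function name is hypothetical.

```python
# HYPOTHETICAL data: (estimated chance of success, 1 = discovery, 0 = dry hole).
prospects = [
    (0.55, 1), (0.50, 1), (0.45, 0), (0.40, 1),
    (0.32, 0), (0.30, 1), (0.27, 0), (0.25, 0),
    (0.20, 0), (0.18, 0), (0.15, 0), (0.10, 0),
]

def calibration_by_group(prospects, n_groups=3):
    """Rank prospects by estimate, split into groups, compare estimate to outcome."""
    ranked = sorted(prospects, key=lambda p: p[0], reverse=True)
    size = len(ranked) // n_groups
    rows = []
    for g in range(n_groups):
        # The last group absorbs any remainder.
        end = len(ranked) if g == n_groups - 1 else (g + 1) * size
        chunk = ranked[g * size:end]
        mean_estimate = sum(e for e, _ in chunk) / len(chunk)
        actual_rate = sum(o for _, o in chunk) / len(chunk)
        rows.append((mean_estimate, actual_rate, len(chunk)))
    return rows

rows = calibration_by_group(prospects)
for mean_estimate, actual_rate, n in rows:
    print(f"n={n}: estimated {mean_estimate:.0%}, actual {actual_rate:.0%}")
```

A well-calibrated portfolio would show estimated and actual rates close to each other in every group.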
We can look at the performance of the entire portfolio by summing the estimated probability of success, to create a curve showing the cumulative predicted number of discoveries across the portfolio (blue curve). We can compare the actual cumulative discoveries (red curve), which rises by integer steps over the successful prospects. Actual results closely parallel the predictions to the midpoint (about 30% chance of success). Actual results then trail the predictions to the bottom third (about 20% chance of success), where the actual results go flat, showing no success corresponding to prospects estimated at less than 20% chance of success.
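A minimal sketch of how the two cumulative curves might be built is shown below. The estimates and outcomes are made-up illustrative values, not the data behind the chart, and the function name is an assumption.

```python
from itertools import accumulate

# HYPOTHETICAL portfolio: estimated chance of success and drilling outcome.
estimates = [0.55, 0.50, 0.45, 0.40, 0.32, 0.30, 0.27, 0.25, 0.20, 0.18, 0.15, 0.10]
outcomes = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0]  # 1 = discovery, 0 = dry hole

def cumulative_curves(estimates, outcomes):
    """Return (predicted, actual) cumulative discoveries in rank order."""
    # Rank prospects from highest to lowest estimated chance of success.
    order = sorted(range(len(estimates)), key=lambda i: estimates[i], reverse=True)
    # Summing probabilities gives the expected number of discoveries so far.
    predicted = list(accumulate(estimates[i] for i in order))
    # Actual discoveries rise by integer steps at each success.
    actual = list(accumulate(outcomes[i] for i in order))
    return predicted, actual

predicted, actual = cumulative_curves(estimates, outcomes)
for rank, (p, a) in enumerate(zip(predicted, actual), start=1):
    print(f"{rank:2d}: predicted {p:.2f}, actual {a}")
```

Where the actual curve goes flat while the predicted curve keeps climbing, the estimates for those prospects were too optimistic.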
Using Excel, we can run a regression on the curve of cumulative actual discoveries, relating the rate of actual discoveries to the rate of predicted discoveries. The resulting function adjusts the estimated probabilities to match actual results, and provides a means to forecast future probabilities.
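The exact regression used in the spreadsheet is not described, so the sketch below substitutes a simple stand-in: an ordinary least-squares line fitted to outcome versus estimate, clipped to the range 0 to 1. The data, function names, and the choice of a straight line are all assumptions made for illustration.

```python
# HYPOTHETICAL history: pre-drill estimates and drilling outcomes.
estimates = [0.55, 0.50, 0.45, 0.40, 0.32, 0.30, 0.27, 0.25, 0.20, 0.18, 0.15, 0.10]
outcomes = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0]  # 1 = discovery, 0 = dry hole

def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def adjust(estimate, slope, intercept):
    """Map a subjective estimate to a history-adjusted probability in [0, 1]."""
    return min(1.0, max(0.0, intercept + slope * estimate))

slope, intercept = fit_line(estimates, outcomes)
for e in (0.10, 0.30, 0.50):
    print(f"estimate {e:.2f} -> adjusted {adjust(e, slope, intercept):.2f}")
```

With this invented history the fitted slope is greater than one, so high estimates are raised and low estimates are pushed toward zero. A real application would need far more than twelve prospects, and a curve other than a straight line may fit better.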
The same function can be applied in a predictive fashion to a new portfolio of prospects. The second group shares characteristics of the first group of prospects. Success is concentrated in the lower-risk part of the portfolio. Actual success is greater than predicted in the low-risk part of the spectrum, and success is almost absent in the higher-risk part of the spectrum. The function derived from the first prospect group is not a perfect fit, but improves the fit of pre-drill estimates to actual results. The cumulative success for the program using the Bayesian-adjusted probability of success is very close to the actual success of the program.
Tracking predictions and results, combined with Bayesian methods, allows the calculation of true probabilities from subjective estimates. The method can be used iteratively to adapt to changes in exploration technology or improved estimates by the geological staff. The process provides a way to obtain quantitatively better risk-adjusted investment decisions, and to concentrate attention on prospects most likely to yield success.
-------
Estimation of Low-Probability Events
People are very good at estimating probabilities in the middle range. We fairly accurately assess the flip of a coin, the outcome of a college football game, or even the chance of a full house in five-card poker (0.144%). But people are terrible at estimating chances on the ends of the probability spectrum. We are usually unable to discern the difference between the probability of events at 1:100, and 1:1000, or even 1:10,000 chances. The same phenomenon occurs at the other end of the spectrum, for events of very high likelihood, but less than certainty. We are simply unable to sense or quantify the difference.
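The poker figure can be checked by counting hands. This short sketch assumes a standard 52-card deck and a five-card deal with no draws or wild cards; the function name is illustrative.

```python
from math import comb

def full_house_probability():
    """Probability of being dealt a full house in five cards from 52."""
    # Pick the rank of the three-of-a-kind and 3 of its 4 suits,
    # then a different rank for the pair and 2 of its 4 suits.
    full_houses = 13 * comb(4, 3) * 12 * comb(4, 2)
    all_hands = comb(52, 5)
    return full_houses / all_hands

p_full_house = full_house_probability()
print(f"{p_full_house:.5f}")
```

That is 3,744 full houses among 2,598,960 possible hands, or roughly 0.144% (about 1 in 694).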
An example of this problem is shown in the risk assessments for the Space Shuttle program. Richard Feynman wrote a stinging critique of the NASA risk estimates following the Challenger disaster.
(Engineers at Rocketdyne, the manufacturer, estimate the total probability [of catastrophic failure] as 1/10,000. Engineers at Marshall estimate it as 1/300, while NASA management, to whom these engineers report, claims it is 1/100,000. An independent engineer consulting for NASA thought 1 or 2 per 100 a reasonable estimate.)
The actual rate of failure was 2 disasters out of 135 missions, or about 1/67.
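One way to see how far apart these estimates are is to ask how likely the observed record would be under each of them. The sketch below assumes every flight is independent with the same failure probability, which is a strong simplification; the labels and the choice of 1 per 100 for the independent engineer are assumptions made for illustration.

```python
from math import comb

def prob_at_least(k, n, p):
    """Binomial probability of at least k failures in n independent flights."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))

# Per-flight failure estimates quoted above (illustrative labels).
failure_estimates = {
    "NASA management": 1 / 100_000,
    "Rocketdyne": 1 / 10_000,
    "Marshall": 1 / 300,
    "Independent engineer (1 per 100)": 1 / 100,
}

# Chance of seeing 2 or more losses in 135 flights under each estimate.
chances = {who: prob_at_least(2, 135, p) for who, p in failure_estimates.items()}
for who, chance in chances.items():
    print(f"{who}: {chance:.2g}")
```

Under the management figure, two losses in 135 flights would be close to a one-in-a-million outcome; under the independent engineer's figure it is unremarkable.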
How is it possible that such wildly differing estimates existed, regarding the safety of such an important project?
Part of the problem is sampling. For rare events, we must make a large number of observations to detect and quantify a possibility. If we walk across a lake on thin ice ten times and do not fall through the ice, we can conclude that the chance of falling through the ice is probably less than 1:10. It does not mean that walking on thin ice is safe, or that we can safely cross the lake 100 times. For rare events, we simply cannot gain enough experience to adequately grasp the true probability. This is particularly problematic for risky and dangerous events.
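The thin-ice intuition can be made a little more precise. As a sketch, assuming each crossing is independent with the same risk, the function below finds the largest per-crossing risk that is still consistent (at a chosen confidence level) with having seen no failures at all. The function name and the 95% level are assumptions.

```python
def upper_bound_no_failures(n_trials, confidence=0.95):
    """Upper bound on per-trial risk after n_trials with zero failures.

    Solves (1 - p) ** n_trials = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n_trials)

for n in (10, 100, 1000):
    bound = upper_bound_no_failures(n)
    print(f"{n:5d} safe crossings: risk could still be as high as {bound:.4f}")
```

Ten safe crossings cannot rule out a risk of roughly one in four at this confidence level, and it takes on the order of 3/p uneventful trials to bound a risk near p (the so-called rule of three). For a 1:10,000 risk, that is about thirty thousand observations.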
There are many other facets to the problem, including self-interest of management, and various other sources of bias which Feynman discusses in his report.
But the simple lesson that I take away is that people cannot intuitively grasp risk in the range of low-probability events.
------------
Excuse me, I will finish this post soon!
Black Swan Events
Author Nassim Nicholas Taleb introduced the term Black Swan into the modern vocabulary of risk in his book "Fooled by Randomness", first published in 2001.