Companion Volume to Antifragility (forthcoming) and The Black Swan (2007-2010, 2nd Ed.)
Technical and Academic Papers and Derivations
Nassim Nicholas Taleb
NYU-Poly
2012

Table of Contents

Taleb, N. N. (f.), A Map and Simple Heuristic to Detect Fragility, Antifragility, and Model Error, under revision, Quantitative Finance
Taleb, N. N., Goldstein, D. G., and Spitznagel, M. (2009), "The Six Mistakes Executives Make in Risk Management", Harvard Business Review, October 2009
Taleb, N. N. (2011), The Future Has Thicker Tails than the Past: Model Error as Branching Counterfactuals
Taleb, N. N. (2004), Bleed or Blowup: What Does Empirical Psychology Tell Us About the Preference For Negative Skewness?, Journal of Behavioral Finance, 5
Taleb, N. N. (2011), Why did the Crisis of 2008 Happen?, withdrawn, New Political Economy
Taleb, N. N. and Pilpel, A. (2010), "The Prediction of Action", in (eds. T. O'Connor & C. Sandis) A Companion to the Philosophy of Action (Wiley-Blackwell)
Taleb, N. N. (2008), Errors, Robustness and the Fourth Quadrant, International Journal of Forecasting
Mandelbrot, B. and Taleb, N. N., Large But Finite Samples And Preasymptotics
Taleb, N. N. and Pilpel, A. (2004), On the Unfortunate Problem of the Nonobservability of the Probability Distribution
Taleb, N. N. (2010), Convexity, Robustness and Model Error inside the "Black Swan Domain"
Taleb, N. N. and Blyth, M. (2011), The Black Swan of Cairo, Foreign Affairs, 90, 3
Taleb, N. N. and Tapiero, C. (2010), The Risk Externalities of Too Big To Fail, Physica A
Makridakis, S. and Taleb, N. N. (2009), "Decision making and planning under low levels of predictability", International Journal of Forecasting
Taleb, N. N. (2008), Finiteness of Variance Is Irrelevant in the Practice of Quantitative Finance, Complexity, 14(2)
Taleb, N. N. and Martin, G. (2012), The Illusion of Thin-Tails Under Aggregation (a Reply to Jack Treynor)
Douady, R. and Taleb, N. N. (2011), Statistical Undecidability, Preprint
Taleb, N. N., Platonic Convergence and Central Limit Theorem (Note)
Taleb, N. N., The fundamental problem of the 0th moment and the irrelevance of "naked probability" (Note)
Taleb, N. N., Derivatives, Fractal Option Pricing (Note)
Goldstein, D. G. and Taleb, N. N. (2007), We Don't Quite Know What We Are Talking About When We Talk About Volatility, Journal of Portfolio Management, Summer 2007
Goldstein, D. G. and Taleb, N. N. (2010), Statistical Intuitions and Domains: The Telescope Test
Taleb, N. N. (2007), Black Swans and the Domains of Statistics, The American Statistician, August 2007, Vol. 61, No. 3
Derman, E. and Taleb, N. N. (2005), The Illusions of Dynamic Replication, Quantitative Finance, vol. 5, 4
Haug, E. G. and Taleb, N. N. (2010), Why Option Traders Have Never Used the Formula known as Black-Scholes-Merton Equation, forthcoming, Journal of Economic Behavior and Organizations

© Copyright 2008 by N. N. Taleb.

A Map and Simple Heuristic to Detect Fragility, Antifragility, and Model Error
N. N. Taleb
New York University - Polytechnic Institute
First Version, June 4, 2011

This paper is to be presented at: JP Morgan, New York, June 16, 2011; CFM, Paris, June 17, 2011; GAIM Conference, Monaco, June 21, 2011; Max Planck Institute, Berlin, Summer Institute on Bounded Rationality 2011 - Foundations of an Interdisciplinary Decision Theory - June 23, 2011; Eighth International Conference on Complex Systems, Boston, July 1, 2011.

Abstract

The main results are 1) definition of fragility, antifragility and model error (and biases) from missed nonlinearities and 2) detection of these using a single “fast-and-frugal”, model-free, probability-free heuristic.
We provide an expression of fragility and antifragility as negative or positive sensitivity to convexity effects, i.e., dispersion and volatility (a variant of negative or positive “vega”) across domains, and show similarities to model errors coming from missing hidden convexities, with model errors treated as left- or right-skewed random variables. Broadening and formalizing the methods of Dynamic Hedging, Taleb (1997), we present the effect of nonlinear transformation (convex, concave, mixed) of a random variable, with applications ranging from exposure to error and tail events to the fragility of porcelain cups, deficits and large firms, and the antifragility of trial-and-error and evolution. The heuristic lends itself to immediate implementation, and uncovers hidden risks related to company size, forecasting problems, and bank tail exposures (it explains the forecasting biases). While simple, it vastly outperforms stress testing and other methods such as Value-at-Risk.

Introduction: Main practical result of this paper: a risk heuristic that "works" in detecting fragility even if we use the wrong model/pricing method/probability distribution. The main idea is that a wrong ruler will not measure the height of a child; but it can certainly tell you if he is growing. Since (as we will see) risks in the tails map to nonlinearities (concavity of exposure), second order effects reveal fragility, particularly in the tails (revealed through perturbation), where they map to large tail exposures. Further, the misspecification in using thin-tailed distributions (say the Gaussian) shows immediately through perturbations of the standard deviation, when it appears to be unstable. Further, here are results that show how a fat-tailed (powerlaw tail) probability distribution can be expressed by simple perturbation and mixing of the Gaussian.
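As a rough illustration of the last point, the sketch below shows one standard way in which perturbing and mixing Gaussians thickens the tails. The construction is an assumption chosen for illustration, not necessarily the one used later in the paper: a 50/50 mixture of two zero-mean Gaussians with variances sigma^2 (1 + a) and sigma^2 (1 - a). The mixture keeps the variance sigma^2, but its kurtosis rises to 3 (1 + a^2). The name `mixture_kurtosis` and the parameter `a` are hypothetical.

```python
def mixture_kurtosis(a, sigma=1.0):
    """Kurtosis of a 50/50 mixture of N(0, sigma^2 (1+a)) and N(0, sigma^2 (1-a)).

    Assumption for illustration: 0 <= a < 1 is the size of the perturbation
    applied to the variance.
    """
    var_hi = sigma ** 2 * (1.0 + a)
    var_lo = sigma ** 2 * (1.0 - a)
    # Both components have mean zero, so mixture moments are plain averages.
    second_moment = 0.5 * (var_hi + var_lo)
    # The fourth moment of a zero-mean Gaussian with variance v is 3 v^2.
    fourth_moment = 0.5 * (3.0 * var_hi ** 2 + 3.0 * var_lo ** 2)
    return fourth_moment / second_moment ** 2


for a in (0.0, 0.25, 0.5, 0.75):
    # a = 0 recovers the Gaussian kurtosis of 3; larger a gives fatter tails.
    print(a, mixture_kurtosis(a))
```

The average variance is unchanged by the perturbation, so the extra kurtosis comes entirely from the uncertainty about the dispersion parameter.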
Why the same heuristic (detection of convexity effects) can measure both fragility and model error:

Where F is a valuation “model”,

$$\text{Model (or Valuation) Error} = E_1 + E_2 + E_3$$

where we assume that the three types of errors are orthogonal, hence additive.

$E_1$ = linear error, the “slope”, an error about the first derivative of the model with respect to a variable (equivalent of the delta for an option), say $a = \frac{F(x + \Delta x) - F(x)}{\Delta x}$. The model identifies the parameter a, but has a wrong value for such parameter in, say, a regression. One can safely believe that modelers cannot easily make such an error (the results of the mistracking will be immediately visible).

$E_2$ = missing a stochastic variable determining F. We unfortunately do not deal with that in this paper, but have evidence (Makridakis et al, 1982; Makridakis and Hibon, 2000) that, if anything, models, by overly in-sample fitting, include too many variables, not too few.

$E_3$ (Procrustean Bed) = missing convexity effects, the “hidden gamma”, that is, a) missing the stochastic character of a variable deemed deterministic (and fixed) and b) F is convex or concave with respect to such variable. The resulting bias causes misestimation of F, with undervaluation or overvaluation that maps to the nonlinearity. Such error being rare (and compounded by those rare large deviations), it is likely to be missed.

Example of $E_3$. A government estimates unemployment for the next three years as averaging 9%; it uses its econometric models to issue a forecast balance B of 200 billion deficit in the local currency. But it misses (like almost everything in economics) that unemployment is a stochastic variable. Unemployment over three-year periods has fluctuated by 1% on average.
We can calculate the effect of the error with the following:

• Unemployment at 8%, Balance B(8%) = -75 bn (improvement of 125bn)
• Unemployment at 9%, Balance B(9%) = -200 bn
• Unemployment at 10%, Balance B(10%) = -550 bn (worsening of 350bn)

So $E_3$, the convexity bias from underestimation of the deficit, is -112.5bn, since $\frac{B(8\%) + B(10\%)}{2} = -312.5$ bn, against the point estimate $B(9\%) = -200$ bn.

Further, look at the probability distribution caused by the missed variable (assuming, to simplify, that unemployment is Gaussian with a mean deviation of 1%).

[Figure 1: histogram of the deficit; annotations: missed convexity effect, missed fragility (unseen left tail); horizontal axis: Deficit]

Figure 1 CONVEXITY EFFECTS ALLOW THE DETECTION OF BOTH MODEL BIAS AND FRAGILITY. Illustration of the example; histogram from Monte Carlo simulation of government deficit as a left-tailed random variable, simply as a result of randomizing unemployment, of which it is a convex function. The method of point estimate would assume a Dirac stick at -200, thus underestimating both the expected deficit (-312) and the skewness (i.e., fragility) of it.

Most significant (and preventable) model errors, as we will see, arise from $E_3$. Now this paper will focus on a heuristic that can detect both fragility and $E_3$, since our definition of fragility is grounded in nonlinearities. Further, the “fat-tailedness” of probability distributions is a straight application of $E_3$, the missing of a convexity effect.

Nonlinearity and Fragility: Simply, for the fragile, shocks bring higher and higher harm as their intensity increases (up to the point of breaking). Another example: for a collision, forty miles per hour causes more than four times the harm of ten miles per hour. Jumping from a level 30 feet high is more harmful than jumping 10 times from 3 feet.
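The arithmetic of the deficit example above can be checked in a few lines. This is only a sketch of the three-point calculation given in the text: the balance figures are the ones quoted in the example, while the names `balance` and `convexity_bias` are introduced here for illustration.

```python
# Forecast balance B (in bn, local currency) at three unemployment levels,
# as quoted in the deficit example.
balance = {0.08: -75.0, 0.09: -200.0, 0.10: -550.0}


def convexity_bias(b_low, b_mid, b_high):
    """Average of the two perturbed outcomes minus the point estimate.

    A value of zero means the payoff is linear over the perturbation range;
    a negative value means the point estimate is too optimistic.
    """
    return (b_low + b_high) / 2.0 - b_mid


avg_perturbed = (balance[0.08] + balance[0.10]) / 2.0
bias = convexity_bias(balance[0.08], balance[0.09], balance[0.10])

print(avg_perturbed)  # -312.5
print(bias)           # -112.5
```

The same second-difference test applies to the collision and jumping examples: if the harm at the average shock differs from the average harm over perturbed shocks, the exposure is nonlinear and the point estimate is biased.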
Every payoff one can think of in nature is nonlinear, hence subjected to some tail payoff and some asymmetry in its distribution. And every model has some kind of Procrustean bed-style sucker problem coming with it, some error from missing the stochasticity of some variable and the nonlinear character of the payoff.

The object here is to detect fragility (and, by the same process, to detect its opposite, antifragility, the ability to gain from disorder). The same method that detects fragility can detect convexity biases, or model error stemming from missing the stochasticity of a variable, as well as sensitivity to the use of the wrong probability distribution.

Our steps are as follows:
a. We define fragility, robustness and antifragility.
b. We present the problem of measuring tail risks and show the presence of severe biases attending the estimation of small probability and its nonlinearity (convexity) to parametric (and other) perturbations.
c. We express the concept of model fragility in terms of left tail exposure, and show correspondence to the concavity of the payoff from a random variable.
d. Finally, we present our simple heuristic to detect the possibility of both fragility and model error across a broad range of probabilistic estimations.

The central Table 1 introduces the exhaustive map of possible outcomes, with 4 mutually exclusive categories of payoffs. The end product is f(x), which can be reduced to a scalar, and is the central variable of concern. We consider both the probability distribution of f(x), the payoff function, a "derivative" function of x, x being a "primitive" random variable, and the functional properties (concave, convex, linear).

We present a series of arguments that can be proved (owing to the format of the discussion, some idiot-savant “quants” might not recognize the proof, so try a bit harder to adapt to the language).

Note about the lack of symmetry between fragility and antifragility.
By shrinking the left tail (in the presence of unbounded positive payoffs) you cause antifragility; but by increasing the right tail you don’t reduce fragility.

Definition and Map of Fragility, Robustness, and Antifragility

Table 1 - Introduces the Exhaustive Taxonomy of all Possible Payoffs y = f(x)

Type | Condition | Left Tail (loss domain) | Right Tail (gains domain) | Nonlinear Payoff Function y = f(x) "derivative", where x is a random variable | Derivatives Equivalent (Taleb, 1997) | Effect of Jensen's Inequality on Missed Nonlinearities | Effect of Fat Tails in Distribution of primitive x
Type 1 | Fragile (type 1) | Fat | Thin | Concave | Short gamma | Lower expectation | Worsens
Type 2 | Fragile (type 2) | Fat (regular or absorbing barrier) | Fat | Mixed: concave left, convex right (fence) | Long up-gamma, short down-gamma | Invariant, or lower expectation in case of absorbing barrier | Worsens if absorbing barrier, neutral otherwise
Type 3 | Robust | Thin | Thin | Mixed: convex left, concave right (digital, sigmoid) | Short down-gamma, long up-gamma | Invariant | Invariant
Type 4 | Antifragile | Thin | Fat (thicker than left) | Convex | Long gamma | Raises expectation | Improves (particularly in Type 4b, where trigger barriers cause ratchet-like properties)

Definition of Fragility

Fragility ⇔ Left Tail ⇒ Concavity (Table 1)

Fragility is defined as the sensitivity of left tail shortfall (not conditioned by probability) to an increase in disturbance over a certain threshold K.

Examples
a. Example: a porcelain coffee cup subjected to random daily stressors from use.
b. Example: tail distribution in the function of the arrival time of an aircraft.
c. Example: hidden risks of famine to a population subjected to monoculture.
d. Example: hidden tail exposures to budget deficits’ nonlinearities to unemployment.
e. Example: hidden tail exposure from dependence on a source of energy, etc. (“squeezability argument”).

[Figures 2, 3 and 4: three panels; vertical axis: Changes in Value; horizontal axis: time]

Figure 2 TYPE 1, Fragile variations through time: the horizontal axis shows time. This can apply to anything: a coffee cup, a health indicator, changes in wealth, your happiness, etc. We can see small (or no) benefits most of the time and occasional large adverse outcomes. Uncertainty can hit in a rather hard way. Notice that the loss can occur at any time and exceed the previous cumulative gains.

Figure 3 TYPE 3, the Just Robust (but not antifragile): it experiences small or no variations through time, never large ones.

Figure 4 TYPE 4, the antifragile system: uncertainty benefits a lot more than it hurts, the exact opposite of Figure 2.

Left Tail and Measure of Fragility

In short, fragility is negative exposure to left uncertainty as measured by some coefficient of dispersion (STD, MAD, etc.). We will define it first and then link it to convexity.

Definition 1a (standard and monomodal distributions): where y and z are random variables, exposure to y is said to be more “fragile” than exposure to z in tail K if, for a given K in the negative (undesirable) domain,

$$V(y, f, K, \Delta s) > V(z, g, K, \Delta s) \tag{1}$$

where f and g are the respective monomodal probability distributions for y and z,

$$V(y, f, K, \Delta s) \equiv \zeta\left(y, f, K, s + \frac{\Delta s}{2}\right) - \zeta\left(y, f, K, s - \frac{\Delta s}{2}\right) \tag{2}$$

$$\zeta(y, f, K, s) \equiv \int_{-\infty}^{K} y \, f(y) \, dy \tag{3}$$

s is a dispersion parameter (“volatility”) used by the probability distributions f and g, and $\Delta s$ is a set variation, a finite perturbation. The discussion in the next section on convex-concave situations shows why we rely on a finite perturbation $\Delta s$ instead of the infinitesimal mathematical derivative.
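To make equations (2) and (3) concrete, here is a minimal sketch for one special case, chosen only because it has a closed form: y is assumed Gaussian with mean 0 and standard deviation s, so that the unconditional shortfall is $\zeta = -s\,\varphi(K/s)$, with $\varphi$ the standard normal density. The Gaussian assumption and the names `zeta_gaussian` and `fragility_V` are illustrative and not part of the definition. With the integral taken literally, $\zeta$ is negative for K below the mean and V becomes more negative as dispersion rises, so the comparison in (1) is read here on the size of the loss sensitivity.

```python
import math


def zeta_gaussian(K, s):
    """Unconditional shortfall, eq. (3), for y ~ N(0, s^2) (assumed for illustration).

    The integral of y * f(y) from -infinity to K equals -s * phi(K / s),
    where phi is the standard normal density.
    """
    phi = math.exp(-0.5 * (K / s) ** 2) / math.sqrt(2.0 * math.pi)
    return -s * phi


def fragility_V(K, s, ds):
    """Finite-perturbation sensitivity of the shortfall to dispersion, eq. (2)."""
    return zeta_gaussian(K, s + ds / 2.0) - zeta_gaussian(K, s - ds / 2.0)


s, ds = 1.0, 0.1
for K in (-1.0, -2.0, -4.0):
    # The measure is K-specific: the same exposure responds differently
    # to the same perturbation depending on the threshold.
    print(K, zeta_gaussian(K, s), fragility_V(K, s, ds))
```

For other distributions the closed form would be replaced by numerical integration of y f(y) below K; the finite difference in `fragility_V` stays the same.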
Sources of Fragility: y is a function, that is, a derivative of some primitive x, but let us not concern ourselves with x for now (we will look at it when we analyze convex transformations). For now we can say that the distribution f is limited to the source of variation x, and that fragilities from other sources are not taken into account here.

For instance, s can be the standard deviation or mean deviation for finite-moment distributions, or the tail exponent for a powerlaw-tailed one (tail exponents subsume mean deviations and have an inverse relationship to the deviation parameter for tail exponent > 1). For the rare cases of ζ not existing, say when the outcome’s distribution is Cauchy, there is no need to go further, as the exposure can be deemed infinitely, or unconditionally, fragile, regardless of the properties of the right tail.

Fragility is K-specific. We are only concerned with adverse events below a certain prespecified level, the breaking point. Exposure A can be more fragile than exposure B for K = 0, and much less fragile if K is, say, 4 mean deviations below 0.

Option traders would recognize fragility as negative "vega", or negative exposure to volatility. The use of a finite Δ is to avoid situations, as we will see, of vega-neutrality coupled with a short left tail.

Applying the measure to the examples in Figures 2 through 4: Figure 2 has negative sensitivity to dispersion, Figure 3 is neutral, Figure 4 gains from volatility.

Deal with payoff functions: y is a payoff function of another random variable, which might itself be symmetric and thin-tailed, but of concern is the distribution of y, which requires transformation. For instance, a call price is a function of another random variable, the underlying security (which itself may be a function of another r.v.), but of concern is the distribution of the call.
Effect of using the wrong distribution f: Comparing $V(y, f, K, \Delta s)$ and the alternative $V(y, f^*, K, \Delta s)$, where $f^*$ is the “true” distribution, the measure of fragility is acceptable (“robust”) under the following conditions:
a. that both distributions are monomodal (the condition of using V as a measure of fragility), or
b. that the difference between the two, that is, the bias, does not reverse in sign in the tails, or
c. that the higher differences $\Delta_n \neq 0$ do not carry opposite signs across orders n:

$$\operatorname{sgn}(\Delta_n) = \operatorname{sgn}(\Delta_{n-1}) \quad \text{for all } n$$

where

$$\Delta_1 = \left\{ V\left(y, f, K, \Delta s - \frac{\Delta s}{2}\right) - V\left(y, f, K, \Delta s + \frac{\Delta s}{2}\right) \right\} - \left\{ V\left(y, f^*, K, \Delta s - \frac{\Delta s}{2}\right) - V\left(y, f^*, K, \Delta s + \frac{\Delta s}{2}\right) \right\}$$

Unconditionality of the measure of shortfall ζ: Many, when presenting shortfall, deal with the conditional shortfall

$$\frac{\int_{-\infty}^{K} y \, f(y) \, dy}{\int_{-\infty}^{K} f(y) \, dy};$$

while this measure might be useful in some circumstances, its sensitivity is not at all indicative of fragility in the sense used in this discussion. The unconditional tail expectation ζ, $\int_{-\infty}^{K} y \, f(y) \, dy$, is more indicative of exposure to fragility. It is also preferred to the raw probability of falling below K, $\int_{-\infty}^{K} f(y) \, dy$, as the latter does not include the consequences.
For instance, two such measures $\int_{-\infty}^{K} f(y) \, dy$ and $\int_{-\infty}^{K} g(y) \, dy$ can be equal over broad values of K; but the expectation $\int_{-\infty}^{K} y \, f(y) \, dy$ can be much more consequential, as the cost of the break can be more severe, and we are interested in its “vega” equivalent.

Exception: the case of non-monomodal and truncated distributions

Another definition is necessary for non-unimodal distributions. Measures of dispersion (and higher order ones) do not work well in the presence of polarized mixture distributions. Nor do they do well with payoffs subjected to absorbing barriers.

Definition 1b (for absorbing barriers, hence bimodal and multimodal distributions in probability space): where y and z are random variables, exposure to y is said to be more “fragile” than exposure to z below “tail” K if, for a given K in the negative (undesirable) domain, the cost of hitting a barrier L below K is higher for y than for z.

So Definition 1b can be made similar to Definition 1a, except that we would have to perturb $V(y, f, K, \Delta p)$, where p is the parameter setting the distance of the lower mode away from the mean (for mixed distributions, the lower mean; for barriers, the cost of hitting the barrier).

Consider the stochastic process $S_t$, $S_0 > L$, with absorbing barrier L from below, so with probability one it should break, but with some distribution of stopping time. Further, there is a “cost” attached to breaking. For a coffee cup, a silk tie, a computer, a mirror, a corporation, the value can be assumed to go close to 0 (0 plus some minor residual). The idea also applies to debt and squeezes (death, famine, bankruptcies, etc. are absorbing barriers).
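The absorbing-barrier process just described can be sketched with a short simulation. Everything specific here is an assumption made for illustration: the walk uses Gaussian steps, the step size and horizon are arbitrary, the values S_0 = 100 and L = 80 are borrowed from the Figure 5 caption, and the value after breaking is set to 0 with no residual. The function name `simulate_down_and_out` is hypothetical.

```python
import random


def simulate_down_and_out(s0, barrier, n_steps, step_std, rng):
    """Random walk with Gaussian steps and an absorbing barrier from below.

    Once the level touches the barrier the value collapses to 0 and stays
    there. Returns the path and the stopping time (None if never absorbed).
    """
    path = [s0]
    level = s0
    stopped_at = None
    for t in range(1, n_steps + 1):
        if stopped_at is None:
            level += rng.gauss(0.0, step_std)
            if level <= barrier:
                stopped_at = t
                level = 0.0  # cost of breaking: value goes to 0, no residual
        path.append(level)
    return path, stopped_at


rng = random.Random(0)  # fixed seed so the run is repeatable
results = [simulate_down_and_out(100.0, 80.0, 100, 2.0, rng) for _ in range(2000)]
stopping_times = [t for _, t in results if t is not None]
broken_fraction = len(stopping_times) / len(results)
print(f"fraction of paths absorbed within 100 steps: {broken_fraction:.2f}")
```

Collecting the terminal values of such paths gives a two-peaked distribution, one peak at the collapsed value and one around the surviving levels, which is why dispersion-based measures are replaced by a perturbation of the barrier cost in Definition 1b.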
Knock In: Antifragility would be a trigger-barrier causing ratchet-like properties (biological systems with irreversibilities), for the process $S_t$, $S_0 < H$, with an inverse “cost”, like a benefit upon hitting the barrier.

[Figure 5: simulated path of the process against time]

Figure 5: L = 80, $S_0$ = 100. Absorbing barrier (down and out) causes a special brand of left tails, when the barrier causes a collapse to a certain value (here, 0, with no residual).

The next graphs show the results of hitting a barrier in probability space, with standard bimodal, double-peaked distributions.

[Figures 6 and 7: two panels; vertical axis: Pr; horizontal axis: Value of Cup]

Figures 6 and 7: Bimodal distributions (two Gaussians with different means). The comparative fragility of two coffee cups, with their states as two Diracs (breaks or doesn't break). Each breaks at a given level. They both have the same probability of breaking, but a different ζ. The distribution on the left, although patently more fragile, does not respond to changes in STD. They are both invariant to changes in the dispersion parameter, yet the one on the left is more fragile.

Nor does kurtosis seem to matter. As Figure 6 shows, the fragile is not necessarily higher on the measure of kurtosis.

Figure 8: A mixed distribution with two Gaussians of different means and STD (actually, the stick on the right approaches a Dirac). The most likely position is either in the “stick” or in the “breaking” section to the right, as there is no mass in between. Although it appears extremely skewed (hence fragile), the kurtosis is lower than that of a Gaussian. The distribution in Figure 7 is vastly more common than accepted (bond returns, loans, stock mergers, etc.).

So the presence of a right tail does not matter: Jensen’s inequality will lower expectations.
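The point about the two coffee cups can be checked on a stylized version of Figures 6 and 7, treating each cup as a two-state payoff (breaks or doesn't break). The break probability of 0.3 and the broken values of -20 and -5 are assumptions picked for illustration, not values read off the figures, and `two_point_stats` is a hypothetical helper.

```python
def two_point_stats(p_break, broken_value, intact_value=0.0):
    """Mean, kurtosis and unconditional shortfall zeta of a two-state payoff."""
    p_ok = 1.0 - p_break
    mean = p_break * broken_value + p_ok * intact_value
    var = p_break * (broken_value - mean) ** 2 + p_ok * (intact_value - mean) ** 2
    m4 = p_break * (broken_value - mean) ** 4 + p_ok * (intact_value - mean) ** 4
    kurtosis = m4 / var ** 2
    # For any threshold K between the two states, only the broken state lies
    # below K, so the shortfall is its value weighted by its probability.
    zeta = p_break * broken_value
    return mean, kurtosis, zeta


# Same probability of breaking, different cost of breaking (assumed values).
_, kurt_a, zeta_a = two_point_stats(0.3, -20.0)
_, kurt_b, zeta_b = two_point_stats(0.3, -5.0)

print("kurtosis:", kurt_a, kurt_b)  # identical up to rounding, and below 3
print("zeta:", zeta_a, zeta_b)      # the first cup has the larger shortfall
```

Kurtosis depends only on the break probability in this setup, so it cannot separate the two cups, while ζ does. For a two-state payoff the kurtosis is below the Gaussian value of 3 only when the break probability is not too close to 0 or 1.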
Next, because f and g can be misspecified probability distributions (i.e., far from the “true” f and g):

Adding Model Error and Metadistributions: Model error should be integrated in the distribution as a stochasticization of parameters. We will see that f and g should subsume the distribution of all possible factors affecting the final outcome (including the metadistribution of each). The so-called "perturbation" is not necessarily a change in the parameter so much as a means to verify whether f and g capture the full shape of the final probability distribution. Note that something with a bounded payoff, with a function that organically truncates the left tail at K, will be impervious to all perturbations affecting the probability distribution below K.