Tuesday 12 November 2013

Lessons From the Past

I must confess that until a short time ago I didn't know that the extinction of the dinosaurs was neither the only nor the most severe of the extinction events our Earth has experienced. Quoting from Wikipedia,
"the Cretaceous–Paleogene (K–Pg) extinction event [..] was a mass extinction of some three-quarters of plant and animal species on Earth—including all non-avian dinosaurs—that occurred over a geologically short period of time 66 million years ago".
This certainly sounds quite bad, but in fact not as bad as the so-called Great Dying,
the "Permian–Triassic (P–Tr) extinction event [..] that occurred 252.28 million years ago. [..] It is the Earth's most severe known extinction event, with up to 96% of all marine species and 70% of terrestrial vertebrate species becoming extinct. It is the only known mass extinction of insects. Some 57% of all families and 83% of all genera became extinct."
Thus some 252 million years ago a chain of events produced a catastrophe that affected the terrestrial ecosystem so deeply that it is conjectured "it took some 10 million years for Earth to recover" from it. Nevertheless, the Earth ultimately did recover, which led to so big a change in natural history that scientists had to clearly separate what came before from what followed: the Paleozoic ("Old Life") from the Mesozoic ("Middle Life"). Among the many important questions that arise when considering so catastrophic an event, some that I feel are particularly relevant here are:
  • Q1: Were there any "common reasons" behind the P–Tr extinction event? In other words, were there "common triggers" causing such a widespread correlated failure?
  • Q2: What was the key ingredient (the key defensive strategy, that is) that made it possible for the Earth to survive in spite of so harsh a blow?
Now in order to attempt an answer to the above questions I recall the following facts:
  • F1: "Mineralized skeletons confer protection against predators" [Knoll]
  • F2: "Skeleton formation requires more than the ability to precipitate minerals; precipitation must be carried out in a controlled fashion in specific biological environments" [Knoll]
  • F3: "The extinction primarily affected organisms with calcium carbonate skeletons, especially those reliant on ambient CO2 levels to produce their skeletons" [Wikipedia].
In other words, one of nature's many independent evolutionary paths was particularly successful (F1) and thus became widespread; regrettably, the adoption of the solution implies a strong dependence on predefined and stable environmental conditions (F2); and, finally, a correlation exists between the class of species that adopted the solution and that of the species that were affected most by the P–Tr extinction event (F3).

If we read the above with the lingo of computer dependability and resilience we could say that:

  • A given solution became widespread (for instance a memory technology, a software library, a programming language, an operating system, or a search engine).
  • The solution introduced a weakness: for instance, a dependence on a hidden assumption, or a "bug" depending on certain subtle and very rare environmental conditions.
  • This translated into a common trigger, a single-point-of-multiple-failures: one or a few events "turned on" the weakness and hit hard on all the systems that made use of the solution.
A good example of this phenomenon is probably given by the so-called Millennium Bug.
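The mechanism behind the Millennium Bug can be sketched in a few lines. This is only an illustration under stated assumptions: the 6-digit YYMMDD layout, the hard-wired 1900 base, and the helper name parse_yymmdd are all hypothetical choices made for the example, not a description of any specific legacy system.

```python
from datetime import date


def parse_yymmdd(text, century=1900):
    """Decode a 6-digit YYMMDD date (hypothetical helper).

    The hidden assumption is the default: every two-digit year is taken
    to belong to `century`.
    """
    yy, mm, dd = int(text[0:2]), int(text[2:4]), int(text[4:6])
    return date(century + yy, mm, dd)


opened = parse_yymmdd("991231")  # 31 December 1999
closed = parse_yymmdd("000101")  # meant as 1 January 2000, decoded as 1900

# The elapsed time should be one day; the hidden assumption makes it negative.
elapsed_days = (closed - opened).days

# 1900 was not a leap year while 2000 is, so 29 February 2000 cannot even
# be represented under the 1900 assumption.
try:
    parse_yymmdd("000229")
    leap_day_accepted = True
except ValueError:
    leap_day_accepted = False
```

Every program sharing the same default fails on the same inputs at the same time, which is what makes the weakness a common trigger.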

What can we conclude from the above facts and analogies? That solutions that work well in the "common case" are those that become more widespread. Regrettably this decreases disparity, namely inter-species diversity. Species that externally appear considerably different from each other in fact share a common trait: a common design template. This means that whenever the "common case" is replaced by the very rare and very bad "Black Swan", a large portion of the ecosystem is jeopardized. In fact the rarer the exceptional condition, the more widespread the template becomes and the larger the share of species that will be affected. This provides some elements towards an answer to question Q1: yes, there were common triggers, which ultimately produced the P–Tr extinction event because the diffusion of the same "recipes" had paved the way to large amounts of correlated failures.

On the other hand, the Earth did survive the Great Dying and other extinction events. Why? My guess for an answer to Q2 is that Nature introduces systemic thresholds that make sure that disparity never falls below some minimum. The key ingredient to guarantee this is diversity: it is not by chance that mutation is an intrinsic mechanism of genetic evolution. Mutation and possibly other mechanisms make sure that, at any point in time, not all of the species share the same design templates. In turn this guarantees that, at any point in time, not all the species share the same fate.
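The link between how widespread a template is and how large the correlated failure becomes can be made concrete with a toy calculation. The populations, the template labels, and the function share_lost below are invented for illustration; the only assumption is that a rare event strikes every system built on one given template.

```python
from collections import Counter


def share_lost(population, struck_template):
    """Fraction of systems lost when every user of one template fails."""
    counts = Counter(population)
    return counts[struck_template] / len(population)


# Hypothetical ecosystems: each entry is the design template of one system.
monoculture = ["A"] * 9 + ["B"]           # one template dominates
diverse = ["A", "B", "C", "D", "E"] * 2   # five templates, evenly spread

loss_monoculture = share_lost(monoculture, "A")  # 0.9
loss_diverse = share_lost(diverse, "A")          # 0.2
```

The same event that removes nine systems out of ten in the first ecosystem removes only two out of ten in the second.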

Interestingly enough, similar solutions are also sought when designing computer systems. In order to decrease the chance of correlated failures, multiple diverse replicas are executed in parallel or one after the other. This is called design diversity, and it is often based on design templates such as N-version programming or Recovery Blocks.
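As a minimal sketch of the two templates just named, consider three independently written integer square roots, one of which carries a design fault. The function names, the deliberately faulty version, and the acceptance test are all assumptions made for this example; real N-version and Recovery Block schemes involve far more (independent teams, state restoration, and so on).

```python
import math
from collections import Counter


def isqrt_library(n):
    return math.isqrt(n)


def isqrt_linear(n):
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r


def isqrt_rounding(n):
    # Faulty version: rounds instead of truncating (wrong for n = 15).
    return round(n ** 0.5)


def n_version_execute(versions, x):
    """N-version programming: run all versions, return the majority result."""
    results = []
    for version in versions:
        try:
            results.append(version(x))
        except Exception:
            pass  # a crashing version simply casts no vote
    if not results:
        raise RuntimeError("all versions failed")
    value, votes = Counter(results).most_common(1)[0]
    if votes * 2 <= len(versions):
        raise RuntimeError("no majority among versions")
    return value


def recovery_block(alternates, acceptance_test, x):
    """Recovery block: try alternates in order until one passes the test."""
    for alternate in alternates:
        try:
            result = alternate(x)
        except Exception:
            continue
        if acceptance_test(x, result):
            return result
    raise RuntimeError("all alternates rejected")


def is_integer_sqrt(n, r):
    return r * r <= n < (r + 1) * (r + 1)


voted = n_version_execute([isqrt_library, isqrt_linear, isqrt_rounding], 15)
recovered = recovery_block([isqrt_rounding, isqrt_linear], is_integer_sqrt, 15)
```

In both cases the faulty version is masked: it is outvoted in the first scheme and rejected by the acceptance test in the second. Neither scheme helps if all versions share the same fault, which is exactly the correlated failure discussed above.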

(It is worth remarking how the adoption of the design diversity templates also decreases disparity... yes, it's a never ending story.)

The major lesson we need to learn from all this is that diversity is an essential ingredient of resilience. Bring down diversity and you decrease the chance that the ecosystem will be able to withstand the Black Swan when it shows up. (And, given enough time, rest assured it will show up.) High diversity means that a large number of different systems will be put to the test with new conditions when the Big One strikes. Even if most of the system-environment fits decree extinction (or system failure), still a few systems, by chance so to say, will have the right elements to pass through the sieves of the Black Swan with limited damage. And it's those limited few that are going to inherit the Earth.

Creative Commons License
Lessons from the Past by Vincenzo De Florio is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.
Permissions beyond the scope of this license may be available at http://win.uantwerpen.be/~vincenz/.

1 comment:

  1. Re: the Millennium Bug, here's some text from my book "Application-layer Fault-tolerance Protocols":

    "Most of the software still in use today was developed using a standard where dates are coded in a 6-digit format. According to this standard, two digits were considered as enough to represent the year. Unfortunately this translates into the impossibility to distinguish, e.g., year 2000 from year 1900, which by the end of last century was recognized as the possible cause of an unpredictably large number of failures when calculating time elapsed between two calendar dates, as for instance year 1900 was not a leap year while year 2000 is. Choosing the above mentioned standard to represent dates resulted in a hidden, almost forgotten design fault, never considered nor tested by application programmers. As society got closer and closer to the year 2000, the possible presence of this design fault in our software became a nightmare that seemed to jeopardize all those crucial functions of our society today appointed to programs manipulating calendar dates, such as utilities, transportation, health care, communication, public administration, and so forth. Luckily the expected many and possibly crucial system failures due to this one application-level fault were not so many and not that crucial [..]"
