This article was submitted to Geohazards and Georisks, a section of the journal Frontiers in Earth Science

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

This paper presents a generalization of the bias-variance tradeoff applied to the recent trend toward natural multi-hazard risk assessment. The bias-variance dilemma, a well-known machine learning concept, is presented in the context of natural hazard modeling. It is then argued that this statistical concept can provide an analytical framework that supports directing efforts toward systemic risk assessment using multi-hazard catastrophe modeling, and that it can inform future mitigation practices.

Several cases of severe compounded impact arising from the interplay of natural hazards have occurred over the last decades. Examples from this limited timeline include the 2011 Tohoku earthquake and tsunami in Japan (

Many current risk studies still consider hazards in isolation: the risk to exposed elements is attributed to a single hazard, an approach also referred to as “single hazard” modeling (

The concept originated from

Representation of the bias-variance tradeoff from a risk perspective.

In

formally expressed as:
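For reference, a sketch of the standard decomposition of expected squared error from the machine learning literature, assumed here to be the form the text refers to. The symbols are assumptions, not taken from the original: $y = f(x) + \varepsilon$ is an observation, $\hat{f}(x)$ is the model prediction, and $\sigma^2$ is the variance of the irreducible noise $\varepsilon$.

```latex
\mathbb{E}\left[\left(y - \hat{f}(x)\right)^2\right]
  = \underbrace{\left(\mathbb{E}[\hat{f}(x)] - f(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\left[\left(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\right)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```

The expectations are taken over repeated draws of the training data, which is why variance measures how much the prediction changes from one dataset to another.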

Varying the complexity of a stochastic model and the amount of available data translates into a tradeoff between minimizing bias and minimizing variance. In the context of artificial intelligence and machine learning, this effect is referred to as underfitting when bias is too high and overfitting when variance is too high (
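The effect of model complexity can be illustrated with a small Monte Carlo experiment. The sketch below is not taken from the paper: it assumes a hypothetical sinusoidal "true" signal, Gaussian noise, and a piecewise-constant model whose number of bins stands in for complexity. All names (`true_signal`, `fit_bin_means`, `bias_variance`) and parameter values are illustrative.

```python
import math
import random


def true_signal(x):
    """Hypothetical 'true' response, used only for this illustration."""
    return math.sin(2.0 * math.pi * x)


def fit_bin_means(xs, ys, n_bins):
    """Fit a piecewise-constant model: one mean per equal-width bin on [0, 1)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    overall = sum(ys) / len(ys)
    # Empty bins fall back to the overall mean of the sample.
    return [sums[b] / counts[b] if counts[b] else overall for b in range(n_bins)]


def predict(model, x):
    """Return the fitted mean of the bin that contains x."""
    n_bins = len(model)
    return model[min(int(x * n_bins), n_bins - 1)]


def bias_variance(n_bins, n_train=30, noise_sd=0.3, n_repeats=300, seed=0):
    """Monte Carlo estimate of squared bias and variance, averaged over test points."""
    rng = random.Random(seed)
    test_xs = [(i + 0.5) / 50 for i in range(50)]
    preds = [[] for _ in test_xs]
    for _ in range(n_repeats):
        # Each repeat draws a fresh training set and refits the model.
        xs = [rng.random() for _ in range(n_train)]
        ys = [true_signal(x) + rng.gauss(0.0, noise_sd) for x in xs]
        model = fit_bin_means(xs, ys, n_bins)
        for i, x in enumerate(test_xs):
            preds[i].append(predict(model, x))
    bias_sq = 0.0
    variance = 0.0
    for x, p in zip(test_xs, preds):
        mean_p = sum(p) / len(p)
        bias_sq += (mean_p - true_signal(x)) ** 2
        variance += sum((v - mean_p) ** 2 for v in p) / len(p)
    return bias_sq / len(test_xs), variance / len(test_xs)


if __name__ == "__main__":
    for n_bins in (1, 4, 20):
        b2, var = bias_variance(n_bins)
        print(f"bins={n_bins:2d}  bias^2={b2:.3f}  variance={var:.3f}")
```

With one bin the model underfits (high bias, low variance); with twenty bins and only thirty training points it overfits (low bias, high variance). Raising `n_train` reduces the variance of the complex model, which is the data-availability side of the tradeoff.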

The recent push to develop a more holistic (multi-hazard) approach to natural hazard assessment can be informed and, as the following paragraphs will try to demonstrate, supported by a generalization of the bias-variance concept. Although the concept is primarily concerned with machine learning, the bias-variance tradeoff has been generalized beyond its core domain before, notably in scientific publications, to inform debates on topics such as education (

Several solutions have been proposed to decrease either bias or variance in order to improve model predictability. A biased model leans on particular features to predict the result (e.g., single hazard models in a multi-hazard environment are highly biased, with high or low variance depending on data availability), so the remedy is to broaden the scope and increase the complexity of the model (e.g., by considering multiple hazards). High variance models, on the other hand, are too broad in scope and need to be constrained, generally by adding more data (
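The two remedies can be contrasted in a toy setting. The sketch below is an assumption-laden illustration, not the paper's method: each event is assumed to cause a primary loss (mean 1.0) and an independent secondary loss (mean 0.5), both exponentially distributed. A "single hazard" estimator ignores the secondary loss, while a "multi-hazard" estimator includes it. The function name and all numerical values are hypothetical.

```python
import random
import statistics


def estimator_bias_variance(n_obs, n_repeats=2000, seed=1):
    """Compare a single-hazard and a multi-hazard estimator of mean total loss.

    Returns a dict mapping estimator name to (bias, variance), both estimated
    by repeating the experiment n_repeats times with n_obs events each.
    """
    rng = random.Random(seed)
    true_mean_total = 1.0 + 0.5  # assumed mean primary loss + mean secondary loss
    single_estimates, multi_estimates = [], []
    for _ in range(n_repeats):
        primary = [rng.expovariate(1.0) for _ in range(n_obs)]    # mean 1.0
        secondary = [rng.expovariate(2.0) for _ in range(n_obs)]  # mean 0.5
        # Single-hazard model: ignores the secondary hazard altogether.
        single_estimates.append(statistics.fmean(primary))
        # Multi-hazard model: accounts for both loss components.
        total = [p + s for p, s in zip(primary, secondary)]
        multi_estimates.append(statistics.fmean(total))
    result = {}
    for name, estimates in (("single", single_estimates), ("multi", multi_estimates)):
        bias = statistics.fmean(estimates) - true_mean_total
        variance = statistics.pvariance(estimates)
        result[name] = (bias, variance)
    return result


if __name__ == "__main__":
    for n_obs in (10, 100):
        for name, (bias, var) in estimator_bias_variance(n_obs).items():
            print(f"n={n_obs:3d}  {name:6s}  bias={bias:+.3f}  variance={var:.4f}")
```

In this setting, adding data shrinks the variance of both estimators but leaves the bias of the single-hazard estimator untouched; only widening the scope of the model removes it, at the cost of a somewhat higher variance for a given sample size.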

Representation of the bias-variance tradeoff in a multi-hazard context.

In preparing for natural disasters, scientists attempt to simplify the problem at hand by explaining it in terms of ever smaller entities. This process is known as reductionism (

Low variance models, although a solid option in a data-rich environment, might prove difficult to achieve in the context of natural hazards. Natural hazard occurrences universally follow a power law linking probability of exceedance to the intensity of the phenomenon (
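The data-scarcity argument can be made concrete with a short calculation. The sketch below assumes a Pareto-type tail, P(X > x) = (x / x_min)^(-alpha); the exponent `alpha = 1.5`, the threshold `x_min = 1.0`, and the catalogue size are illustrative assumptions, not values from the paper.

```python
import math


def exceedance_probability(intensity, x_min=1.0, alpha=1.5):
    """P(X > intensity) for a power-law tail; x_min and alpha are assumed values."""
    if intensity <= x_min:
        return 1.0
    return (intensity / x_min) ** (-alpha)


def expected_exceedances(n_events, intensity, x_min=1.0, alpha=1.5):
    """Expected number of catalogue events exceeding a given intensity."""
    return n_events * exceedance_probability(intensity, x_min, alpha)


def relative_error_of_rate(n_events, intensity, x_min=1.0, alpha=1.5):
    """Approximate relative standard error of the empirical exceedance rate.

    Exceedance counts are treated as binomial, so the relative error
    is sqrt((1 - p) / (n * p)).
    """
    p = exceedance_probability(intensity, x_min, alpha)
    return math.sqrt((1.0 - p) / (n_events * p))


if __name__ == "__main__":
    n_events = 1000  # hypothetical catalogue size
    for intensity in (2.0, 10.0, 100.0):
        count = expected_exceedances(n_events, intensity)
        err = relative_error_of_rate(n_events, intensity)
        print(f"intensity>{intensity:6.1f}  expected events={count:8.2f}  rel. error={err:.2f}")
```

Under these assumptions a catalogue of a thousand events is expected to contain only about one event a hundred times larger than the threshold, so the empirical rate of the most damaging events carries a relative error of order one. This is the sense in which low variance is hard to reach through data alone.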

The previous paragraphs suggest that, in the near term, it would be easier to make progress through more accurate but less consistent (low bias, high variance) models driven by complexity and systems thinking than through consistent but less accurate (high bias, low variance) models, where progress is data-driven.

Lower bias methods can bring valuable progress to future risk studies, but they also have limitations. As the variance and complexity of a model increase, the risk of overfitting the data grows and the predictions of the model become less effective (

Not all landscapes and environments are sensitive to multi-hazard interactions in the same way. A bias-variance perspective can help to identify multi-hazard prone environments: if the lower bias risk model gives the same results as the higher bias risk model, then we are probably not facing a multi-hazard problem (or the parameters influencing the risk have not been properly mapped out).
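This diagnostic can be sketched as a simple comparison of two sets of risk estimates. The function name, the site names, the loss values, and the 10% tolerance below are all hypothetical; a real application would also need to check that the difference exceeds the variance of the models themselves.

```python
def flag_multi_hazard_sites(single_hazard_loss, multi_hazard_loss, tolerance=0.10):
    """Return sites where the two risk estimates differ by more than `tolerance`.

    Both arguments map a site name to an expected loss. The relative difference
    is measured against the larger of the two estimates.
    """
    flagged = []
    for site in sorted(single_hazard_loss):
        if site not in multi_hazard_loss:
            continue  # no lower-bias estimate available for this site
        single = single_hazard_loss[site]
        multi = multi_hazard_loss[site]
        scale = max(abs(single), abs(multi))
        if scale == 0.0:
            continue  # both models predict no loss
        if abs(multi - single) / scale > tolerance:
            flagged.append(site)
    return flagged


# Hypothetical expected annual losses (arbitrary units) for three sites.
single = {"coastal_town": 10.0, "inland_plain": 4.0, "mountain_valley": 6.0}
multi = {"coastal_town": 18.0, "inland_plain": 4.1, "mountain_valley": 9.5}
print(flag_multi_hazard_sites(single, multi))
```

Sites where the two models agree, such as `inland_plain` in this made-up example, are the ones where a single hazard model is probably sufficient, subject to the caveat that the relevant parameters have been properly mapped out.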

The bias-variance dilemma can also inform disaster action and preparedness. In a multi-hazard environment, a single hazard model would not only strongly bias the risk assessment but, more dramatically, bias the actions and responses based on that model. As risk assessment evolves toward more complex approaches (lower bias hazard models), the solution space also widens because more scenarios need to be accounted for (lower bias “solution model”). Indeed, multi-hazard models can help generate “knowledge ensembles” (

It is also plausible that under a richer risk modeling framework, one using a wider hazard spectrum, mitigation of unplanned disasters happens as a serendipitous consequence. To illustrate this point, consider an example based on the 2008 Wenchuan earthquake in China, which caused cascading effects, one of the most significant being the “Tangjiashan landslide dam, which was triggered by the Ms = 8.0 Wenchuan earthquake in 2008 in China (which) threatened 1.2 million people downstream of the dam. All people in Beichuan Town 3.5 km downstream of the dam and 197,000 people in Mianyang City 85 km downstream of the dam were evacuated 10 days before the breaching of the dam.” (

This paper presents an analytical argument about the state and future of natural hazard modeling, based on a generalization of the bias-variance tradeoff concept. A look back at the work done in machine learning and at previous papers generalizing the concept provides an additional case, alongside existing positions, for multi-hazard modeling as a standard in natural hazard risk assessment. The paper points out that achieving lower variance through data-driven improvement might prove difficult because of the power law distribution of natural hazards. On the other hand, the implementation of lower bias multi-hazard models is relatively new to risk modeling and could still be a low-hanging fruit for rapid and significant improvement. An added, and concealed, advantage of systemic, multi-hazard (low bias) models for risk assessment is the possible emergence of new (synergistic) resilient solutions outside existing frames of reference (high bias), as well as positive serendipitous mitigation. A caveat remains: excessively complex modeling will impair the predictability of multi-hazard risk models, hence the complexity of risk models should be studied further.

The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.

The author confirms being the sole contributor of this work and has approved it for publication.

GNS Science SSIF funding.

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.