Edited by: Taishin Nomura, Osaka University, Japan

Reviewed by: Mitsuyuki Nakao, Tohoku University, Japan; Yuichi Togashi, Hiroshima University, Japan

This article was submitted to Systems Biology, a section of the journal Frontiers in Applied Mathematics and Statistics

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

System theory has its roots in mathematical formalisms developed by mathematicians and physicists such as Leibniz, Euler, and Newton, and applied by congenial chemists and biologists such as Lotka and Bertalanffy. In these approaches, the dynamical system (whether a single organism or populations of organisms in their ecosystems) is defined and formally translated into an interaction matrix and first-order ordinary differential equations (ODEs), which are then solved. This provides the background for the quantitative analysis of any system, linear or non-linear. In his inspiring article “Can a biologist fix a radio?,” Lazebnik made very clear the differences between the “guilt by association” hypothesis of a modern biologist and the Signal–Input–Output (SIO) model of an electrical engineer. The drawback of this “Gedankenexperiment” is that two rather different approaches are compared: a forward model predictive control approach in the case of the engineer's SIO model, and an inverse or reverse approach in the case of the biologist or ecologist. Biological and ecological systems are much too complex to estimate all the underlying ODEs, parameters, and input signals that generate a probability distribution. Thus, the combination of inverse data-driven modeling and stochastic simulation is a key process for understanding the control of a biological or ecological system. The challenge for the coming decades of systems biology is to link these approaches more systematically. Over recent years, we have developed a hybrid modeling approach based on the stochastic Lyapunov matrix equation for the analysis of genome-scale molecular data. This workflow connects forward and inverse strategies such as the genome-scale metabolic reconstruction of an organism and the calculation of dynamics around a quasi-steady state using statistical features of large-scale multiomics data. 
Ultimately, this workflow is linked to physiology and phenotype (the output) to unambiguously define the genotype–environment–phenotype relationship. This system-theoretical formalism establishes the generic analysis of the genotype–environment–phenotype relationship to finally result in predictability of organismal function in the environmental context. The approach is based on fundamental mathematical control theory for the analysis of dynamical systems using eigenvalues and matrix algebra, stochastic differential equations (SDEs), and Langevin- and Fokker–Planck-type equations eventually leading to the continuous stochastic Lyapunov matrix equation. The stochastic Lyapunov matrix equation is also a fundamental approach for the analysis and control of artificial intelligence systems in model predictive control and thus opens up completely new perspectives for the integration of systems engineering and systems biology. Furthermore, similar mathematical formalisms—using a community matrix instead of a stoichiometric matrix of a metabolic network—were also conceptually developed and applied by ecologists such as Levins and May in the analysis of stability and complexity of model ecosystems. Thus, the generalization of this hybrid forward–inverse approach spans from biology to ecology and promises to be a systematic iterative process that finally leads to functional units able to explain living systems up to their interaction in complex ecosystems.

Systems biology is a modern development in biology integrating genome-scale molecular analysis, e.g., metabolomics, proteomics, and transcriptomics, with computer-based mathematical and statistical modeling of metabolism and regulation. The aim is, on the one hand, to derive causal mechanisms from molecule to organism and, on the other, to establish quantitative models for the prediction of phenotypes from genotypes. Ultimately, systems biology aims for a universal genotype–environment–phenotype equation, especially in the era of genome sequencing. These ideas were already developed a long time ago in the work of Ludwig von Bertalanffy, e.g., in the book

Where do we stand now, almost 80 years later, when we talk about an organism, whether microorganism, plant, animal, or human? After the elucidation of the molecular principles of inheritance and information storage in the 1950s [

In the following, I will introduce techniques for genome-scale molecular analysis as well as mathematical and statistical hybrid modeling as fundamental requirements to fulfill this proposed cycle of understanding the dynamics of an organism. The comprehensive analysis of organisms (microbes, plants, fungi, animals, and humans) nowadays begins with genome sequencing using “next-generation sequencing” technologies [

A workflow which combines multiomics analysis [

Another very prominent example of functional annotation of genomes derives from genome-wide association studies (GWAS). Here, allelic polymorphisms in the genome of one species and its corresponding ecotypes or phenotypes are screened systematically to reveal their correlation with phenotypic and adaptive traits. In most cases, single-nucleotide polymorphisms (SNPs) are molecularly neutral mutations; however, phenomena such as linkage disequilibrium and the accumulation of SNPs in a specific genomic region point to potential adaptation processes in the genome [

The final and conclusive step to link genome-scale molecular analysis with genome sequence information is model building [

However, all these approaches rely on “speculated” forward conjectures about network structure and kinetic parameters. The problem is that all these data are not available, especially the detailed kinetics of protein interaction, enzymatic reactions, and biochemical regulation. We have recently introduced a novel idea by linking correlation networks of molecular components in an organism with the underlying biochemical regulation [

Derivation of metabolite correlation networks from large-scale metabolomic analysis. These correlation networks show differential connectivity depending on the analyzed genotype. From this fundamental multivariate structure, several research fields are envisioned: First, topology analysis of the correlation network resulted in a power-law-like pattern of the probability of metabolite connectivity [

The pioneering work of Bertalanffy and others still provides the basic principles necessary to analyze any complex biological and ecological system. These principles were developed in parallel across several disciplines, e.g., in the work of Ludwig v. Bertalanffy in biology in the 1930s, in the work of Alfred Lotka in ecology in 1934, and later in the work of Norbert Wiener in cybernetics and many others. Parts of stable population theory were already introduced by Leonhard Euler in 1760, as discussed in Lotka's seminal work [

From the mathematical point of view, the system under study is instantaneously defined by its state variables that can be measured [

A PCA plot showing the metabolomic trajectory of a diurnal cycle of _{1}-Cov_{6}). From all these different covariance matrices a corresponding Jacobian (J_{1}-J_{6}) can be calculated with the inverse stochastic Lyapunov matrix equation showing the biochemical perturbation and keypoints of control over a day-night cycle (for more details see text).

Irrespective of which system we are analyzing, the basic system equations are always derived in exactly the same way. Bertalanffy, Lotka, and all others used coupled first-order ordinary differential equations (ODEs) as the mathematical formalism. A very intuitive and elegant description is given by Uwe an der Heiden [

Let us start with the coupled mass–spring system. Later on, this simple system consisting of one mass, one spring, and a damping coefficient can be extended to a highly complex system: imagine a chain or network of thousands or millions of coupled mass–spring systems and various damping terms [

A damped mass–spring system is described by a second-order differential equation:

with

Correspondingly, higher-order systems result in

One could think about networks of coupled mass–spring systems. Their interaction matrix is characterized by the incidence matrix [

In matrix notation, this is

or

respectively [

These systems are solved by inserting

and deriving the characteristic equation for the calculation of the eigenvalues λ_{1, 2, …, n} and constants _{1, 2, …., n} using

The eigenvalues of the matrix A in (5) are the roots of the characteristic equation [

with the eigenvector matrix

The solution of Equation (5) is according to Equations (7, 8) in matrix notation:

and corresponds to the matlab command [T,D] = eig(A) [
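As a hedged illustration of this eigenvalue route, the following sketch rewrites a damped mass–spring system as a first-order linear system and solves it by eigendecomposition with NumPy, where `np.linalg.eig` plays the role of the MATLAB call. The parameter values and the helper names `first_order_matrix` and `solve_linear` are assumptions made for this example, and the approach presumes that the system matrix is diagonalizable.

```python
import numpy as np

def first_order_matrix(m, c, k):
    """System matrix A of m*x'' + c*x' + k*x = 0 written as z' = A z with z = (x, x')."""
    return np.array([[0.0, 1.0],
                     [-k / m, -c / m]])

def solve_linear(A, z0, t):
    """Evaluate z(t) = T exp(D t) T^-1 z0 using the eigendecomposition A T = T D."""
    eigenvalues, T = np.linalg.eig(A)                    # columns of T are eigenvectors
    constants = np.linalg.solve(T, z0.astype(complex))   # constants fixed by z(0) = z0
    return (T @ (constants * np.exp(eigenvalues * t))).real

A = first_order_matrix(m=1.0, c=0.5, k=2.0)   # illustrative (assumed) parameters
eigenvalues, T = np.linalg.eig(A)
z0 = np.array([1.0, 0.0])                     # displaced by 1, initially at rest
z_later = solve_linear(A, z0, t=3.0)

print("eigenvalues:", eigenvalues)
print("all real parts negative:", bool(np.all(eigenvalues.real < 0)))
print("state at t = 3:", z_later)
```

Negative real parts of all eigenvalues indicate that the oscillation decays toward the resting state; a non-zero imaginary part indicates that it does so while oscillating.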

Linear systems (5) can be solved with (8) and (9) by calculating the eigenvalues and eigenvectors. For non-linear system equations, which describe almost all molecular, organismal, and ecological networks in the real world, we have to introduce the Jacobian matrix. The Jacobian matrix is derived by linearizing a non-linear system at quasi steady states [

with the steady state

For _{0},

is small, a Taylor expansion leads to

Which is called the “Jacobian” [

For linear systems or close to the steady state, it holds

The solution to these systems is again given by the matrix eigenvalue/eigenvector Equation (10) [

Once we derive the Jacobian of a system of coupled first-order differential equations, we can calculate eigenvalues and eigenvectors, which describe stability and system properties close to a steady state or quasi equilibrium. If we are able to derive the Jacobian from the covariance matrix of the data (see below), we can calculate the eigenvalues from these data-derived Jacobians to investigate their properties. For this purpose, we have developed the concept of the inverse stochastic Lyapunov matrix equation, and we have recently tested this idea [
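The linearization step can be sketched numerically. The example below approximates the Jacobian of a small non-linear system at a steady state by central finite differences and inspects its eigenvalues. The predator–prey model with logistic prey growth, its parameter values, and the function names are assumptions chosen purely for illustration; they are not taken from the article.

```python
import numpy as np

def rhs(x, r=1.0, K=2.0, a=1.0, b=1.0, d=1.0):
    """Hypothetical non-linear system: logistic prey growth with predation."""
    prey, predator = x
    return np.array([r * prey * (1.0 - prey / K) - a * prey * predator,
                     b * prey * predator - d * predator])

def numerical_jacobian(f, x0, h=1e-6):
    """Central-difference approximation of the Jacobian df_i/dx_j at x0."""
    x0 = np.asarray(x0, dtype=float)
    n = x0.size
    J = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = h
        J[:, j] = (f(x0 + step) - f(x0 - step)) / (2.0 * h)
    return J

x_star = np.array([1.0, 0.5])          # steady state of rhs for the default parameters
J = numerical_jacobian(rhs, x_star)
eigenvalues = np.linalg.eigvals(J)
is_stable = bool(np.all(eigenvalues.real < 0))

print(np.round(J, 4))
print("eigenvalues:", eigenvalues, "stable:", is_stable)
```

For these assumed parameters the analytic Jacobian at the steady state is [[-0.5, -1], [0.5, 0]], so the numerical result can be checked by hand.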

The variance–covariance matrix of the data _{Dcent}:

If we statistically analyze several measurements or empirical observations of the state variables over time, e.g., in a principal components analysis (PCA), then we derive the trajectory of different snapshots of transitory steady states of the system defined by the covariance matrix of its state variables [
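A minimal sketch of how such snapshots could be summarized: each snapshot is a matrix of replicate measurements (rows) of the state variables (columns), and its covariance matrix is computed after centering. The synthetic "day" and "night" data and the helper name `covariance_matrix` are assumptions for illustration only.

```python
import numpy as np

def covariance_matrix(data):
    """Unbiased covariance of a (samples x variables) matrix after column centering."""
    centered = data - data.mean(axis=0)
    return centered.T @ centered / (data.shape[0] - 1)

rng = np.random.default_rng(0)

# Two hypothetical snapshots: 50 replicate measurements of 3 state variables each.
# The mixing matrices couple different pairs of variables in each snapshot.
mixing_day = np.array([[1.0, 0.8, 0.0],
                       [0.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0]])
mixing_night = np.array([[1.0, 0.0, 0.0],
                         [0.0, 1.0, -0.8],
                         [0.0, 0.0, 1.0]])
snapshots = {
    "day": rng.normal(size=(50, 3)) @ mixing_day,
    "night": rng.normal(size=(50, 3)) @ mixing_night,
}

covariances = {name: covariance_matrix(x) for name, x in snapshots.items()}
for name, cov in covariances.items():
    print(name)
    print(np.round(cov, 2))
```

Comparing the covariance matrices of successive snapshots shows which pairs of variables change their coupling along the trajectory.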

To learn about the system from these data-defined trajectories [

Derivation of the stochastic Lyapunov Matrix Equation from fluctuating metabolic correlation networks [

Again, we start with Equation (5)

The equilibrium of this system is stable in the sense of Lyapunov if there exists a continuously differentiable scalar function

and

If condition (18) is a strict inequality, then the equilibrium point is asymptotically stable.
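For reference, the two conditions can be written in a standard textbook form. The symbols $V$ for the scalar function and $\mathbf{x}$ for the state vector are assumed here for illustration, and the equilibrium is placed at the origin:

```latex
\begin{aligned}
V(\mathbf{0}) &= 0, \qquad V(\mathbf{x}) > 0 \quad \text{for } \mathbf{x} \neq \mathbf{0},\\
\frac{dV}{dt} &= \nabla V(\mathbf{x})^{\mathsf{T}}\,\dot{\mathbf{x}} \;\le\; 0 .
\end{aligned}
```

Intuitively, $V$ acts like an energy that can never increase along a trajectory, so a state that starts close to the equilibrium cannot escape from it.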

For the linear system (5), a quadratic Lyapunov function can be chosen

Inserting Equation (5) leads to

This system is asymptotically stable if

or

where

Typically, this equation is used in the forward approach for deriving an unknown covariance matrix

As discussed above, Equation (23) is typically used in the forward approach for deriving an unknown covariance matrix
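Both directions can be sketched numerically. The example assumes the form J C + C Jᵀ = −2D that is commonly used for the stochastic Lyapunov matrix equation in the literature on metabolic correlation networks, with J the Jacobian, C the covariance matrix, and D a fluctuation matrix. The matrices, the sparsity mask, and the function names are hypothetical. Because the equation is symmetric, the inverse direction is only determined when enough entries of J are known to be zero, which is why a structural mask is supplied.

```python
import numpy as np

def solve_covariance(J, D):
    """Forward use: solve J C + C J^T = -2 D for C (row-major vectorization)."""
    n = J.shape[0]
    I = np.eye(n)
    M = np.kron(J, I) + np.kron(I, J)
    return np.linalg.solve(M, (-2.0 * D).ravel()).reshape(n, n)

def solve_jacobian(C, D, mask):
    """Inverse use: least-squares J with J C + C J^T = -2 D, non-zero only where mask is True."""
    n = C.shape[0]
    free = np.argwhere(mask)
    columns = []
    for i, j in free:
        E = np.zeros((n, n))
        E[i, j] = 1.0                          # unit entry at position (i, j) of J
        columns.append((E @ C + C @ E.T).ravel())
    M = np.column_stack(columns)
    values, *_ = np.linalg.lstsq(M, (-2.0 * D).ravel(), rcond=None)
    J = np.zeros((n, n))
    J[free[:, 0], free[:, 1]] = values
    return J

# Hypothetical two-variable system; the zero in J_true is treated as known structure.
J_true = np.array([[-1.0, 0.5],
                   [0.0, -2.0]])
D = np.eye(2)
mask = J_true != 0

C = solve_covariance(J_true, D)        # forward: Jacobian -> covariance
J_est = solve_jacobian(C, D, mask)     # inverse: covariance -> Jacobian

print(np.round(C, 4))
print(np.round(J_est, 4))
```

In this noise-free toy case the inverse step recovers the Jacobian exactly. With measured covariances, the least-squares solution is only an estimate.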

The stochastic Lyapunov matrix equation presents a genotype-phenotype equation. Here N is the metabolic interaction matrix or stoichiometric matrix which is derived from a whole-genome metabolic reconstruction (see also

Sample classification and rapid diagnostics of newborns based on metabolic profiling using gas chromatography coupled to mass spectrometry (GC–MS) is a very old technology dating back to the 1970s [

The data matrix and the stochastic Lyapunov matrix equation are directly associated by multivariate properties of the system. This applies to any complex network or system and even to model predictive control of artificial intelligence systems. MIMO, multiple input multiple output.

Importantly, the approach is especially suited for non-linear systems, multiple stability, and limit cycle analysis because it uses the data covariance matrix—thus the real data of a biological system where we assume it is in quasi steady state—to predict the system matrix. Thus, any calculated quasi steady state is a realistic solution of the system because it relies on the real data at least in biological systems. This is very much related to model predictive control of non-linear and closed loop artificial systems using the stochastic Lyapunov matrix equation [

The stochastic Lyapunov matrix equation also provides the basis for the functional interpretation of PCA of molecular data. As the covariance matrix _{D} by Equation (16).

Here, _{D}—corresponds to the “scores” [

Now, we can insert Equation (24) into Equation (23) and obtain a direct relation of PCA and the Jacobian of the data

This equation establishes the direct linkage of PCA and the biochemical Jacobian (

SVD, PCA, and many related tools are basic algorithms for the analysis of large-scale data with millions of data points and/or millions of variables. Therefore, these conceptual equations allow for the data-driven calculation of solutions of the underlying general system of first-order differential equations once we have assumptions about the underlying interaction networks. Consequently, we can test any complex system for data-driven generic principles.
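The link between SVD, PCA, and the covariance matrix can be checked numerically. The sketch below uses the standard identity that the right singular vectors of the centered data matrix are the eigenvectors of its covariance matrix. It illustrates that general relationship under assumed synthetic data and does not reproduce the article's own equations; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=(40, 5)) @ rng.normal(size=(5, 5))   # synthetic, correlated variables
centered = data - data.mean(axis=0)
n_samples = data.shape[0]

# Thin SVD of the centered data matrix.
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
loadings = Vt.T                                  # principal axes (one per column)
scores = U * s                                   # sample coordinates on those axes
explained_variance = s**2 / (n_samples - 1)      # eigenvalues of the covariance matrix

# The covariance matrix rebuilt from the PCA components.
cov_from_svd = loadings @ np.diag(explained_variance) @ loadings.T
cov_direct = np.cov(data, rowvar=False)

print("max abs difference:", np.abs(cov_from_svd - cov_direct).max())
```

Since the covariance matrix is fully recoverable from loadings and explained variances, any quantity computed from the covariance matrix can equally be computed from a PCA of the same data.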

We have implemented this approach and demonstrated the feasibility to identify causal relationships in highly complex molecular association networks [

Furthermore, the calculation of the Jacobian from the data matrix and the covariance matrix, respectively, allows for the data-driven analysis of stability and control of trajectories of these systems [

Many obstacles remain. The calculation of the Jacobian is highly dependent on the accuracy of the covariance matrix. A perfect covariance matrix will reveal the perfect solution of the Jacobian; however, the data are in most instances too noisy to guarantee the best solution. Therefore, many data points are necessary to establish a reasonable covariance matrix. Furthermore, the calculation is highly dependent on the condition number of the system. Ill-conditioned systems are problematic [
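The dependence on sample size and conditioning can be made concrete with a small experiment. The sketch compares the rank and condition number of a sample covariance matrix estimated from few versus many observations. The sample sizes, the use of independent standard-normal variables, and the function name are assumptions for illustration.

```python
import numpy as np

def covariance_condition(n_samples, n_variables, seed=0):
    """Rank and 2-norm condition number of a sample covariance matrix."""
    rng = np.random.default_rng(seed)
    data = rng.normal(size=(n_samples, n_variables))
    cov = np.cov(data, rowvar=False)
    return int(np.linalg.matrix_rank(cov)), float(np.linalg.cond(cov))

# Fewer samples than variables: the covariance matrix is rank deficient.
rank_few, cond_few = covariance_condition(n_samples=5, n_variables=10)
# Many samples: the estimate approaches the well-conditioned true covariance.
rank_many, cond_many = covariance_condition(n_samples=5000, n_variables=10)

print("few samples : rank", rank_few, "condition number", cond_few)
print("many samples: rank", rank_many, "condition number", cond_many)
```

A rank-deficient or near-singular covariance matrix means that small measurement errors are strongly amplified when a linear system involving that matrix is solved.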

The proposed concept of systematic inverse calculation of the Jacobian from data covariance has not been applied in ecological or population studies to the best of my knowledge. Thus, one further hypothesis is that this concept could be exploited as a general framework from molecular biology to ecological systems analysis. This idea is especially supported by the following studies. In 1970, Gardner and Ashby presented an intriguing analysis of the connectance—nowadays called connectivity (see

with _{j} and the _{jk} describing the effect of species _{jk} = 0) and the type of interaction defines the sign and magnitude of _{jk}. The population average interaction strength mean square value α and the connectance (connectivity)

The interpretation of May is famous:

“Applied in an ecological context, this ensemble of very general mathematical models of multi-species communities, in which the population of each species would by itself be stable, displays the property that too rich a web connectance (too large a C) or too large an average interaction strength (too large an α) leads to instability. The larger the number of species, the more pronounced the effect” [
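The quoted property can be explored with a small simulation. The sketch draws random community matrices in which every species is self-regulated (diagonal entries of −1) and off-diagonal interactions are present with probability C and have standard deviation α. It then compares the largest real part of the eigenvalues with May's well-known criterion α√(nC) < 1. The function names, the seed, and the parameter values are assumptions for illustration.

```python
import numpy as np

def random_community_matrix(n, connectance, alpha, rng):
    """Self-regulated species (diagonal -1) with random off-diagonal interactions."""
    strengths = rng.normal(0.0, alpha, size=(n, n))
    present = rng.random((n, n)) < connectance
    A = np.where(present, strengths, 0.0)
    np.fill_diagonal(A, -1.0)
    return A

def max_real_eigenvalue(A):
    """Largest real part among the eigenvalues; negative means locally stable."""
    return float(np.linalg.eigvals(A).real.max())

rng = np.random.default_rng(42)
n, connectance = 200, 0.2

for alpha in (0.05, 0.30):
    A = random_community_matrix(n, connectance, alpha, rng)
    complexity = alpha * np.sqrt(n * connectance)
    print(f"alpha={alpha:.2f}  alpha*sqrt(nC)={complexity:.2f}  "
          f"max Re(lambda)={max_real_eigenvalue(A):.2f}")
```

With weak interactions the product stays below 1 and all eigenvalues have negative real parts; with strong interactions it exceeds 1 and at least one eigenvalue crosses into the unstable half-plane.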

It is intriguing that this system definition and all subsequent derivations are exactly derived from the same system-theoretical principles described above. In later discussions, Robert May implemented the stochastic Lyapunov matrix equation for the analysis of multispecies models in stochastic environments [

These general principles in studies of ecological systems and population dynamics have since been extended through intensive investigation of the Jacobian of these systems as a central quantity [

The main aim of the presented ideas is to demonstrate the common system-theoretical principles of biological and ecological systems. It is intriguing how principles for the analysis of dynamical systems originate in system theory and, later, cybernetics and can be applied to the analysis of biological and ecological systems. A main framework is defined by stability analysis and the application of the stochastic Lyapunov matrix equation. The inverse application of the stochastic Lyapunov matrix equation allows for the data-driven inverse modeling of complex systems from molecular to population networks and will be further explored for the prediction of regulatory structures in these highly complex and intuitively inaccessible systems.

The author confirms being the sole contributor of this work and has approved it for publication.

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

I would like to thank Xiaoliang Sun and Thomas Nägele for the inspiring discussions we have shared over recent years and Gerhard Sorger for helpful comments on the manuscript.