Edited by: Xiaogang Wu, Institute for Systems Biology, USA

Reviewed by: Tianshou Zhou, Sun Yat-Sen University, China; Zuxi Wang, Huazhong University of Science and Technology, China; Taichi Haruna, Kobe University, Japan

*Correspondence: Masaki Nakagawa; Yuichi Togashi

This article was submitted to Systems Biology, a section of the journal Frontiers in Physiology

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Cell activities primarily depend on chemical reactions, especially those mediated by enzymes, and this has led to these activities being modeled as catalytic reaction networks. Although deterministic ordinary differential equations of concentrations (rate equations) have been widely used for modeling purposes in the field of systems biology, it has been pointed out that these catalytic reaction networks may behave in a way that is qualitatively different from such deterministic representation when the number of molecules for certain chemical species in the system is small. Apart from this, representing these phenomena by simple binary (on/off) systems that omit the quantities would also not be feasible. As recent experiments have revealed the existence of rare chemical species in cells, the importance of being able to model potential small-number phenomena is being recognized. However, most preceding studies were based on numerical simulations, and theoretical frameworks to analyze these phenomena have not been sufficiently developed. Motivated by the small-number issue, this work aimed to develop an analytical framework for the chemical master equation describing the distributional behavior of catalytic reaction networks. For simplicity, we considered networks consisting of two-body catalytic reactions. We used the probability generating function method to obtain the steady-state solutions of the chemical master equation without specifying the parameters. We obtained the time evolution equations of the first- and second-order moments of concentrations, and the steady-state analytical solution of the chemical master equation under certain conditions. These results led to the rank conservation law, the connecting state to the winner-takes-all state, and analysis of 2-molecules

Biochemical systems consist of a variety of chemicals, including proteins, nucleic acids, and also small metabolites. Enzymatic reactions, which play an important role in catalyzing many biological reactions, are particularly important to maintain the structure and activity of these systems. Hence, biochemical systems are often modeled as catalytic reaction networks.

These networks are typically analyzed by using deterministic ordinary differential equations with respect to concentrations of chemical species, so-called reaction rate equations [or partial differential equations (reaction-diffusion equations) for spatially distributed, non-uniform cases]; i.e., the concentrations are represented by continuous variables. However, because each chemical actually consists of molecules, the concentration of each species should be a discrete variable. The effects of such discreteness as well as finite-size fluctuations in stochastic reactions become non-negligible if the number of molecules in the system is small. In theory, situations such as these can result in phenomena that cannot be described by rate equations (as well as those equations with additional noise; Togashi and Kaneko,

In contrast, gene regulations are often modeled as a combination of binary (on/off) switches, typically represented by Boolean network models (Kauffman,

Recent experiments have shown the existence in the cell of proteins that only consist of a few molecules each (Taniguchi et al.,

The effects of small numbers in particular systems, e.g., small autocatalytic systems have been mathematically analyzed (Ohkubo et al.,

Our framework provides good operability because our formulas have a specific and simple form, and it yields the steady state for a wide class of catalytic reaction networks because it never fixes specific parameter values for the networks. We use a probability generating function approach. This approach to stochastic chemical kinetics was itself proposed long ago (e.g., Krieger and Gans,

The present paper is organized as follows. In Section 2.1, we define the catalytic reaction network considered in this paper. The chemical master equation (CME) is provided in Section 2.2. We introduce the probability generating function (PGF) and derive the generating function equation (GFE) in Section 2.3. In subsequent Sections (2.3.1–2.3.3), we show that the GFE introduces the time evolution equation of the first-order and second-order moments of concentrations (we refer to the first-order moment time evolution equation as the pre-rate equation, PRE), and the second-order moment expression of time-averaged concentrations (SME). Section 2.4 is devoted to obtaining the steady-state solutions of the GFE. To simplify the GFE, we neglect the non-catalytic reactions considered as perturbation for the catalytic reaction network if the system is “entirely ergodic.” In Section 2.4.2.3 as the main result, we obtain the probability generating function without winner-takes-all states (PGFwoWTAS) including the solutions of the corresponding rate equation. In Section 3, we describe applications of these results: the rank conservation law, the connecting state to the winner-takes-all state, analysis of 2-molecules

Consider an abstract catalytic reaction network consisting of

Catalytic reactions (two-body catalysis):

where the indices i, j, and k of the rate constant k_{ijk} > 0 label the substrate, catalyst, and product species, respectively. If this catalytic reaction does not exist, we specify k_{ijk} = 0. Therefore, the catalytic reaction networks are determined by k_{ijk}. In this paper, we impose the following conditions on the catalytic reaction networks:

k_{iik} = 0; Substrate ≠ Catalyst.

k_{iji} = 0; Substrate ≠ Product.

k_{ikk} = 0; Autocatalytic reactions are not included.

#{k_{ijk} > 0 (∀k)} ≤ 1; One product against a substrate and a catalyst.

Non-catalytic reactions (one-body reactions):

This reaction exists for all combinations between each species, but a product _{ijk} > 0}.

The state of this catalytic reaction network is specified by the combination of _{1}, _{2}, ⋯ , _{M}), where _{i} ∈ [0, _{M, N} (abbr. _{M, N} (abbr.

In the present paper, we are interested in the _{i} = _{i}/

The rate constant _{ijk} in the catalytic reaction is defined as the number of reactions per unit volume, unit concentration, and unit time. Therefore, the number of reactions per unit time in the catalytic reaction _{i} and _{j}, is
_{M, N}.

The probability generating function (PGF) is useful to analyze the CME:

Note that the following expressions are translated to differential forms of the PGF:

Once we obtain the PGF ϕ as a solution to the GFE, we can derive all statistics of the catalytic reaction network; for example, the ensemble averages (first-order moments) and second-order moments become
_{i}(

Rate equations are differential equations for the concentrations _{i} of chemical species

Differentiating both sides of Equation (11) by _{i}, substituting _{i} = _{i}/

If the independence 〈_{i}_{j}〉 = 〈_{i}〉〈_{j}〉 (

We suppose the following ergodicity to replace ensemble-averages with time-averages; _{*}(

If the independence

Determining the concentrations

Differentiating both sides of the GFE (Equation 11) by _{l} and _{m} (_{i} = _{i}/_{l}, substituting _{i} = _{i}/_{i}_{j}_{k}〉(

If the GFE (Equation 11) can be solved, this would enable us to obtain all the statistics of the catalytic reaction networks. However, it is generally difficult to solve. Here, we focus on the steady-state solutions of the GFE and consider the case that the ε-term in the GFE can be ignored. Through the following discussion, we see that the approximation is effective only if the system is ergodic.

First, we consider the steady-state solutions of non-catalytic reactions only as an introduction. The PGF _{i} must be zero,

Considering that the PGF _{i} and must satisfy the condition

If we suppose the ergodicity _{i} = _{i}/

Next, we consider the steady-state solutions of catalytic reactions only, assuming that the ε-term in the GFE (Equation 11) can be ignored. The steady-state solutions are assumed to have a form similar to Equation (26), including undetermined coefficients (λ_{i}) deriving from the network structure (_{ijk}).
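As a sketch of how such a condition on the coefficients can arise, assume that the rate constants are written k_{ijk} (indices: substrate, catalyst, product, as inferred from the conditions imposed on the network), that a catalytic reaction fires with propensity k_{ijk} n_i n_j / V, and that the trial solution is the multinomial-type PGF shown below. The symbols n_i, z_i, V, and the exact form are assumptions made for illustration; they need not coincide term by term with the numbered equations of the paper.

```latex
\begin{align}
0 &= \frac{1}{V}\sum_{i,j,k} k_{ijk}\,(z_k - z_i)\,z_j\,
     \frac{\partial^2 \phi_*}{\partial z_i\,\partial z_j}
  && \text{(steady state, catalytic reactions only)} \\
\phi_*(z) &= \Bigl(\sum_{i=1}^{M} \lambda_i z_i\Bigr)^{N},
  \qquad \sum_{i=1}^{M} \lambda_i = 1
  && \text{(trial solution, } \phi_*(1) = 1\text{)} \\
0 &= \sum_{i,j,k} k_{ijk}\,(z_k - z_i)\,z_j\,\lambda_i \lambda_j
  && \text{(after dividing by } N(N-1)(\textstyle\sum_i \lambda_i z_i)^{N-2}/V\text{)} \\
0 &= \sum_{i}\bigl(k_{iba}\,\lambda_i\lambda_b + k_{iab}\,\lambda_i\lambda_a\bigr)
     - \sum_{k}\bigl(k_{abk} + k_{bak}\bigr)\lambda_a\lambda_b
  && \text{(coefficient of } z_a z_b,\ a < b\text{)}
\end{align}
```

Under these assumptions the coefficients of z_a^2 vanish automatically because k_{iik} = 0 and k_{ikk} = 0, so only the mixed terms z_a z_b constrain the products λ_aλ_b.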

The PGF _{*}(1) = 1. Substituting Equation (31) into Equation (30) and setting the coefficients of variables _{i}_{j} as zero, gives the following condition for {λ_{i}λ_{j}}:
_{i}λ_{j}; therefore, λ_{i} can be calculated by combining the λ-condition (Equation 33) with Equation (32). The λ-condition has trivial solutions:

The λ-condition (Equation 33) can be rewritten in matrix form; i.e., in the case of

The 3 × 3 matrix (

The proportionality constant (> 0) can be determined by the condition (Equation 32), and thus the desired non-trivial solution of the λ-condition (Equation 33) for _{1} = 0 (where Λ_{2} and Λ_{3} are not zero) implies the existence of a trivial solution (λ_{1}, λ_{2}, λ_{3}) = (1, 0, 0). In another example, the case of Λ_{1} = Λ_{2} = 0 implies that the denominator becomes zero; thus, the expression becomes indefinite.
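The three-species calculation described above can be sketched in code. The block below is a minimal illustration, not the authors' implementation: it assumes rate constants k_{ijk} indexed as (substrate, catalyst, product), writes the condition as a 3 × 3 linear system in the products (λ_2λ_3, λ_3λ_1, λ_1λ_2), takes a null vector of that matrix via a cross product of two rows, and normalises so that the λ_i sum to one. The function names `cross` and `lambda_three_species` and the dictionaries `net_a` and `net_b` are hypothetical; the null vector returned as `Lam` is only proportional to the Λ_i of the text.

```python
from fractions import Fraction


def cross(u, v):
    """Cross product of two 3-vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def lambda_three_species(k):
    """Solve the lambda-condition for a three-species network (sketch).

    k maps 1-based (substrate, catalyst, product) triples to rate constants;
    missing triples count as zero. Returns (Lam, lam); lam is None when the
    normalising denominator vanishes (the indefinite case).
    """
    def rate(i, j, m):
        return Fraction(k.get((i, j, m), 0))

    # Unknowns mu = (l2*l3, l3*l1, l1*l2); one row per pair (a, b), c is the third species.
    rows = []
    for a, b, c in ((2, 3, 1), (1, 3, 2), (1, 2, 3)):
        row = [Fraction(0)] * 3
        row[a - 1] += rate(c, b, a)                   # c -> a catalysed by b: l_c*l_b = mu_a
        row[b - 1] += rate(c, a, b)                   # c -> b catalysed by a: l_c*l_a = mu_b
        row[c - 1] -= rate(a, b, c) + rate(b, a, c)   # a <-> b consumed:     l_a*l_b = mu_c
        rows.append(row)

    # A null vector of a rank-2 3x3 matrix is the cross product of two independent rows.
    Lam = None
    for p, q in ((0, 1), (0, 2), (1, 2)):
        cand = cross(rows[p], rows[q])
        if any(cand):
            Lam = cand
            break
    if Lam is None:
        return None, None
    if sum(Lam) < 0:
        Lam = [-x for x in Lam]

    denom = Lam[1] * Lam[2] + Lam[2] * Lam[0] + Lam[0] * Lam[1]
    if denom == 0:
        return tuple(Lam), None
    lam = (Lam[1] * Lam[2] / denom, Lam[2] * Lam[0] / denom, Lam[0] * Lam[1] / denom)
    return tuple(Lam), lam


# Two of the three-species parameter sets quoted in the text.
net_a = {(1, 2, 3): 1, (1, 3, 2): 1, (2, 1, 3): 1, (2, 3, 1): 1, (3, 1, 2): 1}
net_b = {(2, 1, 3): 1, (2, 3, 1): 1, (3, 1, 2): 2, (3, 2, 1): 1}

for name, net in (("net_a", net_a), ("net_b", net_b)):
    Lam, lam = lambda_three_species(net)
    print(name, "Lambda ~", Lam, "lambda =", lam)
```

Exact rational arithmetic is used so that a vanishing component of the null vector, which signals a trivial solution, is detected without floating-point tolerance.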

We are interested in those states in which no species takes all the molecules, because the actual simulations are performed by using initial states that exclude the winner-takes-all states. The PGF without the winner-takes-all states is represented by a linear summation of winner-takes-all states _{1}+⋯+_{M+1} = 1. The

Therefore, we obtain the desired PGF without the winner-takes-all states (PGFwoWTAS);

If we suppose the ergodicity

The above equations indicate that λ_{i} means the concentration per total density ρ in the continuous limit _{i} should be the solution of the classical rate equation. We can also calculate the second-order moments

In the continuous limit _{i}λ_{j} and 0, respectively, that is, the concentrations become mutually independent without fluctuating variables.

We compare our formulas with simulation results that are obtained by applying the Gillespie algorithm (Gillespie,

Rate constants of the three-species example networks: k_{123} = 1, k_{132} = 1, k_{213} = 1, k_{231} = 1, k_{312} = 1, k_{321} = 0; k_{123} = 0, k_{132} = 0, k_{213} = 1, k_{231} = 1, k_{312} = 2, k_{321} = 1; and k_{123} = 1997/3, k_{132} = 1000/3, k_{231} = 1, k_{321} = 1, k_{213} = 0, k_{312} = 0.

The marginal distributions are shown in Figure

_{i} = _{i}/^{8}, the number of reactions for transient exclusion: 10^{7}, and the initial value (_{1}(0), _{2}(0), _{3}(0)) is randomly selected from _{1} = 2/11, λ_{2} = 3/11, and λ_{3} = 6/11.

_{i}(^{8}. The initial values are randomly selected from ^{7}. Lines in each figure represent the theoretical expressions, Equations (45a) and (45c), for λ_{1} = 2/11, λ_{2} = 3/11, and λ_{3} = 6/11. One can see that the rank of concentrations is conserved but the rank of variances is exchanged between

Note that if the λ-condition (Equation 33) does not have a non-trivial solution like Equation (39), the expression for the PGF (Equation 42) cannot be applied. Such special cases are treated in the following section.
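The kind of stochastic simulation used for the comparison above can be sketched as follows. This is a minimal direct-method Gillespie loop for catalytic reactions only, assuming a propensity k_{ijk} n_i n_j / V for each reaction (substrate i, catalyst j, product k) and zero-based species indices; the function name `gillespie_catalytic`, its parameters, and the example values are hypothetical and are not taken from the authors' code.

```python
import random


def gillespie_catalytic(k, n0, V, n_reactions, seed=0):
    """Direct-method Gillespie simulation of two-body catalytic reactions (sketch).

    k maps zero-based (substrate, catalyst, product) triples to rate constants.
    Returns the final copy numbers and the time-averaged fractions n_i / N.
    """
    rng = random.Random(seed)
    n = list(n0)
    N = sum(n)
    reactions = [(i, j, m, rate) for (i, j, m), rate in k.items() if rate > 0]
    t = 0.0
    weighted = [0.0] * len(n)  # time integrals of each n_i

    for _ in range(n_reactions):
        props = [rate * n[i] * n[j] / V for (i, j, m, rate) in reactions]
        total = sum(props)
        if total == 0:
            break  # no reaction can fire from this state

        # Waiting time to the next reaction; the state is constant meanwhile.
        dt = rng.expovariate(total)
        for s in range(len(n)):
            weighted[s] += n[s] * dt
        t += dt

        # Choose a reaction with probability proportional to its propensity.
        r = rng.random() * total
        acc = 0.0
        chosen = None
        for idx, a in enumerate(props):
            if a > 0:
                chosen = idx  # last feasible reaction doubles as a rounding fallback
                acc += a
                if r < acc:
                    break

        i, j, m, _ = reactions[chosen]
        n[i] -= 1  # one substrate molecule is converted
        n[m] += 1  # into one product molecule; the catalyst is unchanged

    averages = [w / (t * N) for w in weighted] if t > 0 else [x / N for x in n]
    return n, averages


# Example network with lambda = (2/11, 3/11, 6/11) in the text, zero-based indices.
example_k = {(0, 1, 2): 1.0, (0, 2, 1): 1.0, (1, 0, 2): 1.0, (1, 2, 0): 1.0, (2, 0, 1): 1.0}
final_state, fractions_avg = gillespie_catalytic(example_k, [4, 3, 3], V=10.0,
                                                 n_reactions=20000, seed=1)
print(final_state, fractions_avg)
```

Because every reaction leaves both a catalyst and a product molecule in the system, a run started away from the winner-takes-all states never enters them, which matches the restriction on initial states mentioned in the text. Non-catalytic reactions and the discarding of an initial transient are omitted here for brevity.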

The PGFwoWTAS (Equation 42) would be applicable if the catalytic reaction network was “entirely ergodic,” which means the following in this paper (it is reminiscent of the ω-limit set):

We use a specific three-species system to intuitively illustrate what Equation (48) means. In the case of the three-species system of Figure _{321} in the case of the network of Figure _{0} ∈ _{k} = 0 is allowed, and the second condition represents that at least one direction for approaching the boundary _{k} = 0 is allowed. Note that the condition (Equation 50) is no longer a sufficient condition for entire ergodicity in the case of four-species systems. In fact, in the following four-species system (Figure _{k}(0) > 0(∀_{1} = _{2} = 0, _{3} + _{4} =

_{321} since _{321} = 0. Note that the state point on the boundary (e.g., _{1} = 0) cannot move parallel to the boundary (in this case, the directions of _{213} and _{312}).

_{132} > 0, _{143} > 0, _{231} > 0, _{243} > 0,_{314} > 0, _{413} > 0, and others 0. _{k} = 4 (∃^{(2)}(1, 2), the directions of _{231}, _{241} as well as _{132}, _{142} are not allowed).

In the case of ^{(1)} and _{ijk} > 0 for all _{M = 4, N} (see Figure ^{(4)}(1, 2, 3, 4) is the interior of the regular tetrahedron; ^{(3)}(1, 2, 3), ^{(3)}(1, 2, 4), ^{(3)}(1, 3, 4), and ^{(3)}(2, 3, 4) are regular triangles that form the boundaries of ^{(4)}; and ^{(2)}(1, 2), ^{(2)}(1, 3), ^{(2)}(1, 4), ^{(2)}(2, 3), ^{(2)}(2, 4), and ^{(2)}(3, 4) are line segments that form the boundaries of ^{(3)}. Note that _{1}, _{2}) ≠ (_{1}, _{2}) is possible but ^{(l)} _{1}, ⋯ , _{l−1}}⊂{_{1}, ⋯ , _{l}}.

We could not obtain the steady-state solution of the general GFE (Equation 11) in the case of catalytic-noncatalytic mixed reactions (ε > 0), but we expect our PGF (Equation 42) to be a good approximation for mixed-reaction systems if ε is sufficiently small (ε ≪ min{_{ijk} > 0}). More specifically, we expect the PGF (Equation 42) to be robust against non-catalytic reactions if the catalytic reaction system constituting the mixed reaction system has an ergodic component spread across the entire state space (

Figure _{i} for the three-species system of Figures _{i} in Figure

_{i} (

The starting point of our analysis is the GFE (Equation 11), from which several useful formulas are derived, namely the PRE (Equation 15), the SME (Equation 18), the TESMs (Equations 19 and 20), the λ-condition (Equation 33), and the PGFwoWTAS (Equation 42). In this section, we demonstrate the effectiveness of these formulas by showing important applications for several catalytic reaction networks.

We show that the rank of concentrations is conserved even if the total number of molecules changes in catalytic reaction networks (excluding non-catalytic and auto-catalytic reactions).

Suppose the concentration of the _{1}, λ_{2} be the concentrations per total density in the continuous limit such that λ_{1} < λ_{2}. Because _{1} + λ_{2} < 1 must be satisfied. Therefore, the following evaluation holds:

Note that the rank of the variances of concentrations is generally not conserved when the total number of molecules changes. For example, let us consider the following three-species system (Figure _{i}], represented by Equations (45a) and (45c) with λ_{1} = 1/1000, λ_{2} = 1/3, and λ_{3} = 1997/3000. The rank of time-averaged concentrations is always conserved, but the rank of variances is exchanged at certain

_{1} = 1/1000, λ_{2} = 1/3, and λ_{3} = 1997/3000. It is clear that the rank of time-averaged concentrations is conserved.
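The behaviour described here can be checked numerically under a stated assumption. The sketch below assumes that the stationary law without winner-takes-all states is a multinomial distribution with parameters (N, λ) from which the states n_i = N are removed and the remainder renormalised, which is one reading of the "linear summation" description of the PGFwoWTAS; the moment formulas in the code follow from that assumption and are not copied from Equations (45a) and (45c). The function name `moments_without_wta` is hypothetical.

```python
from fractions import Fraction


def moments_without_wta(lam, N):
    """Mean and variance of n_i / N for a multinomial(N, lam) law with the
    winner-takes-all states (n_i = N) removed and the rest renormalised.

    Requires N >= 2 and at least two positive lam_i. Exact rational arithmetic.
    """
    lam = [Fraction(l) for l in lam]
    Z = 1 - sum(l ** N for l in lam)  # probability mass left after removing WTA states
    means, variances = [], []
    for l in lam:
        m1 = N * l * (1 - l ** (N - 1)) / Z                  # <n_i>
        m2f = N * (N - 1) * l ** 2 * (1 - l ** (N - 2)) / Z  # <n_i (n_i - 1)>
        mean = m1 / N
        var = (m2f + m1) / N ** 2 - mean ** 2
        means.append(mean)
        variances.append(var)
    return means, variances


lam_example = (Fraction(1, 1000), Fraction(1, 3), Fraction(1997, 3000))
for N in (2, 5, 20, 200):
    means, variances = moments_without_wta(lam_example, N)
    rank_mean = sorted(range(3), key=lambda s: means[s])
    rank_var = sorted(range(3), key=lambda s: variances[s])
    print(N, "rank of means:", rank_mean, "rank of variances:", rank_var)
```

Under this assumption the means are proportional to λ_i − λ_i^N, so their order follows the order of the λ_i whenever λ_1 + λ_2 < 1, whereas the variances of species 2 and 3 in this example swap order between small and large N.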

There exists an

We show this by taking the following limit in the PGFwoWTAS (Equation 42):
_{i}} are positive constants, which are determined from the network structure {_{ijk}} (it is explained later). Evidently, the following holds;

The PGFwoWTAS (Equation 42) has the following limiting expression:

The state corresponding to the PGF (Equation 60) is the connecting state to the winner-takes-all state (CStoWTAS). The stationary distribution corresponding to the CStoWTAS is immediately obtained:
_{M, N} (abbr.

Furthermore, the marginal distributions of the

where δ_{ij} is the Kronecker delta, δ_{ij} = 0 (if

Next, we derive the relation between the positive constants {κ_{i}} and the network structure {_{ijk}}. The λ-condition (Equation 33) can be reduced to conditions for {κ_{i}} (the κ-condition) by taking the limit λ_{1} → 1 as follows. Divide the λ-condition (Equation 33) into two groups, of which one group is the case of 2 ≤ _{i} = κ_{i}(1− λ _{1}), _{1}. Then, taking the limit λ_{1} → 1 in Equation (66a),

The condition (Equation 67a) expresses that the first species cannot be a substrate, and the condition (Equation 67b) represents the desired κ-condition. Note that the CStoWTAS must be a limiting state corresponding to the PGFwoWTAS [see Equation (60)]. Therefore, the network structure {_{ijk}} must have definite {λ_{i}}.

For example, let us consider the following three-species system (Figure _{2} = 2/3 and κ_{3} = 1/3. Obviously, this system is not weakly reversible, which means that the theorems (Theorem 4.1 and 4.2) in the previous study (Anderson et al., _{1}, λ_{2}, and λ_{3} are definite from Equation (39) with Λ_{1} = 0, Λ_{2} = 2, and Λ_{3} = 4. As shown in Figure _{i}] represented by Equation (65). In this case also, the rank conservation law of concentrations holds.

_{2} = 2/3 and κ_{3} = 1/3.

The 2mTESM (Equation 20) becomes the closed equation of the second-order moments 〈_{l}_{m}〉 if the first-order moments 〈_{i}〉 are substituted by the second-order moments according to the PRE (Equation 15). In this subsection, we consider catalytic-non-catalytic mixed reaction systems of

We first focus on the second formula in the 2mTESMs (Equation 20). It can be seen that each variance of time-averaged concentration _{i} becomes larger as its time-averaged concentration approaches
_{1}_{3}〉 = ⋯ and so on]. By the SME (Equation 18), the concentrations become
_{1} = 2/11, λ_{2} = 3/11, and λ_{3} = 6/11. The non-catalytic reaction rate constant ε seems to be a singular perturbation against the second-order moments (not the concentrations).

_{1} = 2/11, λ_{2} = 3/11, λ_{3} = 6/11,

The framework we developed in this paper applies to non-autocatalytic reaction networks. However, it may also be applicable to autocatalytic reaction networks if they can be converted into non-autocatalytic networks. Here, we show several examples of such conversions using a minimal autocatalytic reaction network “2TK model” (Ohkubo et al.,

The 2TK model consists of only two species, and includes both autocatalytic reactions (rate const. _{A} = _{1}+_{3}+_{5}/2 and _{B} = _{2}+_{4}+_{5}/2 in the five-component model is similar to that of the 2TK model (compare with Figures 1A,B in Saito and Kaneko, _{1}, λ_{2}, ⋯ , λ_{5}), which immediately correspond to the stationary states of catalytic reactions in the five-component model by using Equation (43):

where each case (i-vi) corresponds to (i) _{6} = _{5} = _{1} = 2_{2} = 2_{3} = 2_{4} = 2_{i} = 1 (others 0) of catalytic reactions in the five-component model, which are sometimes caused by non-catalytic reactions. Furthermore, the marginal distribution of the species _{A}(_{1}(_{3}(_{5}(_{1}(_{3}(_{5}(

_{142} = _{153} = _{231} = _{254} = _{351} = _{452} = 1 (others 0). If one regards the species 1, 3, and half of 5 as the species _{A} = _{1}+_{3}+_{5}/2 and _{B} = _{2}+_{4}+_{5}/2 is similar to that of the 2TK model.

_{1}+_{3}+_{5}/2)/_{i}(^{8}, the number of reactions for transient exclusion: 10^{7}, and the initial value is randomly selected from

Other non-autocatalytic reaction networks duplicating the 2TK model are shown in Figure _{i} = 1 (others 0) of catalytic reactions in the four-component model, which are sometimes caused by non-catalytic reactions. We also confirmed that the results are not changed even in the other four-component model of Figure

_{123} = _{142} = _{214} = _{231} = _{321} = _{412} = 1 (others 0), and (B) _{142} = _{143} = _{231} = _{234} = _{341} = _{432} = 1 (others 0)_{A} = _{1}+_{3} and _{B} = _{2}+_{4} is almost equivalent to that of the 2TK model.

_{1}+_{3})/_{i}(

The framework we presented in this paper facilitates the prediction of the effect of the small-number issue on the concentration of each species in catalytic reaction networks. This can be described in an extreme manner by comparing the concentrations between the continuous limit (

One might think that the analysis presented in this paper can be straightforwardly extended to the case including autocatalytic reactions [in fact, the CME (Equation 7) and GFE (Equation 11) themselves hold even in the case including autocatalytic reactions]. However, if autocatalytic reactions are included (i.e., the case _{ikk} > 0 is allowed), we cannot consider catalytic reactions and non-catalytic reactions to be completely separate. The reason is that, in the case including autocatalytic reactions, the absence of non-catalytic reactions generally implies winner-takes-all steady states. In general, solving the CME (or GFE) of a catalytic-noncatalytic mixed reaction system is more difficult than solving that of a system with catalytic reactions only. The proposed strategy, i.e., non-autocatalyzation conversion, is one of our ideas for addressing this problem.

The formulas obtained in the present work are specific and satisfactorily simple. Therefore, our theory has the potential to be developed into a general theory for catalytic reaction networks. On the other hand, there exists a mathematical theory for a certain class of catalytic reaction networks that are “weakly reversible” and “deficiency zero” (Anderson et al.,

Actual biochemical pathways in the cell involve thousands of chemical species, and their chemical properties vary. Our theoretical framework is general and extensible to such complex reaction networks, if they can be represented by CMEs such as Equation (7). As our current model consists of simple two-body catalytic reactions, it is difficult to point out examples in actual biological systems that correspond exactly to our model. Biochemical reactions in reality may involve a number of intermediates. There are also autocatalytic processes such as autophosphorylation, and replication of templates such as DNA, in which the catalyst or template species is also a substrate or a product. Our framework is applicable to many such cases involving network conversion, as shown for simple autocatalytic cases.

Nevertheless, the reaction kinetics of each enzyme is not always simple. Enzymes are complex macromolecules and their reaction cycles may depend on their conformational states. Therefore, the prediction of biological phenomena caused by small-number effects in real biochemical reactions would entail further analytical challenges for catalytic reaction networks including arbitrary higher-order mixed reactions (rather than first- and second-order reactions only) or internal dynamics of the enzymes (as modeled and analyzed in Togashi and Casagrande,

Throughout this work, our primary intention is to approach small-number issues in biological systems. One might wonder how commonly these small-number issues arise, and how important they are, in living cells. Recently, absolute quantification of various proteins and mRNAs in the cell has become possible, and the integration of experimental results (e.g., the construction of a database Milo et al.,

Although eukaryotic cells are much larger than bacteria, they have complex membrane structures and cytoskeletons inside, and the small-number issues can be particularly significant in compartments or bottlenecks (e.g., if we consider the volume of a synaptic vesicle represented by a sphere 40 nm in diameter, then 1 molecule corresponds to ca. 50 μmol/L). Rare proteins are also involved in physiologically important signaling pathways in eukaryotes. In the Wnt signaling pathway, for example, the concentration of axin is reported to be 20 pmol/L in Xenopus eggs (Lee et al., ^{2} molecules (e.g., Ste5) exist in a yeast cell (Thomson et al.,
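The vesicle estimate quoted above can be reproduced with a short calculation. The sketch below assumes an ideal sphere of 40 nm diameter and uses the SI value of the Avogadro constant; the function name `single_molecule_concentration` is hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole (SI defining value)


def single_molecule_concentration(diameter_m):
    """Molar concentration (mol/L) of one molecule in a sphere of the given diameter."""
    radius_m = diameter_m / 2
    volume_litres = (4.0 / 3.0) * math.pi * radius_m ** 3 * 1000.0  # 1 m^3 = 1000 L
    return 1.0 / (AVOGADRO * volume_litres)


concentration = single_molecule_concentration(40e-9)
print(f"{concentration * 1e6:.1f} umol/L")
```

The result is close to 50 μmol/L, in line with the figure given in the text, which illustrates why a single molecule in such a compartment is far from negligible on a concentration scale.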

In the presented framework, we mainly focused on the steady-state solutions of GFE. Of course, temporal courses are biologically crucial in some cases. A well-studied example is oscillatory behavior in circadian clocks (Bell-Pedersen et al.,

Note that a chemical “species” here can also be interpreted as a specific state of a molecule; e.g., we can consider proteins or genes, with and without modification, as separate species. Moreover, a similar interpretation is also applicable to ecology and ethology (Biancalani et al.,

MN and YT conceived the research. MN performed the analysis and simulations. Both authors discussed the results and wrote the paper.

This work was supported by the Ministry of Education, Culture, Sports, Science, and Technology, Japan (KAKENHI 23115007 “Spying minority in biological phenomena”), and Japan Agency for Medical Research and Development (Platform for Dynamic Approaches to Living System).

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

We would like to thank Dr. Nen Saito for valuable discussion and useful comments.