Edited by: Christian Johannes Cyron, Hamburg University of Technology, Germany

Reviewed by: Alireza Yazdani, Brown University, United States; Ercan Gürses, Middle East Technical University, Turkey

This article was submitted to Computational Materials Science, a section of the journal Frontiers in Materials

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

A multi-fidelity surrogate model for highly nonlinear multiscale problems is proposed. It is based on the introduction of two different surrogate models and an adaptive on-the-fly switching between them. The two concurrent surrogates are built incrementally, starting from a moderate set of evaluations of the full order model. From these, a reduced order model (ROM) is generated. Using a hybrid ROM-preconditioned FE solver, additional effective stress-strain data is simulated, while the number of samples is kept at a moderate level by using a dedicated and physics-guided sampling technique. Machine learning (ML) is subsequently used to build the second surrogate by means of artificial neural networks (ANN). Different ANN architectures are explored and the features used as inputs of the ANN are fine-tuned in order to improve the overall quality of the ML model. Additional ML surrogates for the stress errors are generated. To this end, conservative design guidelines for error surrogates are presented by adapting the loss functions of the ANN training in pure regression or pure classification settings. The error surrogates can be used as quality indicators in order to adaptively select the appropriate (i.e., efficient yet accurate) surrogate. Two strategies for the on-the-fly switching are investigated and a practicable and robust algorithm is proposed that eliminates relevant technical difficulties attributed to model switching. The provided algorithms and ANN design guidelines can easily be adopted for different problem settings and, thereby, they enable generalization of the used machine learning techniques for a wide range of applications. The resulting hybrid surrogate is employed in challenging multilevel FE simulations for a three-phase composite with pseudo-plastic micro-constituents. Numerical examples highlight the performance of the proposed approach.

In computer-assisted materials design and in the simulation of complex materials with rich microstructure, major challenges remain to be solved despite the outstanding advances made in recent years. For example, the discretization of all microstructural features in a monolithic finite element (FE) simulation is unfeasible due to the various length scales involved, which range from micrometers up to meters. This would lead to a prohibitive complexity of the resulting overall model. By accounting for a separation of length scales, the FE^{2} ansatz (Feyel,

Due to novel improvements in machine learning and computational resources, a zoo of data-driven methods comprising, e.g., kernel methods, principal component analysis, and artificial neural networks, has gained immense momentum over the last years. The successful implementation of these techniques in materials research is an active field. For instance, in Chupakhin et al. (

While data-driven approaches have their appeal, the structure of the underlying physical problem can be accounted for only in parts. For instance, established balance laws and thermodynamic principles are hard to incorporate in the aforementioned methods. Reduced order models for the microscopic problem offer an advantageous compromise between physics-informed modeling and computational efficiency. Purely data-driven surrogates lack accuracy (i) if the amount of training data is limited, (ii) if the validity domain is left, or (iii) if the error of the surrogate with respect to the reference solution is to be estimated. In these scenarios, reduced order models following physical principles offer, in general, better accuracy and robustness. For example, in Fritzen and Leuschner (

In mechanical multiscale FE simulations, the present work aims at the adaptive combination of the physics-informed reduced order model (ROM) of Fritzen and Kunc (

The manuscript is organized as follows: In section 2, two concurrent surrogate models for the QoI obtained by reduced order modeling and from purely data-driven ANNs are described. The twoscale mechanical problem is introduced and the challenges in the goal-oriented error estimation of derived quantities of interest remaining in nonlinear reduced order modeling are detailed. Then, the data generation for the training of the ANNs is illustrated, followed by the guidelines for the material law and error approximation. At the end of the section, adaptive twoscale simulation strategies including on-the-fly model switching are presented. Section 3 offers numerical examples for a three-phase pseudo-plastic material: The ANN is used for the direct surrogation of the QoI. This is adaptively complemented by a more robust and reliable reduced order model based on the concept of quality indicators. Multiscale FE simulations comparing the different multiscale simulation techniques are presented. The manuscript ends with a concluding summary of the results in section 4.

The simulation of microstructured solids with a sufficient separation of length scales is investigated. More precisely, a macroscopic domain

and, for each macroscopic point

Here

The two BVPs are strongly coupled since the solution

A straight-forward yet computationally costly approach to solving the twoscale problem is given in terms of the FE^{2} method (Feyel, ^{2}.

Twoscale mechanical problem in the context of FE^{2} for a material with a microstructure composed of three material phases: at every integration point of the macroscopic problem

Given the massive computational demands of the FE^{2} technique and the limited availability of computational resources, the use of nowadays established reduced order models (ROM), in order to replace the costly microscopic BVP evaluations, has become an accepted alternative for dissipative and pseudo-plastic hyperelastic materials (Radermacher and Reese,

where

Following Fritzen and Kunc (^{N} solving

While the effective stress is obtained from simple volume averaging of

which follows from straight-forward linearization of (6). The accuracy of the ROM depends on the quality and amount of the snapshots and of the reduced dimension

For the microscopic BVP, using the ROM (or any other approximation of the FOM) naturally introduces an error into the solution of the problem, and into the quantity of interest (QoI). In this work the latter is the effective stress. Hence, in order to enable error control for the macroscopic boundary value problem, it is crucial to estimate the error in the QoI, see, e.g., Larsson and Runesson (^{F}(^{RN}(^{F} and ^{RN} are the solutions to the microscopic problem (2) using the FOM and

where _{e} denotes the nodal values of the fully resolved error. The FOM and

while the corresponding error is addressed as

Now, consider the corresponding FOM residual equation analogous to (6) and the error in the QoI given in (11) in terms of the error in the solution defined in (8). Through linearization, we obtain the error equation and the linearization of the macroscopic stress error

respectively. Here ^{F},

Finally, (12) and (13) can be combined to yield the result

We note that the estimator (14) has, in particular, the following properties: (i) it is restricted to estimating the linearized error contribution, (ii) it requires the assembly of the entire FOM residual and Jacobian, and (iii) it requires the solution of the dual problem using the FOM to formally hold. Even if the linearization error is negligible, the high computational cost involved in assembling the full (FOM) Jacobian and residual of the problem makes this technique unattractive for use in conjunction with highly efficient ROM approximations. Possible approximations of (13) pertain to hierarchical approximations. One could, for instance, solve the dual problem using an enriched ROM, rather than the FOM. However, designing a robust hierarchical scheme requires means of guaranteeing that the enriched basis is sufficient. In view of the discussion above, we shall henceforth consider alternative methods for estimating (and controlling) the error in the macroscopic stress from each microscopic problem.

The present work is concerned with materials based on state dependent models for, e.g., pseudo-plasticity. For such material models, see, e.g., Kunc and Fritzen (_{d} almost uniformly distributed unit vectors / directions ^{(i)} ∈ ℝ^{6} (_{d}) are generated. Samples along the generated directions are considered with an exponentially growing step width from the origin. The primal strain dataset

with the primal strain norm discretization D_{r} and set of directions D_{d}. The definition (15) corresponds to a tensor decomposition into direction and amplitude. For many materials the volume changes are rather small compared to isochoric deformations. This effect is particularly pronounced for (pseudo-) plastic materials. In order to sample the strain space in a problem specific manner, a rescaling of the strains defined in (15) may be convenient. The present work solely rescales the spherical part (sph) of each primal strain (i.e., the dilatation), while the deviatoric part (dev) remains unchanged. The actual strain dataset is described by

where _{ε}) is given by the product of number of the directions #(D_{d}) and the number of amplitudes per direction #(D_{r}).
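The sampling described above can be illustrated by a minimal sketch. It is not the authors' implementation: the function name `sample_strains`, the rescaling factor `gamma` and its value, the number of directions, and the use of normalized Gaussian vectors in place of the almost uniformly distributed directions are all assumptions made for illustration. The strain is assumed to be stored as a 6-vector whose first three entries are the normal components.

```python
import numpy as np

def sample_strains(n_dir, amplitudes, gamma, seed=0):
    """Return strain samples of shape (n_dir * len(amplitudes), 6)."""
    rng = np.random.default_rng(seed)
    # Assumption: normalized Gaussian vectors stand in for the almost
    # uniformly distributed unit directions in R^6 used in the text.
    directions = rng.standard_normal((n_dir, 6))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    amplitudes = np.asarray(amplitudes, dtype=float)
    # Primal dataset: every direction combined with every amplitude.
    primal = (amplitudes[None, :, None] * directions[:, None, :]).reshape(-1, 6)
    # Assumption: the first three entries are the normal strains, so the
    # spherical part is their mean placed on those three entries.
    identity = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
    sph = np.outer(primal @ identity / 3.0, identity)
    dev = primal - sph
    # Only the spherical part is rescaled; the deviatoric part is unchanged.
    return gamma * sph + dev

# Amplitudes as listed for D_r in section 3; gamma = 0.1 is a placeholder.
D_r = [0.0005, 0.002, 0.0035, 0.005, 0.0075, 0.01, 0.015, 0.025, 0.04]
strains = sample_strains(n_dir=100, amplitudes=D_r, gamma=0.1)
print(strains.shape)  # (900, 6)
```

The number of samples equals the number of directions times the number of amplitudes, in line with the counting stated above.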

For the training of the artificial neural networks (ANNs), training (T), validation (V), and random (Monte Carlo - MC) datasets, referred to as

Technically, the process of generating the data samples is challenging: the FOM and the ROM must be evaluated thousands of times in order to obtain reliable data. Each sample consists of an effective strain

For the successful training of ANNs the normalization of the input and output data and the design of appropriate inputs (usually referred to as features) through linear or nonlinear transformations is essential. Compared to image data and convolutional neural networks, which usually take advantage of the intrinsic connection of image data and convolution, the present input data (strain data) is low-dimensional and necessarily requires sensible mechanical guidance during feature design. From a pure data-driven perspective, general batch normalization can greatly improve the prediction quality of a network. But in the present problem setting the input and output data have a clear physical nature. Therefore, based on mechanical reasoning, the consideration of the dependency of the material law on the spherical (

Additionally, the deviatoric part of the strain can be split into its norm and direction

After either of these transformations, a corresponding normalization is performed in order to prepare the strain features for the subsequent evaluation through the ANN: For ^{sd1}, each component of the vector ^{sd1}^{sd2}, the first component (i.e., the volumetric strain) is scaled according to the standard procedure while the deviatoric strain amplitude is divided by its peak value and the deviatoric direction remains unchanged. In the following the shifted and scaled inputs are referred to as ^{[0]} ∈ ℝ^{D},

In the present work, feedforward neural networks are used. This choice within the plethora of available artificial neural networks is driven by the fact that a function is to be calibrated that depends exclusively on the current state ^{[l]} neurons the inputs ^{[l−1]} ∈ ℝ^{n}^{[l−1]} and outputs ^{[l]} ∈ ℝ^{n}^{[l]} are related by weights ^{[l]} and activation functions ^{[l]} via the recursion

complemented by ^{[0]} =

the identity function (Id)

the rectified linear unit (RELU)

the softplus function (SP)

and the hyperbolic tangent (TANH)

The identity function (Id) passes its input unaltered, such that a linear combination of the activation functions of the previous layer is returned. This is particularly desired in the last layer, in order to obtain an optimized linear combination of nonlinear functions as final output ^{[L]} of the ANN. The evaluation of a single input strain through the whole ANN is addressed by the composition of all layers
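The layer recursion and the four activation functions can be sketched as follows. This is an illustrative sketch, not the authors' code: the affine form `W @ a + b` per layer, the random initialization, the helper names, and the input dimension of 6 are assumptions. The example architecture mirrors the notation {5 × 64(SP)−6(Id)}, read here as five softplus layers of 64 neurons followed by six identity outputs.

```python
import numpy as np

ACTIVATIONS = {
    "Id": lambda x: x,                      # identity
    "RELU": lambda x: np.maximum(x, 0.0),   # rectified linear unit
    "SP": lambda x: np.logaddexp(0.0, x),   # softplus log(1 + exp(x)), overflow-safe
    "TANH": np.tanh,                        # hyperbolic tangent
}

def init_network(layer_sizes, activations, seed=0):
    """Random weights and zero biases (placeholder initialization)."""
    rng = np.random.default_rng(seed)
    layers = []
    for n_in, n_out, act in zip(layer_sizes[:-1], layer_sizes[1:], activations):
        W = rng.standard_normal((n_out, n_in)) / np.sqrt(n_in)
        b = np.zeros(n_out)
        layers.append((W, b, act))
    return layers

def evaluate(layers, a0):
    """Composition of all layers applied to the input features a0."""
    a = np.asarray(a0, dtype=float)
    for W, b, act in layers:
        a = ACTIVATIONS[act](W @ a + b)
    return a

# 6 input features -> 5 hidden softplus layers with 64 neurons -> 6 outputs
sizes = [6] + [64] * 5 + [6]
acts = ["SP"] * 5 + ["Id"]
net = init_network(sizes, acts)
output = evaluate(net, np.zeros(6))
print(output.shape)  # (6,)
```

With the identity in the last layer, the output is a linear combination of the nonlinear activations of the preceding layer, as described above.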

The training of the ANN requires an objective function that provides an error respecting the nature of the outputs. In the context of ANNs, the objective function is referred to as loss function. Similar to the inputs, the outputs, the effective stress of the FOM

Here, the same transformations ^{sd1} and ^{sd2} as for the inputs are considered for _{σ} during architecture testing. The evaluation of the ANN is analogously abbreviated as

In this work, the mean squared error (MSE) is chosen as the loss function

The MSE (23) is then optimized with respect to the ANN parameters, i.e., the weights and biases are identified starting from a random initialization. The ANN output is then obtained through an inverse transformation

It should be remarked that, from the perspective of physics-informed artificial neural networks, one may also consider the incorporation of the norm of the non-symmetric part of the gradient

The quality of the ANN during training is checked, not with respect to the training dataset, but with the validation dataset

In addition to that, the mean coefficient of determination

The coefficient of determination is bounded by one which is attained if and only if the surrogate coincides with the reference for all queries.
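The two quality measures can be sketched as below. The precise definitions used in the paper are not reproduced here. The sketch assumes that the mean coefficient of determination is the average of the standard component-wise value 1 − SS_res/SS_tot, and that the MRNE is the mean of the error norm divided by the reference norm per sample. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def mean_r2(reference, prediction):
    """Component-wise coefficient of determination, averaged over components."""
    reference = np.asarray(reference, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    ss_res = np.sum((reference - prediction) ** 2, axis=0)
    ss_tot = np.sum((reference - reference.mean(axis=0)) ** 2, axis=0)
    return float(np.mean(1.0 - ss_res / ss_tot))

def mean_relative_norm_error(reference, prediction):
    """Assumed MRNE: mean over samples of ||prediction - reference|| / ||reference||."""
    reference = np.asarray(reference, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    err = np.linalg.norm(prediction - reference, axis=1)
    return float(np.mean(err / np.linalg.norm(reference, axis=1)))

# Synthetic stand-in for reference stresses and a slightly perturbed surrogate
rng = np.random.default_rng(0)
sigma_ref = rng.standard_normal((200, 6))
sigma_sur = sigma_ref + 0.01 * rng.standard_normal((200, 6))
print(mean_r2(sigma_ref, sigma_ref))  # 1.0
print(mean_r2(sigma_ref, sigma_sur) < 1.0)  # True
```

A perfect surrogate attains the upper bound of one, and any deviation from the reference lowers the value.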

In this section, we are interested in the calibration of ANNs taking strain data as input and delivering quantitative and qualitative error estimates for the stress. On the one hand, for a given strain, it might be of interest to predict the error of the stress surrogate against the FOM stress. On the other hand, it might not be of particular interest to know the exact error value, but rather to know if the error is acceptable, i.e., if it is smaller than a prescribed tolerance. The quantitative error prediction leads to a classical regression problem, whereas the binarized response gives rise to an ordinary classification problem.

In the error regression problem, for a given model

For the error classification problem, we consider the indicator function

with prescribed absolute and relative tolerances τ_{a} and τ_{r}, respectively. The outcome of χ^{M} is particularly useful in order to decide on the subsequent treatment: For χ^{M} = 1, the error is considered acceptable and the surrogate can be used, while χ^{M} = 0 should trigger an adaptive refinement. For instance, the classifier χ^{M} can decide if the stress surrogate

For error regression and classification, the fully connected feed forward ANNs as described by (19) and the same activation functions as in section 2.3.2.2 are used. For the binary classification the final ANN layer is regarded as a log-probability with a single neuron. This setup is usually referred to as

One of the desired properties, considering possible safety requirements in the error regression and classification, is to obtain if not accurate, then at least conservative results. In order to achieve a conservative behavior, for the error regression problem we consider the function

which changes the slope for negative input values to α. The function ϕ_{α} can be used to penalize underestimation of the error (for α > 1) when applied to the scalar argument of the MSE for the true error ^{M} (representing the absolute error ^{M}

The MSE_{α} is considered as the loss function for error regression, where α acts as a penalty parameter. The corresponding ^{2} value and the relative conservative amount (RCA) over the validation dataset

are used to assess the quality of the prediction.
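A minimal sketch of the penalized loss follows. It assumes that ϕ_α is applied to the difference between predicted and true error before squaring, so that underestimation (a negative difference) is scaled by α. The RCA is assumed to be the fraction of samples whose predicted error is at least the true error. Function names and sample values are hypothetical.

```python
import numpy as np

def phi_alpha(x, alpha):
    """Identity for x >= 0, slope alpha for x < 0."""
    x = np.asarray(x, dtype=float)
    return np.where(x < 0.0, alpha * x, x)

def mse_alpha(e_true, e_pred, alpha):
    """Penalized MSE: underestimated errors are weighted by alpha."""
    diff = np.asarray(e_pred, dtype=float) - np.asarray(e_true, dtype=float)
    return float(np.mean(phi_alpha(diff, alpha) ** 2))

def relative_conservative_amount(e_true, e_pred):
    """Assumed RCA: fraction of samples with a conservative prediction."""
    return float(np.mean(np.asarray(e_pred) >= np.asarray(e_true)))

e_true = np.array([1.0, 2.0, 3.0, 4.0])
e_pred = np.array([1.5, 1.5, 3.5, 3.0])
print(mse_alpha(e_true, e_pred, alpha=1.0))          # 0.4375
print(mse_alpha(e_true, e_pred, alpha=3.0))          # 2.9375
print(relative_conservative_amount(e_true, e_pred))  # 0.5
```

For α = 1 the ordinary MSE is recovered, while α > 1 makes underestimation more expensive during training and thereby pushes the calibrated ANN toward conservative predictions.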

For the error classification of model M ∈ {R

The loss function for classification chosen in this work is the weighted binary cross entropy

Herein, false positive predictions dominate the cross entropy for ^{M} and in the surrogate

Further, the accuracy within the bin

The reader should note that ACC_{0} is more relevant when seeking conservative estimates. The overall classification is robust only if both ACC_{0} and ACC_{1} are close to unity; for a seemingly good ACC (e.g., around 0.98), the critical ACC_{0} could still be inappropriate. This effect is particularly important if the surrogate has only few outliers requiring further processing.
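The weighted cross entropy and the bin-wise accuracies can be sketched as follows. The sketch assumes that the network output has already been converted into a probability `p` of the positive outcome, that the weight `w0` multiplies the term of the negative outcome (χ = 0), and that predictions are binarized at a threshold of 0.5. These choices and all names are illustrative assumptions.

```python
import numpy as np

def weighted_bce(chi, p, w0, eps=1e-12):
    """Binary cross entropy with weight w0 on the negative outcome (chi = 0)."""
    chi = np.asarray(chi, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    loss = -(chi * np.log(p) + w0 * (1.0 - chi) * np.log(1.0 - p))
    return float(np.mean(loss))

def bin_accuracies(chi, p, threshold=0.5):
    """Return (ACC, ACC_0, ACC_1); both bins are assumed to be non-empty."""
    chi = np.asarray(chi).astype(int)
    pred = (np.asarray(p) >= threshold).astype(int)
    acc = float(np.mean(pred == chi))
    acc0 = float(np.mean(pred[chi == 0] == 0))
    acc1 = float(np.mean(pred[chi == 1] == 1))
    return acc, acc0, acc1

chi = np.array([1, 1, 1, 1, 0, 0])
p = np.array([0.9, 0.8, 0.7, 0.6, 0.7, 0.2])
acc, acc0, acc1 = bin_accuracies(chi, p)
print(acc0, acc1)  # 0.5 1.0
```

In this toy example five of six samples are classified correctly, yet only half of the critical negative outcomes are detected, which is the situation the remark above warns about.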

In order to build a twoscale simulation model relying on the finite element method on the larger scale, the material model must be replaced by the homogenized response of the heterogeneous solid. In sections 2.1.2 and 2.3.2 the use of ROM and ANN serving as surrogates for the effective stress tensor and the effective tangent stiffness are described in detail. Both surrogates can be combined by introducing an indicator function

First, a simple ansatz for χ is chosen by setting χ to one if the current strain at the macroscopic position

Here, _{W} denotes a weighted norm that transforms elements of D_{ε} defined via (16) back into normalized directions:

The use of the ROM outside of the training domain is motivated by its grounding in energy minimization, i.e., it preserves the key physical characteristics of the full order model while being restricted to a relevant subspace of the solution manifold.

A second indicator can be obtained by evaluating the accuracy of the ANN. Therefore, a binary classifier

At first, the concept of the indicator function χ marking the confidence region for the ANN and employing the ROM elsewhere sounds straight-forward. However, this simple approach does not work in practice as the two concurrent surrogates do not provide continuous approximations of the stresses. This can be illustrated by letting

Macroscopic FE boundary value problem ^{K} checks if

In the context of twoscale simulations, the problem is not the accuracy/fidelity of the training data of the microscopic problem, but (1) the usage of a surrogate outside of its training range (based on χ^{K} for the ANN stress surrogate) and (2) the point-wise quality of the surrogate with respect to prescribed tolerances (

The first approach consists of a staggered procedure, where the ANN is used as the only stress surrogate in a first run of the twoscale simulation (see Algorithm 1). Thereby, a first overall response is gathered. This is followed by a second run, in which all integration points that have seen a zero quality indicator during any of the load steps of the first run are forced to use the ROM surrogate. This set is then kept constant, i.e., switching from ANN to ROM is one-way. This procedure enables the use of the ANN solution as an initial guess for the subsequent hybrid run, which leads to low iteration counts and improved performance. During the second run, the difference of the ANN and the ROM can be evaluated to provide valuable post-processing data in order to better understand the quantitative impact of the model modifications, see also examples in section 3.3.2. Two major disadvantages of this approach are (i) the irreversibility of the ROM activation, which can lead to substantial computational costs, and (ii) the possible failure during the first run, if the ANN surrogate becomes non-convergent. The latter can, e.g., occur if the local magnitude of

Staggered hybrid ANN/ROM twoscale simulation algorithm.
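The selection step between the two runs of the staggered procedure can be sketched as follows. The function name and the array layout (one row per load step, one column per integration point) are assumptions for illustration. The actual FE, ANN and ROM evaluations are not part of the sketch.

```python
import numpy as np

def staggered_rom_set(chi_history):
    """Indices of integration points that must use the ROM in the second run.

    chi_history[k, i] is the quality indicator of integration point i in
    load step k of the first (ANN-only) run.
    """
    chi_history = np.asarray(chi_history)
    # A point is flagged if its indicator was zero in any load step.
    return np.flatnonzero(np.any(chi_history == 0, axis=0))

chi_history = [
    [1, 1, 1, 1],   # load step 1
    [1, 0, 1, 1],   # load step 2: point 1 flagged
    [1, 1, 1, 0],   # load step 3: point 3 flagged
]
print(staggered_rom_set(chi_history))  # [1 3]
```

Point 1 stays in the ROM set although its indicator returns to one in the last load step, which reflects the one-way switching of the staggered procedure.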

A second on-the-fly model selection procedure, solving both of the aforementioned issues, is described in Algorithm 2: It re-initializes the quality indicator in favor of the ANN at the beginning of each load increment. During the subsequent nonlinear Newton-Raphson iterations of the same increment, the indicator is updated in a monotonic way, i.e., switching from ANN to ROM is allowed but not vice versa (see line 11 in Algorithm 2). The computational efficiency can be improved by substituting only part of the equilibrium iteration by the ROM.

Adaptive on-the-fly ANN/ROM twoscale simulation algorithm.
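The bookkeeping of the quality indicator in the on-the-fly procedure can be sketched as below. This is not Algorithm 2 itself: the FE solver is replaced by a precomputed, hypothetical history of strain norms per Newton iteration, and the kinematic indicator is assumed to compare a strain norm against a training radius `r_max` (here the largest sampled amplitude, 0.04).

```python
import numpy as np

def kinematic_indicator(strain_norms, r_max):
    """1 where the strain norm lies inside the training range, else 0."""
    return (np.asarray(strain_norms) <= r_max).astype(int)

def adaptive_run(strain_norm_history, r_max):
    """Return the final quality indicator of every load increment.

    strain_norm_history[k][j] holds the strain norms of all integration
    points in Newton iteration j of load increment k (a stand-in for the
    quantities an FE solver would provide).
    """
    final_indicators = []
    for increment in strain_norm_history:
        # Re-initialize in favor of the ANN at the start of each increment.
        chi = np.ones(len(increment[0]), dtype=int)
        for norms in increment:
            # Monotonic update: ANN -> ROM is allowed, ROM -> ANN is not.
            chi = np.minimum(chi, kinematic_indicator(norms, r_max))
            # An FE code would now evaluate the ANN where chi == 1 and the
            # ROM where chi == 0, followed by the Newton update.
        final_indicators.append(chi.tolist())
    return final_indicators

history = [
    [[0.01, 0.02, 0.03], [0.01, 0.05, 0.03], [0.01, 0.039, 0.03]],
    [[0.01, 0.03, 0.03], [0.01, 0.03, 0.045]],
]
print(adaptive_run(history, r_max=0.04))  # [[1, 0, 1], [1, 1, 0]]
```

In the first increment the second point remains assigned to the ROM even though its strain norm drops below the threshold again, which prevents alternating model selection within an increment. In the second increment the same point starts with the ANN again.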

An artificial heterogeneous solid consisting of three phases is investigated. It consists of a laminate structure of two pseudo-plastic materials where the two layers share the same elastic parameters (E_{1} = E_{2} = 75 GPa, ν_{1} = ν_{2} = 0.3) but have different yield strength and hardening behavior: The first layer has a yield stress of 100 MPa and a linear hardening slope of 2,000 MPa, whereas the second layer has a yield stress of 115 MPa in the absence of hardening. The third phase is represented by a spherical inclusion that is centered on the interface of the two phases. The inclusion is assumed linear elastic with properties mimicking a ceramic inclusion made of SiC (

The strain space is sampled as described in section 2.3.1 for an effective strain amplitude discretization D_{r} = {0.0005, 0.002, 0.0035, 0.005, 0.0075, 0.01, 0.015, 0.025, 0.04}. The spherical / volumetric part of the primal strain dataset is rescaled with ^{6}.

An initial architecture testing phase is conducted. The activation functions and transformations illustrated in section 2.3.2 are considered, together with varying number of layers and neurons. The architecture test with ^{[l]} ∈ {16, 32, 64, 128}, ^{sd1} for the input as well as for the output. The transformation ^{sd2} did not show major advantages in the final objective function values.

Based on the initial architecture testing, the softplus function (SP) has been chosen for further investigations due to its monotonicity and differentiability, in view of the expected monotonic stress behavior and the need for tangent operators in future FE multiscale computations. In _{MC} and

ANNs for the effective stress surrogate with corresponding choice of input features, network architecture, intermediate transformation of stress data _{σ}, measures MRNE and _{MC} and

| Input features | Architecture | Stress transformation | MRNE (validation) | R^{2} (validation) | MRNE (MC) | R^{2} (MC) |
|---|---|---|---|---|---|---|
| sd1 | {5 × 128(SP)−6(Id)} | sd1 | 0.0189 | 0.9995 | 0.0183 | 0.9995 |
| sd1 | {5 × 64(SP)−6(Id)} | sd1 | 0.0204 | 0.9995 | 0.0200 | 0.9994 |
| sd2 | {5 × 64(SP)−6(Id)} | sd1 | 0.0241 | 0.9995 | 0.0241 | 0.9995 |
| sd1 | {5 × 16(SP)−6(Id)} | sd1 | 0.1578 | 0.9768 | 0.1564 | 0.9751 |

Effective strain load directions

| Direction | Components |
|---|---|
| dirT12 | (−0.10, −0.07, 0.15, 0.96, 0.11, 0.16) |
| dirT23 | (−0.03, −0.10, −0.05, 0.00, 0.08, 0.99) |
| dirTmixed | (−0.12, 0.03, −0.03, 0.48, −0.16, 0.85) |
| dirV12 | (−0.11, −0.15, 0.27, 0.89, −0.27, −0.18) |
| dirV23 | (−0.11, 0.02, −0.07, −0.12, −0.08, 0.98) |
| dirVmixed | (0.02, −0.31, 0.24, 0.04, −0.12, 0.91) |

Von Mises effective stress vs. effective strain norm for ANN1 for the 3 loading directions of the training dataset

For the error regression and classification, it is first necessary to gain an overview regarding the quality of the

In

Cumulative distribution function of the errors of ANN and ROMs of dimensions 16 to 96: distribution of ANE

We first demonstrate the error regression in terms solely of the ^{[l]} ∈ {16, 32, 64}, up to 10,000 epochs and whole batch training is performed. A selection of the trained ANNs is tabulated in

ANNs for error regression with corresponding choice of input feature, network architecture, penalty parameter α, corresponding quality indicators _{e} for the validation dataset and _{eMC} for the MC dataset.

| Input features | Architecture | α | R^{2} (validation) | RCA (validation) | R^{2} (MC) | RCA (MC) |
|---|---|---|---|---|---|---|
| sd1 | {4 × 64(SP)−1(Id)} | 3 | 0.9868 | 0.7924 | 0.9892 | 0.8082 |
| sd1 | {4 × 64(RELU)−1(Id)} | 1 | 0.9904 | 0.5116 | 0.9921 | 0.5371 |
| sd2 | {4 × 64(TANH)−1(Id)} | 3 | 0.9733 | 0.7948 | 0.9741 | 0.7995 |
| sd2 | {4 × 64(TANH)−1(Id)} | 1 | 0.9906 | 0.5305 | 0.9895 | 0.5206 |
| sd1 | {5 × 64(RELU)−1(Id)} | 3 | 0.8525 | 0.7323 | 0.8642 | 0.7227 |
| sd1 | {5 × 64(RELU)−1(Id)} | 1 | 0.9080 | 0.5104 | 0.9259 | 0.4957 |
| sd1 | {5 × 64(RELU)−1(Id)} | 3 | 0.7822 | 0.7562 | 0.8316 | 0.7574 |
| sd1 | {5 × 64(RELU)−1(Id)} | 1 | 0.8923 | 0.4884 | 0.9170 | 0.5002 |

The ANNs _{r}, as tabulated in

Correlation plots for the ROM16 ANE and corresponding error regression ANNs in the range [0,40] MPa:

The error classification is conducted for the absolute and relative tolerances τ_{a} = 2MPa and τ_{r} = 0.02, respectively. Architecture testing for ^{[l]} ∈ {16, 32, 64} for the hidden layers yield varying quality of results depending on the weight

is considered. If the number of negative outcomes in the training data _{0} > 1 holds. The consideration of _{0} in the binary cross entropy partly equilibrates the influence of the false positive (i.e., classified accurate but violating the tolerance) and false negative (i.e., classified inaccurate but within tolerance). But it may also overly bias the cross entropy during training, yielding poor accuracy in one bin. Therefore, _{0} in four evenly spaced steps during architecture testing. A selection of trained ANNs is tabulated in

ANNs for error classification for _{a} = 2MPa and τ_{r} = 0.02.

|  | Input features | Architecture | Weight | ACC | ACC_{0} | ACC_{1} |
|---|---|---|---|---|---|---|
| 2.9184 | sd1 | {5 × 64(TANH)−1(Id)} | 1 | 0.9282 | 0.9401 | 0.8905 |
| 1.6328 | sd1 | {5 × 64(TANH)−1(Id)} | 1.4746 | 0.9047 | 0.9077 | 0.8999 |
| 0.0047 | sd2 | {3 × 64(RELU)−1(Id)} | 0.0047 | 0.9483 | 0.7778 | 0.9495 |
| 0.1256 | sd2 | {5 × 64(RELU)−1(Id)} | 0.3442 | 0.8611 | 0.6721 | 0.8831 |

Classification ANNs with acceptable accuracy with respect to the validation dataset are obtained for the 16-, 24-, and even for the 32-dimensional ROM. These ANNs, denoted as

The presented hybrid methods introduced in Algorithms 1 and 2 are used in actual three-dimensional twoscale simulations. The results are compared to FE^{2R} simulations (in the spirit of Fritzen and Hodapp,

The macroscopic problem

First the staggered procedure introduced in Algorithm 1 is used. It is found that the first run that is relying on the ANN surrogate only achieves excellent runtimes when evaluating the ANN on graphics cards (here: one Nvidia GTX Titan Black), leading to runtimes of approximately 15 s for one evaluation of the surrogate at each of the 430,320 integration points of the finest mesh M3. It shall be noted that this includes a major execution overhead^{1}

A general dilemma of twoscale simulations that was observed for the FE^{2R} method by Fritzen and Hodapp (^{2}

In view of the number of quadrature points marked for use of the more reliable ROM, the adaptive scheme shows a steady increase when using the kinematic indicator χ^{K} marking points outside of the training range as not trustworthy for the ANN.

The crucial ingredient of the on-the-fly adaptive scheme, described in Algorithm 2, is the irreversible update of the quality indicator during each load increment. Thereby, alternating model selection is prevented. All three macroscopic models, M1, M2, and M3, converged without any issues. The resulting macroscopic tension force of all three is compared in ^{2R} curve for the ROM featuring 32 modes are also shown for M1. It is observed from ^{2R} and adaptive algorithm have nearly indistinguishable slopes (apart from a negligible shift), whereas the ANN model is slightly curved, i.e., it shows a qualitative difference from the reference solution, which becomes more pronounced with increasing load amplitude.

^{2R} with 32 modes.

The adaptive algorithm has the advantage that the number of macroscopic integration points that require evaluation of the ROM depends only on the current state. For the considered proportional loading, and when using the kinematic indicator χ^{K}, the relative amount of integration points grows monotonically with increasing load, cf.

^{K} at the end of the adaptive twoscale simulation using mesh M3.

In order to investigate the practical usefulness of the ANN classifier, a comparison of the on-the-fly adaptive simulation using the kinematic quality indicator and the same simulation supplemented by the ANN classifier discussed in section 3.2.2 is considered. As expected, the solid yet not overly satisfying accuracy of the ANN classifier (see

Comparison of the final quality indicators at the end of the adaptive simulation for mesh M1: kinematic classifier only

A multi-fidelity approach for generating surrogate models of the effective stress tensor for use in twoscale simulations is developed in section 2.1. At first, a ROM is derived from data gathered during full field simulations. The estimation of the error in the effective stress tensor (representing the QoI) of the ROM is discussed from a theoretical perspective in section 2.2. The mathematical structure of the error estimate reveals that the ROM error estimation produces computational cost that is almost equivalent to or even beyond that needed to solve a more dedicated ROM, thereby making it hard to justify such estimates when computational efficiency is required.

In our view, this dilemma can only be resolved by finding alternative surrogates with low computational complexity but moderate to good accuracy, complemented by adaptive strategies for local model refinement that employ costly computational methods only when needed. In this regard, ANNs are seen as promising candidates for the calibration of surrogate models for the effective stress and for classification that can trigger adaptive refinement. In section 2.3, the layout and the theoretical background of ANNs are discussed, together with different feature designs for the inputs and outputs based on the mechanical nature of the strain and stress. For the calibration of the stress surrogate, the mean squared error is used as loss function, while the quality of the trained ANN is checked on the validation dataset with the mean coefficient of determination and the mean relative norm error. In the case of error regression, a penalized mean squared error is proposed, which allows the conservative calibration of trained ANNs. For the error classification based on prescribed tolerances, the weighted cross entropy is used in order to allow for a better focus on the more important warning case, if the warning case density is low. Based on the proposed models, the core contribution of the present work consists of two model-adaptive algorithms which overcome convergence issues encountered in the naive implementation of on-the-fly adaptive surrogate selection, see section 2.4. The first, staggered algorithm is based on a two-run approach, in which the first run is conducted solely with the ANN effective stress surrogate and flags points evaluated outside of the strain training region, such that only these points are evaluated with the high-accuracy ROM in a second run. The second algorithm offers a more flexible on-the-fly model-adaptive approach by allowing the re-initialization of the ANN at the beginning of each load increment.

Numerical examples of the illustrated approaches are presented in section 3 for a three-phase pseudo-plastic material with microstructure. First, ANNs are trained in order to approximate the effective stress. The surrogate of choice,

The trained ANNs are then used in twoscale mechanical FE simulations, based on the two developed algorithms of section 2.4. The staggered algorithm produces sensible results but has two limitations: First, the number of macroscopic quadrature points marked for correction grows irreversibly. Second, the ANN surrogate must be sufficiently robust and of at least moderate accuracy in a prohibitively large part of the strain space. This requirement stems from the fact that local strain outliers lead to queries that are far outside of the usual training range of the ANN. This effect is found to be more pronounced when the macroscopic mesh density is increased, which further complicates the robust surrogate construction using purely data-driven methods in general, see section 3.3.2. The second algorithm offers a true on-the-fly adaptivity in which the ANN surrogate can be recovered, e.g., during unloading. It is observed in section 3.3.3 that this second algorithm offers the fastest convergence among the considered twoscale simulations, being approximately 3–10 times faster than the staggered algorithm and around 20 times faster than the fully coupled FE^{2R} algorithm using the ROM with 32 modes for all stress predictions. The adaptive on-the-fly model of the second algorithm offers, therefore, an attractive approach which combines a low number of ROM evaluations with good convergence.

The final test using the additional error classifier for the ANN stress surrogate introduced a high number of additional negative outcomes (i.e., ANN error greater than tolerances), considerably increasing the number of integration points requiring the ROM. This was expected due to the low accuracy achieved during the training of the classifier, more specifically, due to the low accuracy for the positive outcome ACC_{1} and corresponding high amount of positive outcomes reflected by _{0}, see

FF conducted the finite element and reduced order model computations on the microscale and the twoscale simulations based on his self-developed code and wrote the corresponding sections. MF implemented and trained the artificial neural networks and wrote the corresponding sections. Together with FF, FL conducted preliminary work on theoretical error estimates solely based on the reduced order model, which yielded the theoretical insights into the computational expenses and the necessity for alternative approaches. FL wrote the corresponding section on the theoretical error estimation.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Vivid discussions within the scope of Cluster of Excellence SimTech (DFG EXC310 and EXC2075) regarding machine learning and data-driven model surrogation are highly appreciated. FF and MF further acknowledge the valuable discussions with Steffen Freitag (Ruhr-Universität-Bochum) on the topic of ANN-based regression and classification.

The Supplementary Material for this article can be found online at:

Supplementary material is provided in the form of three HDF5 datasets (containing all FEM and ROM results used for the ANN training). Further, the stress surrogate


^{1}For simplicity each evaluation launches a new Python instance, reloads the model from a file and returns the results to the FE code through another file.

^{2}The numbers for M2 and M3 are not representative as the final load was not achieved.