Edited by: Pedro Antonio Valdes-Sosa, Joint China-Cuba Laboratory for Frontier Research in Translational Neurotechnology, China

Reviewed by: Veena A. Nair, University of Wisconsin-Madison, United States; Bharat B. Biswal, New Jersey Institute of Technology, United States

*Correspondence: Nan Xu

This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Resting-state functional MRI (rs-fMRI) is widely used to noninvasively study human brain networks. Network functional connectivity is often estimated by calculating the timeseries correlation between blood-oxygen-level-dependent (BOLD) signals from different regions of interest (ROIs). However, standard correlation cannot characterize the direction of information flow between regions. In this paper, we introduce and test a new concept, prediction correlation, to estimate effective connectivity in functional brain networks from rs-fMRI. In this approach, the correlation between two BOLD signals is replaced by a correlation between one BOLD signal and a prediction of this signal via a causal system driven by another BOLD signal. Three validations are described: (1) Prediction correlation performed well on simulated data where the ground truth was known, and outperformed four other methods. (2) On simulated data designed to display the “common driver” problem, prediction correlation did not introduce false connections between non-interacting driven ROIs. (3) On experimental data, prediction correlation recovered the previously identified network organization of the human brain. Prediction correlation scales well to hundreds of ROIs, enabling it to assess whole-brain interregional connectivity at the single-subject level. These results provide an initial validation that prediction correlation can capture the direction of information flow and estimate the duration of extended temporal delays in information flow between ROIs based on the BOLD signal. This approach not only maintains the high sensitivity to network connectivity provided by correlation analysis, but also performs well in the estimation of causal information flow in the brain.

Resting-state functional MRI (rs-fMRI) has been widely used to study the intrinsic functional architecture of the human brain based on spontaneous oscillations of the blood oxygen level dependent (BOLD) signals (Biswal et al.,

The BOLD signal is an indirect and sluggish measure of neuronal activity. Despite this, substantial insights have been gleaned by examining patterns of BOLD signals as proxies for functional connectivity in the brain, and these are consistent with more direct and invasive observations (Foster et al.,

Numerous methods for estimating functional or effective connectivity (Van Den Heuvel and Pol,

Methods for estimating functional connectivity fall into two groups. Some estimate a real number describing the strength of each connection, which might be quite small. Others estimate a binary connectivity, which is present or absent, possibly with the addition of a real-valued strength for the connections that are present. Correlation and prediction correlation, the generalization of correlation that we propose in this paper, are methods that estimate a real number describing the strength of a connection. Subsequent processing can then be applied to remove weak connections and/or organize the complete network into modular networks.

As is described in the following sections, testing on simulated rs-fMRI data with known ground-truth networks (Smith et al.,

In what follows, we describe a methodology for analyzing rs-fMRI data using a generalization of the well-established correlation approach, which is to correlate the timeseries at two ROIs. The generalization, denoted by “p-correlation” (“p” for “prediction”), is to replace correlation between the BOLD timeseries at two ROIs by correlation between the BOLD timeseries at one ROI and a prediction of this timeseries. The prediction is the output of a mathematical dynamical system that is driven by the timeseries at the other ROI. More generally, the prediction could be based on several spatially discrete ROIs. In this paper, we focus on the case where only one other ROI is used. We assume that the dynamical system is linear and has finite memory and that the memory duration and parameters may be estimated from the BOLD timeseries. If the prediction of the timeseries is restricted to use only the current value of the timeseries that drives the dynamical system, then p-correlation is the same as standard correlation. Therefore p-correlation is a generalization of correlation. Features of p-correlation include (1) the ability to indicate the directionality of the interaction between two ROIs (because p-correlation is asymmetric between the two signals), and (2) the ability to evaluate the interaction based on causal information.

In the remainder of this section, we describe the p-correlation approach in detail. Consider the ordered pair of ROIs (i, j). Let x_{i} (x_{j}) denote the rs-fMRI timeseries at the i^{th} (j^{th}) ROI. Both timeseries have duration N_{x}. The x_{j} signal is predicted from the x_{i} signal by a linear time-invariant causal dynamical model with x_{i} as the input and the prediction x̂_{j|i} as the output. The model is characterized by its impulse response h_{j|i}, which is zero for negative times. We assume that the impulse response is of finite duration, with duration denoted by N_{hj|i}. In summary,

$$\hat{x}_{j|i}[n] = \sum_{m=0}^{N_{hj|i}-1} h_{j|i}[m]\, x_{i}[n-m]. \tag{1}$$

The basic approach to estimate the coefficients of h_{j|i} is to minimize the least squares cost

$$J_{j|i} = \sum_{n=0}^{N_{x}-1} \left( x_{j}[n] - \hat{x}_{j|i}[n] \right)^{2}. \tag{2}$$
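The least squares fit of a finite-duration causal impulse response described above can be sketched in Python. This is a minimal illustration, not the authors' Matlab implementation: the function names (`lagged_design`, `fit_impulse_response`, `p_correlation`), the zero padding before time 0, and the synthetic test signals are all assumptions made for the example.

```python
import numpy as np

def lagged_design(x_i, n_h):
    """Columns are x_i delayed by 0..n_h-1 samples (assumed zero before time 0)."""
    n = len(x_i)
    X = np.zeros((n, n_h))
    for m in range(n_h):
        X[m:, m] = x_i[: n - m]
    return X

def fit_impulse_response(x_i, x_j, n_h):
    """Least squares estimate of a causal impulse response of duration n_h."""
    X = lagged_design(x_i, n_h)
    h, *_ = np.linalg.lstsq(X, x_j, rcond=None)
    prediction = X @ h
    cost = float(np.sum((x_j - prediction) ** 2))
    return h, prediction, cost

def p_correlation(x_i, x_j, n_h):
    """Correlation between x_j and its prediction driven by x_i."""
    _, prediction, _ = fit_impulse_response(x_i, x_j, n_h)
    return float(np.corrcoef(x_j, prediction)[0, 1])

# Hypothetical synthetic pair: x_j is a filtered, noisy copy of x_i.
rng = np.random.default_rng(0)
x_i = rng.standard_normal(500)
true_h = np.array([0.5, 0.3, 0.2])
x_j = np.convolve(x_i, true_h)[:500] + 0.1 * rng.standard_normal(500)

h_est, _, _ = fit_impulse_response(x_i, x_j, 3)
rho_j_given_i = p_correlation(x_i, x_j, 3)  # x_i drives the prediction of x_j
rho_i_given_j = p_correlation(x_j, x_i, 3)  # reverse direction, fitted separately
```

Because each direction is fitted with its own impulse response, the two values generally differ, which is the source of the asymmetry discussed in the text.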

We estimate the value of _{hj|i} and the values of the impulse response at the same time by restating the least squares problem as a Gaussian maximum likelihood estimator (MLE) with a known variance for the measurement errors. The MLE allows a trade off of the accuracy of predicting the current data (i.e., minimizing _{hj|i}, with the accuracy of predicting when presented with new data, which is best done by smaller values of _{hj|i}. There are several approaches to quantifying this trade off including Akaike information criteria (AIC) (Akaike, _{hj|i} and _{x}:

See Equation S1 (Supplementary Material) for BIC.

Simultaneous minimization of Equation 3 with respect to both _{j|i}, which occurs only in the _{hj|i} determines the duration and the value of the impulse response. The integer minimization over _{hj|i} is computed by testing each value in a predetermined range of values, i.e., 1,2, …, _{hj|i}, the minimization with respect to _{j|i} involves only minimizing _{i} influences _{j} is separate from the dynamical system describing how _{j} influences _{i}, the approach described here can lead to a directed rather than undirected graph of interactions between ROIs.
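The search over candidate impulse response durations can be illustrated with a short Python sketch. It assumes the standard Gaussian least squares form of the small-sample corrected AIC, which may differ in constants from the criterion used in this paper; the function names and the cost values are hypothetical.

```python
import math

def aicc(cost, n_samples, n_params):
    """Corrected AIC for a Gaussian least squares fit (textbook form, assumed here)."""
    aic = n_samples * math.log(cost / n_samples) + 2 * n_params
    correction = 2 * n_params * (n_params + 1) / (n_samples - n_params - 1)
    return aic + correction

def select_duration(costs, n_samples):
    """costs[k-1] is the minimized least squares cost for duration k = 1..len(costs)."""
    scores = [aicc(c, n_samples, k) for k, c in enumerate(costs, start=1)]
    best_index = min(range(len(scores)), key=scores.__getitem__)
    return best_index + 1, scores

# Hypothetical costs: a large gain from 1 to 2 taps, negligible gains afterwards.
costs = [120.0, 80.0, 79.5, 79.4]
best_duration, scores = select_duration(costs, n_samples=200)
```

In this example the penalty term outweighs the small cost reductions beyond two taps, so the shorter impulse response is selected.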

Once N_{hj|i} and h_{j|i} are estimated, the output of the dynamical system, which is the prediction x̂_{j|i}, can be computed. We use “p-correlation” and ρ_{j|i} for the correlation between x_{j} and x̂_{j|i}. We use “correlation” and ρ_{j,i} for the standard approach (i.e., the standard correlation between x_{j} and x_{i}).

Let the total number of ROIs be denoted by N_{ROI}. P-correlation is an asymmetric N_{ROI} × N_{ROI} matrix, where the asymmetry follows from ρ_{j|i} ≠ ρ_{i|j}. Furthermore, p-correlation includes lags of the x_{i} signal since the dynamical system output at time n depends on x_{i}[n], x_{i}[n − 1], …, x_{i}[n − N_{hj|i} + 1]. If N_{hj|i} = 1 (i.e., no lags) and h_{j|i}[0] ≥ 0 then ρ_{j|i} is the correlation between x_{j} and x_{i} so that ρ_{j|i} = ρ_{j,i} and the approach of this paper exactly reduces to the standard approach. In Section 2.2.1, we describe a constraint such that h_{j|i}[0] ≥ 0 is always achieved. The p-correlation method does not depend upon the sampling rate (TR), which allows for collapsing across different scan sites or studies. The entire algorithm is shown in Figure
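The reduction to standard correlation for an impulse response of duration 1 can be checked numerically. The sketch below uses only the Python standard library; the helper names and the synthetic signals are assumptions for illustration. With a single positive coefficient, the prediction is a positively scaled copy of the driving signal, and Pearson correlation is unchanged by positive scaling.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def one_tap_fit(x_i, x_j):
    """Least squares coefficient for the duration-1 model x_j[n] ~ h0 * x_i[n]."""
    return sum(u * v for u, v in zip(x_i, x_j)) / sum(u * u for u in x_i)

# Hypothetical signals with a positive instantaneous coupling.
random.seed(0)
x_i = [random.gauss(0, 1) for _ in range(300)]
x_j = [0.7 * u + random.gauss(0, 0.5) for u in x_i]

h0 = one_tap_fit(x_i, x_j)
prediction = [h0 * u for u in x_i]
rho_p = pearson(x_j, prediction)   # p-correlation with duration 1
rho_std = pearson(x_j, x_i)        # standard correlation
```

If the fitted coefficient were negative, the two values would differ in sign, which is why the nonnegativity condition appears in the statement above.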

In Section 2.1 we defined p-correlation and described a practical method for its computation. The result is an asymmetric matrix of connection strengths for each subject. This fundamental method can be specialized for particular applications, often based on user's interests and what the user knows about the details of the applications. Several such specializations are described in the following paragraphs.

If the user has information on the type of interactions that are present, then this information can be used as a constraint on the least squares problem that determines the impulse response which is the basis of the prediction. For example, as in the simulated data of Smith et al. (_{j|i}[_{j|i}. Let _{j|i} be the covariance of _{j} and _{j|i} is related to the covariance of _{j}[_{i}[_{j,i}[_{j|i} is the numerator of ρ_{j|i}. Therefore, if all the lagged covariances are positive and we require the estimated values of _{j|i}[_{j|i} and for the p-correlation ρ_{j|i}. In the traditional functional connectivity analysis, when global signal regression is applied to rs-fMRI timeseries data, the valid inference of negative correlations cannot be made (Murphy et al.,

Three natural methods for thresholding ρ_{j|i} are described in this section.

Even with _{j|i}[_{j|i} values by zeros. One reason for seeking to have ρ_{j|i} non negative is mean signal regression in the preprocessing of the fMRI data which makes it difficult to interpret negative correlations. However, alternative preprocessing which omits mean signal regression (Jo et al.,

The previous paragraph concerned thresholding at value 0. Higher data-dependent minimum thresholds are often used for correlation and the same approach can be applied to p-correlation. A standard approach (Power et al.,

In some problems the interactions are known to be unidirectional, e.g., in the simulated data of Smith (Smith et al.,

Each of the thresholding methods is a nonlinear operation applied to the matrix of ρ_{j|i} coefficients. Each can be applied to any matrix

The thresholding approach forms an N_{ROI} × N_{ROI} matrix of thresholded connection weights, from which the network is computed.
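The three thresholding ideas of this section (removing negative values, applying a minimum threshold, and enforcing unidirectional connections) can be sketched as simple matrix operations in Python. The exact rules used in this paper are not reproduced here; in particular, keeping the stronger of the two directions for each pair, the function names, and the example values are assumptions. The convention `rho[j][i]` for the weight from ROI i to ROI j is also chosen for illustration.

```python
def clip_negative(rho):
    """Replace negative p-correlations by zero."""
    return [[max(v, 0.0) for v in row] for row in rho]

def apply_minimum(rho, tau):
    """Zero every entry below the minimum threshold tau."""
    return [[v if v >= tau else 0.0 for v in row] for row in rho]

def keep_stronger_direction(rho):
    """For each pair, keep only the larger of rho[j][i] and rho[i][j] (assumed rule)."""
    n = len(rho)
    out = [row[:] for row in rho]
    for j in range(n):
        for i in range(j + 1, n):
            if out[j][i] >= out[i][j]:
                out[i][j] = 0.0
            else:
                out[j][i] = 0.0
    return out

# Hypothetical 3-ROI matrix of p-correlations.
rho = [[0.0, 0.10, -0.20],
       [0.60, 0.0, 0.05],
       [0.45, 0.30, 0.0]]
weights = keep_stronger_direction(apply_minimum(clip_negative(rho), 0.08))
```

Each step is a nonlinear operation on the whole matrix, so the same functions could be applied to a standard correlation matrix as well.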

Some investigations, e.g., Smith et al. (_{k} (

There is a recent interest in estimating effective networks from multiple subjects while accommodating the heterogeneity of the group (Ryali et al.,

Information concerning groups of subjects could also be used in p-correlation. One approach would be to replace the _{j|i} in Equation 1 by

where

Simulated fMRI timeseries from the laboratory of S. M. Smith are documented (Smith et al.,

These synthetic fMRI timeseries were sampled every 3 s (TR = 3_{x} = 10 mins. All four simulations have 1% thermal noise and the hemodynamic response function (HRF) used in the second step has standard deviation of 0.5 s. The simulation is repeated for each of 50 subjects.

The algorithm is shown in Figure

As is described above, the integer minimization over the impulse function duration, _{hj|i}, is computed by testing from 1 second up to

Next, we consider the choice of threshold, _{j|i}, are given, the threshold value

^{2} = 25.

For the Smith simulated data, we have additional prior knowledge that the networks contain only unidirectional connections. Therefore, as is also done in Smith et al. (_{j|i}, which includes the unidirectional condition, with the ground truth network _{j|i}. The estimated network _{j|i} is the output of Equation 4c where the input is the thresholded network _{j|i}.

To compare the computed and ground truth networks, we define “accuracy,” denoted by

where 1{_{j|i} values that are almost certainly far from zero or exactly zero. Computing the accuracy

P-correlation and four alternative methods from Smith et al. (_{j}|_{i}) and _{i}|_{j}). P-correlation, Granger B1, Gen Synch S1 and LiNGAM all compute an asymmetric matrix filled with real-number connection weights, analogous to our _{j|i}. In all cases, the unidirectional prior knowledge is applied analogous to our transformation from _{j|i} to _{j|i}. For the Patel method implemented by Smith et al. (

In addition to the algorithms included in Smith et al. (

Compared with alternative methods of estimating effective connectivity, p-correlation provides a full asymmetric matrix for each subject independent of all other subjects, in which each entry, like correlation, estimates a connection strength between two ROIs. The ability to compute results based on an individual subject means that p-correlation can potentially be used in a clinical environment. This full asymmetric matrix of p-correlations can be thresholded and/or further processed as desired using another algorithm, e.g., a graph analytic algorithm. In addition, p-correlation can process networks with hundreds of ROIs while GIMME is limited to 3–25 ROIs [Page 3 of GIMME Manual (Version 12)]. Furthermore, p-correlation estimates the temporal causal relation in the form of a lagged impulse response in addition to the spatial causal relation between any pair of ROIs. In contrast, some alternative algorithms (e.g., IMaGES) estimate a sparse graph of interactions, and thus solve a somewhat different problem than the p-correlation method. Other algorithms have been developed as post-processing algorithms, which cannot detect connections, but only estimate direction if connections are detected by other methods, e.g., correlation. Among them, pairwise LiNGAM (Hyvärinen and Smith,

The methods described in this paper were implemented in Matlab software, which is available upon request, and were applied to four of Smith's fMRI simulations (Smith et al.,

The p-correlation method is based on estimation of a linear time-invariant causal dynamic model. The sample means of the duration of either constrained or unconstrained impulse responses are 3.34, 3.58, 3.64, and 3.76 s for the four simulations, respectively. By limiting the impulse response duration to 1 TR, it was verified that p-correlation with constraint on Least Squares is equivalent to the standard correlation as is described in Section 1. After thresholding the p-correlations computed with the nonnegative constraint on the coefficients of the linear system, an asymmetric matrix of connection weights _{j|i} for each subject was obtained.

The same specifications for processing of the simulated data, in particular, the same choice of the _{j|i} and _{j|i}, for Subject 14 of

_{j|i} (for ground truth) and _{j|i} (for constrained p-correlation), and quantities analogous to _{j|i} (for Granger B1, Gen Synch S1, LiNGAM, and Patel) for Subject 14 of

The mean and standard deviation of accuracy for each simulation, i.e., the average and square root of the sample variance of

| Simulation | 1 | 2 | 3 | 4 |
| --- | --- | --- | --- | --- |
| Number of ROIs | 5 | 10 | 15 | 50 |
| Number of COI pairs | 10 | 22 | 36 | 122 |
| Granger B1 | 0.440 ± 0.206 | 0.295 ± 0.127 | 0.262 ± 0.088 | 0.130 ± 0.044 |
| Gen Synch S1 | 0.472 ± 0.201 | 0.405 ± 0.139 | 0.379 ± 0.079 | 0.285 ± 0.056 |
| LiNGAM | 0.372 ± 0.229 | 0.435 ± 0.177 | 0.301 ± 0.106 | 0.119 ± 0.037 |
| Patel | 0.528 ± 0.193 | 0.491 ± 0.101 | 0.446 ± 0.099 | 0.366 ± 0.048 |
| p-Corr (constrained) | 0.532 ± 0.192 | 0.502 ± 0.114 | 0.457 ± 0.126 | 0.405 ± 0.065 |
| p-Corr (unconstrained) | 0.520 ± 0.218 | 0.467 ± 0.123 | 0.439 ± 0.109 | 0.371 ± 0.058 |

A “common driver” situation is the case where ROI 1 drives ROIs 2 and 3 but ROIs 2 and 3 do not directly interact. The challenge is to correctly detect the 1 → 2 and 1 → 3 connections without detecting 2 → 3 or 3 → 2 false connections. In order to focus exclusively on this situation, we have computed synthetic data from the three-ROI network shown in Figure

where _{3} (the 3 × 3 identity matrix). Zalesky et al. (_{x} = 1, 000. We consider only _{1} = _{2} = _{3} = 0.8 (so that all ROIs have the same intrinsic memory duration) and _{1} = _{2} = _{3} = 0.2 (so that all ROIs have the same intrinsic noise power, and the intrinsic noises are all independent). We consider the following cases: (1) no driving: _{21} = _{31} = 0, (2) weak driving: _{21} = _{31} = 0.1, (3) strong driving: _{21} = _{31} = 0.4, and (4) asymmetrical strong driving: _{21} = 0.4, and _{31} = 0.1.
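To illustrate why the common driver configuration is challenging for standard correlation, the following Python sketch simulates a three-node network in which one node drives the other two. The model form is an assumption made for this example (a first-order linear recursion with self-coupling 0.8 and independent noise of standard deviation 0.2), and it is not claimed to match the simulation equation of this paper. Nodes are indexed from 0 in the code, so node 0 plays the role of ROI 1.

```python
import numpy as np

def simulate_common_driver(c21, c31, n_samples, a=0.8, noise_std=0.2, seed=0):
    """Assumed model x[n+1] = A x[n] + w[n]; node 0 drives nodes 1 and 2 only."""
    rng = np.random.default_rng(seed)
    A = np.diag([a, a, a])
    A[1, 0] = c21  # driving strength from node 0 to node 1
    A[2, 0] = c31  # driving strength from node 0 to node 2
    burn_in = 200  # discard the initial transient
    x = np.zeros(3)
    out = np.empty((n_samples, 3))
    for n in range(burn_in + n_samples):
        x = A @ x + noise_std * rng.standard_normal(3)
        if n >= burn_in:
            out[n - burn_in] = x
    return out

# Strong symmetric driving versus no driving.
corr_strong = np.corrcoef(simulate_common_driver(0.4, 0.4, 5000).T)
corr_none = np.corrcoef(simulate_common_driver(0.0, 0.0, 5000).T)

# Nodes 1 and 2 never interact directly, yet they are clearly correlated
# when both are driven by node 0.
spurious = corr_strong[1, 2]
baseline = corr_none[1, 2]
```

Under this assumed model, the correlation between the two driven nodes is induced entirely by the shared input, which is the false connection a directed method has to avoid.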

Each simulation was repeated for 50 subjects. Let the maximum allowable duration of the impulse response be 3 samples. By using the specialization of p-correlation for Smith simulated data, as is described in Section 3.1.2, a directed graph _{j|i} is estimated by p-correlation (Figure _{j|i} with constrained least squares (Section 2.2.1) are 5.384e-04 ± 0.072. This number becomes 0.058 ± 0.043 when unconstrained least squares is applied. The smaller magnitude of the results using constrained least squares indicates that taking advantage of the prior knowledge that the weights are positive (i.e., _{1} = _{2} = _{3} = 0.8) provides improved performance in this case. In Cases (2) and (3), both the constrained and the unconstrained least squares achieve a 100% accuracy (Equation 6) for each subject. In the fourth case, the constrained or the unconstrained least squares gives an average of 0.800 ± 0.247 accuracy over all 50 subjects. We also tested _{x} = 200, 500, 5,000 for all four cases. Notice that as _{x} goes large, correlations become closer to the steady state and the accuracy computed by the p-correlation method increases as well.

In addition, p-correlation estimated the correct hierarchy on the three pairs of connection weights, which are consistent with “strong,” “weak,” and “non-” connections in the ground truth network. It also shows the correct direction of connections in a pair by a stronger weight. The constrained least squares (Section 2.2.1) provides a slightly better result than the unconstrained approach. Specifically, it shows larger numerical differences between the zero and nonzero entries, as well as between the asymmetric strong weights. On average across all 50 subjects, p-correlation used an impulse response duration of 1.007 samples for all four cases for both constrained and unconstrained approaches. In addition, in Case (3) (asymmetric strong weights), correlation mis-detected the connection between nodes 2 and 3: the 2–3 correlation was the highest correlation value among the three pairs, whereas p-correlation, for both the constrained and unconstrained approaches, estimated this value as the lowest of the three pairs, thereby avoiding the error in the correlation results.

While the tools described in this paper can be assembled into many algorithms, we use only one algorithm, which is shown in Figure _{ROI} = 264 spherical ROIs each with a 10mm diameter. We combine our p-correlation ideas with the widely-used (Power et al.,

As a function of the value of the threshold

In order to test the robustness of the p-correlation calculation, all 132 subjects were randomly divided into two equal cohorts, and each cohort was separately processed. The average of p-correlation connection strength ^{2} = 0.87) and is nearly a 45° diagonal line (

Standard correlation has been widely used to analyze functional connectivity from rs-fMRI timeseries between prespecified ROIs. Prior work has shown its high sensitivity for detecting the existence of network architectures under both simulated and experimental scenarios (Smith et al., ^{th} signal and an optimal linear time-invariant causal estimate of the ^{th} signal based on the ^{th} signal. In this way, it captures additional features concerning the interaction between two ROIs, specifically, the causality and directionality of the information flow on which the interaction depends. Based on the finite-memory linear time-invariant causal model, p-correlation allows the memory duration to be different in the two directions for one pair of ROIs and also to be different for different pairs of ROIs. In contrast, structural vector autoregressive models (Kim et al., ^{th} signal based on the ^{th} signal is restricted to use only the current value of the ^{th} signal, then p-correlation and standard correlation have the same magnitude.

Testing p-correlation on simulated fMRI data provided in Smith et al. (

Many approaches have been introduced to assess functional or effective connectivity of rs-fMRI data. Smith et al. (

Several versions of Granger causality analysis, based on multivariate vector autoregressive modeling, have been tested and performed poorly (Smith et al.,

Multivariate autoregressive processes (MVAR) have been successfully used in neuroscience outside of fMRI, e.g., in order to describe signals from EEG experiments (Ding et al.,

In addition to the algorithms used in Smith et al. (

The Smith et al. (

In order to focus on the challenges of a “common driver,” we have produced additional synthetic data for the three ROI network of Figure

We have applied p-correlation to experimental data from the 1000 Functional Connectomes Project (Biswal et al.,

Here we introduce a novel concept, p-correlation, to estimate brain connectivity within well-characterized large-scale functional networks. The replication of previously observed network architectures in experimental data and the performance against the ground truth in simulated data both suggest that the p-correlation approach may hold promise for future investigations of the brain's dynamic functional architecture.

NX and PD designed the algorithm to achieve the neuroscience goals of RS. NX wrote the software and performed the analysis. NX, RS, and PD prepared the manuscript.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

We are very grateful to Prof. S.M. Smith (University of Oxford) for providing simulation data and his software for applying Patel's conditional dependence measures and network measurements as described in his paper (Smith et al.,

The Supplementary Material for this article can be found online at: