Edited by: Nan Li, RIKEN, Japan

Reviewed by: Kelly Anne Barnes, Baylor College of Medicine, United States; Xiaobo Chen, Jiangsu University, China; Xiaosong He, Thomas Jefferson University, United States

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Functional brain networks derived from resting-state functional magnetic resonance imaging (rs-fMRI) have been widely used for Autism Spectrum Disorder (ASD) diagnosis. Typically, these networks are constructed by calculating functional connectivity (FC) between any pair of brain regions of interest (ROIs), i.e., using Pearson's correlation between rs-fMRI time series. However, this can only be called as a

Autism spectrum disorder (ASD) is a prevalent and highly heterogeneous childhood neurodevelopmental disease. It impairs children's social interaction, communication, and many other behavioral and cognitive functions in varying degrees (Ecker et al., ^{1}

Recently, resting-state functional magnetic resonance imaging (rs-fMRI), which uses blood-oxygenation-level-dependent (BOLD) signals as a neurophysiological index to probe brain activity, has been applied to the diagnosis of ASD (Plitt et al.,

To the best of our knowledge, very few studies have used high-order FC to diagnose ASD in children. We hypothesize that brain networks in children with ASD could be altered due to miswiring during abnormal development. Such miswiring could affect both low-order and high-order FC. Similar to the hypothesis behind Alzheimer's disease studies using high-order FC (Chen et al.,

To explore these hypotheses, we extend our previous work on high-order FC by proposing high(er)-order brain functional network representations at multiple levels. We then use these multi-level FC networks (with different levels of functional interactions) jointly to improve ASD diagnosis. Furthermore, we devise a generalized classification framework based on multi-level high(er)-order brain networks, which includes an ensemble of multiple classifiers, each trained on a specific level of high(er)-order FC network to capture level-specific diagnostic information. We apply the new framework to the Autism Brain Imaging Data Exchange (ABIDE) database for individual-level classification between children with ASD and normal controls (NC). Figure

Low-order FC network (

Multi-level high-order FC network construction. We construct the first-level of high-order FC network (represented by an “

LASSO-based feature selection. We treat the elements in the networks derived from Steps (1–2) as features for each subject. Then, LASSO algorithm (Tibshirani,

Ensemble classification. We construct an ensemble classifier with multiple linear SVM (support vector machine) classifiers (Cortes and Vapnik,

Overview of the proposed multi-level high-order functional connectivity classification framework for ASD diagnosis.

The main contribution of this paper is devising a multi-level higher-order FC representation strategy to capture the interactions among brain regions at multiple levels. As such, the features generated in different levels can contain supplementary information for joint classification.

The rs-fMRI data used in this study were obtained from the ABIDE database (Martino et al., ^{2}

The demographic information for ASD group and NC group.

ASD | 47/7 | 10.7 ± 2.28 | 109.41 ± 18.78 | 0.15 ± 0.07 |

NC | 40/6 | 11.22 ± 2.34 | 114.20 ± 12.73 | 0.14 ± 0.05 |

p-value | 0.99^{a} | 0.27^{b} | 0.078^{b} | 0.36^{b} |

(p^{a}: Statistical significance level was calculated using the χ^{2}-test; p^{b}: Statistical significance level was computed using the two-tailed two-sample t-test.)

The subjects were scanned on a 3-Tesla Siemens Allegra scanner over 6 min, producing 180 time points at a repetition time of 2 s. In Table

The rs-fMRI acquisition parameters.

Siemens Magnetom (Allegra) | 3.0 × 3.0 × 4.0 (mm^{3}) | 90 (deg) | 2,000/15 (ms) |

240 (mm) | 4.0 (mm) | 3906 (Hz/Px) | 33 |

For rs-fMRI data preprocessing, we used the widely adopted Data Processing Assistant for rs-fMRI (DPARSF) toolbox (Yan and Zang, ^{3}. Data scrubbing was then carried out to reduce the effect of head motion: volumes with framewise displacement (FD) larger than 0.5 mm were removed (Power et al.,

Because each FC network is represented as a fully-connected graph in a matrix format, we will mainly introduce how the corresponding matrices of the low-order and high-order FC networks are constructed in this section. Specifically, we first introduce how we derived the low-order FC network (

For each subject, we define

Then, a conventional correlation-based FC network (i.e., _{LON}, as defined below:

where each row or column of _{LON} denotes the Pearson correlation series between a specific ROI and all other ROIs. Each element in _{LON} is the Pearson correlation between the average time-series of a pair of ROIs _{LON} encodes low-order interactions between any pair of ROIs.
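As a minimal sketch of the low-order network construction described above, the following computes the Pearson correlation between every pair of ROI mean time series. The function names (`pearson`, `low_order_fc`) and the toy data are illustrative assumptions, not taken from the original implementation.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences.

    Assumes neither sequence is constant (non-zero variance).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def low_order_fc(series):
    """Build an M x M low-order FC matrix.

    series[i] is the mean BOLD time series of ROI i (hypothetical input layout).
    """
    m = len(series)
    return [[1.0 if i == j else pearson(series[i], series[j]) for j in range(m)]
            for i in range(m)]

# Toy example with 3 ROIs and 4 time points (made-up values).
toy = [[1.0, 2.0, 3.0, 4.0],
       [2.0, 4.0, 6.0, 8.5],
       [4.0, 3.0, 2.0, 1.0]]
fc = low_order_fc(toy)
```

In this toy case the first and third series are perfectly anti-correlated, so the corresponding matrix entry is -1 up to floating-point error, and the matrix is symmetric by construction.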

To fully capture high-order functional interactions across brain regions, we adopt a method proposed in (Zhang et al., _{i} = (_{i1}, _{i2}, … , _{iM}) denote a vector containing the correlations between the _{i} denotes the _{LON} in Equation 2. We compute the “correlation's correlation” between the

where _{i} = (_{i1}, … , _{i(i−1)}, _{i(i+1)}, … , _{i(j−1)}, _{i(j+1)}, … , _{iM}) and _{j} = (_{j1}, … , _{j(i−1)}, _{j(i+1)}, … , _{j(j−1)}, _{j(j+1)}, … , _{jM}). _{i}}), not just the original rs-fMRI time series _{i}. As a result, the correlation _{ij} in Equation 1 involves just the two different ROIs. In other words, the correlation coefficient _{HON−1} of the first-level of high-order FC network (

Furthermore, for a specific subject, we can obtain multi-level FC networks by their corresponding matrix series, i.e., {_{LON}, _{HON−1}, … , _{HON−t}}, in a subsequent level-by-level manner, in which each matrix _{HON−i} (_{HON−(i−1)}. In this way, higher-level connectivity features can be obtained from the low-level connectivity features, and thus form hierarchical representations of functional interactions across multiple brain regions.
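The level-by-level construction can be sketched as follows, under the assumption that each level is obtained by correlating pairs of rows of the previous level's matrix after excluding the i-th and j-th entries, as in the vectors defined above. The names `next_level` and `multi_level`, and the toy matrix, are hypothetical and only illustrate the idea.

```python
import math

def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def next_level(net):
    """Correlate the connectivity profiles of each ROI pair (i, j).

    Entries i and j are dropped from both profiles before correlating.
    """
    m = len(net)
    out = [[1.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            keep = [k for k in range(m) if k != i and k != j]
            r = pearson([net[i][k] for k in keep], [net[j][k] for k in keep])
            out[i][j] = out[j][i] = r
    return out

def multi_level(lon, levels):
    """Return [LON, HON-1, ..., HON-levels], each built from the previous one."""
    nets = [lon]
    for _ in range(levels):
        nets.append(next_level(nets[-1]))
    return nets

# Toy symmetric low-order matrix for 5 ROIs (made-up values).
toy_lon = [
    [1.0, 0.8, 0.3, -0.2, 0.5],
    [0.8, 1.0, 0.4, -0.1, 0.6],
    [0.3, 0.4, 1.0, 0.7, -0.3],
    [-0.2, -0.1, 0.7, 1.0, 0.2],
    [0.5, 0.6, -0.3, 0.2, 1.0],
]
nets = multi_level(toy_lon, 2)
```

Here ROIs 0 and 1 have connectivity profiles that differ only by a constant offset, so their first-level high-order correlation is 1 even though their low-order correlation is 0.8, which illustrates how the two levels can carry different information.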

For the

The feature vectors _{1}-norm regularized least squares regression, known as LASSO (Least Absolute Shrinkage and Selection Operator) (Tibshirani,

where λ is a parameter for controlling the strength of _{1}-norm regularization. The first term in Equation 5 is the empirical loss on the training data, and the second term is the _{1} − _{i} to be zero (i.e., corresponding to non-discriminative features in our classification task). In this way, we can jointly achieve classification error minimization and sparse feature selection. Let
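To illustrate how the ℓ1 penalty drives coefficients to exactly zero, here is a small coordinate-descent sketch. It assumes the common objective 0.5·||y − Xw||² + λ·||w||₁, which may differ from the exact scaling used in the paper's equation; `lasso_cd` and `soft_threshold` are illustrative names, and a real analysis would more likely rely on an established solver.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_cd(X, y, lam, n_iter=100):
    """Minimize 0.5 * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent.

    X is a list of rows (subjects x features); y holds the class labels or targets.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * w[k] for k in range(p) if k != j))
                for i in range(n)
            )
            z = sum(X[i][j] ** 2 for i in range(n))
            w[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return w

# Toy orthonormal design: the solution is a soft-thresholded copy of y.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.5]
w = lasso_cd(X, y, lam=1.0)
selected = [j for j, coef in enumerate(w) if coef != 0.0]
```

In this toy case the second coefficient is set to zero because its correlation with the target (0.5) is smaller than λ, so only the first feature would be passed on to the classifier.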

After selecting the most important features by LASSO, we use SVM with a linear kernel for ASD classification (Cortes and Vapnik,

For evaluation, we tested our proposed method for classifying ASD and NC subjects. We also performed feature weight analysis to identify multi-level brain connections that are most discriminative for classifying ASD and NC.

For comparison, we used connectional brain features extracted from different orders of FC networks, including the matrix _{LON} from _{HON−1} from _{HON−2} from ^{3}

In this study, we adopted a 10-fold cross-validation strategy to evaluate the generalization performance of our proposed method. Specifically, all subjects were partitioned into 10 subsets of roughly equal size; in each fold, one subset served as the testing set while the remaining 9 subsets were combined into the training set for feature selection and classification. Finally, we report the classification results averaged across all 10 folds.
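The partitioning scheme can be sketched as below. This is an assumption-laden illustration: the helper name `k_fold_indices`, the seed, and the simple unstratified shuffle are not taken from the paper, and the subject count of 100 corresponds to the 54 ASD and 46 NC subjects in the demographic table.

```python
import random

def k_fold_indices(n_subjects, k=10, seed=0):
    """Yield (train, test) index lists for k roughly equal-sized folds."""
    idx = list(range(n_subjects))
    random.Random(seed).shuffle(idx)          # reproducible random partition
    folds = [idx[f::k] for f in range(k)]     # every k-th subject goes to fold f
    for f in range(k):
        test = folds[f]
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, test

splits = list(k_fold_indices(100, k=10, seed=0))
```

Feature selection and classifier training would be run on `train` only within each split, so that the held-out `test` subjects never influence the selected features.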

As the performance of our method depends on a few hyper-parameters, such as ^{−5}, ^{−4}, … , ^{5}], and

For comprehensive evaluations, we used six different statistical measures, namely classification accuracy (ACC), sensitivity or true positive rate (TPR), specificity or true negative rate (TNR), precision or positive predictive value (PPV), negative predictive value (NPV), and F1 score^{4}
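For reference, the six measures can be computed from the confusion-matrix counts using their standard definitions, with ASD treated as the positive class. The counts in the example are hypothetical and chosen only to match the group sizes (54 ASD, 46 NC); they are not results from this study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard definitions of ACC, TPR, TNR, PPV, NPV, and F1."""
    acc = (tp + tn) / (tp + fn + tn + fp)
    tpr = tp / (tp + fn)               # sensitivity
    tnr = tn / (tn + fp)               # specificity
    ppv = tp / (tp + fp)               # precision
    npv = tn / (tn + fn)
    f1 = 2 * ppv * tpr / (ppv + tpr)   # harmonic mean of precision and sensitivity
    return {"ACC": acc, "TPR": tpr, "TNR": tnr, "PPV": ppv, "NPV": npv, "F1": f1}

# Hypothetical counts: 40 of 54 ASD and 32 of 46 NC subjects classified correctly.
m = classification_metrics(tp=40, fn=14, tn=32, fp=14)
```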

To avoid biased results due to the fold selection, the entire 10-fold cross-validation process was further repeated 20 times, each with a different partition of subjects. The average statistics of the 20 repetitions were finally reported. Table _{LON} denotes the feature derived from the low-order FC networks (_{LON}+_{HON−1} denotes the combination of _{LON} + _{HON−1} and any other feature type are highlighted in bold.

ASD classification using different feature types.

1 | _{LON} | 0.73 | 0.75 | 0.70 | 0.74 | 0.72 | 0.75 |

2 | _{HON−1} | 0.70 | 0.73 | 0.67 | 0.70 | 0.70 | 0.71 |

3 | _{HON−2} | 0.67 | 0.74 | 0.64 | 0.65 | 0.74 | 0.69 |

4 | _{LON}+_{HON−1} | | | | | | |

5 | _{LON}+_{HON−2} | 0.76 | 0.77 | 0.75 | 0.80 | 0.72 | 0.78 |

6 | _{HON−1}+_{HON−2} | 0.72 | 0.77 | 0.67 | 0.69 | 0.76 | 0.73 |

7 | _{LON}+_{HON−1}+_{HON−2} | 0.78 | 0.81 | 0.75 | 0.78 | 0.79 |

Significance test between different pair of feature types.

0.044 | 0.037 | 0.047 | 0.049 | 0.040 | ||

0.042 | 0.025 | 0.042 | 0.024 | |||

0.034 | 0.046 | 0.03 | ||||

0.034 | 0.049 | |||||

0.045 |

As we can see from Table _{LON} + _{HON−1} method significantly outperforms all other methods. From the results shown in Table _{LON} and _{HON−1} achieves the best performance for all metrics, which might indicate that there exists more strongly complementary information between

In addition to the above ensemble learning for integrating low-order and high-order networks, we also evaluated another widely adopted strategy: first concatenating the features from different FC networks, then performing feature selection with LASSO and training a single linear SVM classifier. The experimental results are shown in Table _{LON} ⊕ _{HON−1} denotes the concatenated features from the

Classification accuracy based on simple feature concatenation.

_{LON} ⊕ _{HON−1} | 0.79 | 0.81 | 0.77 | 0.80 | 0.77 | 0.80 |

_{LON} ⊕ _{HON−2} | 0.74 | 0.79 | 0.69 | 0.70 | 0.78 | 0.75 |

_{HON−1} ⊕ _{HON−2} | 0.72 | 0.74 | 0.70 | 0.74 | 0.70 | 0.74 |

_{LON} ⊕ _{HON−1} ⊕ _{HON−2} | 0.77 | 0.79 | 0.74 | 0.78 | 0.76 | 0.79 |

Based on the results of LASSO regression, we identified the most discriminative low-order and high-order functional features as those with the highest selection frequency across all 10-fold cross-validation runs. Here, the frequency with which a feature is selected across cross-validation runs reflects its contribution to the classification: a higher frequency indicates a larger contribution.
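The selection-frequency ranking can be sketched as follows, assuming each cross-validation run yields a list of selected connections identified by ROI index pairs. The function name `selection_frequency`, the tie-breaking rule, and the toy runs are illustrative assumptions.

```python
from collections import Counter

def selection_frequency(selected_per_run):
    """Fraction of runs in which each feature was selected by LASSO."""
    counts = Counter()
    for selected in selected_per_run:
        counts.update(set(selected))      # count each feature once per run
    n_runs = len(selected_per_run)
    return {feat: c / n_runs for feat, c in counts.items()}

# Toy example: 4 runs, features identified by (ROI i, ROI j) pairs.
runs = [
    [(0, 1), (2, 3)],
    [(0, 1)],
    [(0, 1), (1, 4)],
    [(2, 3), (0, 1)],
]
freq = selection_frequency(runs)
# Rank by descending frequency, breaking ties by feature index.
top = sorted(freq, key=lambda f: (-freq[f], f))[:2]
```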

Figure

Connectogram and involved brain regions of the top 10 discriminative connections selected by our framework in

ROIs selected from

PreCG | Precentral gyrus | ||

SFGmed | Superior frontal gyrus (medial) | ||

REC | Rectus gyrus | ||

DCG | Middle cingulate gyrus | ||

TPOmid | Temporal pole (middle) | ||

VI-VER | Lobule VI of vermis | III-VER | Lobule III of vermis |

III-Cb | Lobule III of cerebellar hemisphere |

From the results shown in Figure

In this article, we proposed extracting multi-level high-order FC networks, derived from rs-fMRI, to capture the high-order correlation across different brain regions for ASD diagnosis. This is based on our hypothesis that different pairs of brain regions could influence each other, and their high-order correlations could contain more important discriminative information for ASD diagnosis, which is actually consistent with previous works, i.e., in Chen et al. (

Experimental results have shown that (1) high-order FC networks indeed include crucial discriminant information for ASD diagnosis, and (2) the combination of different order FC networks, especially

Lastly, it should be noted that we used a simple feature selection method, so the selected features may still include redundant information, which could affect classification accuracy. Accordingly, strategies for discriminative feature selection and fusion will be investigated in our future work. In addition, LASSO regression tends to select only one feature from a group of highly correlated features. In the context of diagnosis, this means that features that are valuable for discrimination might nevertheless be discarded during feature selection because of multi-collinearity in the data matrix. In this work, we mainly followed the lead of previous studies (Jin et al.,

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

This work was supported in part by the National Natural Science Foundation of China (61773244, 61373079, 61672327, 61771230), the Provincial Natural Science Foundation of Shandong in China (ZR2015FL019, ZR2016FM40), Shandong Provincial Key Research and Development Program of China (2017CXGC0701), and the National Institutes of Health in USA (EB022880).
