Edited by: Jian Kang, Emory University, USA

Reviewed by: Tingting Zhang, University of Virginia, USA; Linglong Kong, University of Alberta, Canada

*Correspondence: Dzung L. Pham

This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Regional analysis of volumes examined in normalized space (RAVENS) images are transformation images used in the study of brain morphometry. In this paper, RAVENS images are analyzed using a longitudinal variant of voxel-based morphometry (VBM) and longitudinal functional principal component analysis (LFPCA) for high-dimensional images. We demonstrate that the latter overcomes the limitations of standard longitudinal VBM analyses, which do not separate registration errors from other longitudinal changes and baseline patterns. This is especially important in contexts where longitudinal changes are only a small fraction of the overall observed variability, which is typical in normal aging and many chronic diseases. Our simulation study shows that LFPCA effectively separates registration error from baseline and longitudinal signals of interest by decomposing RAVENS images measured at multiple visits into three components: a subject-specific imaging random intercept that quantifies the cross-sectional variability, a subject-specific imaging slope that quantifies the irreversible changes over multiple visits, and a subject-visit-specific imaging deviation. We describe strategies to identify baseline/longitudinal variation and registration errors combined with covariates of interest. Our analysis suggests that specific regional brain atrophy and ventricular enlargement are associated with multiple sclerosis (MS) disease progression.

Magnetic resonance imaging (MRI) is commonly used in the study of brain structure. Many studies are based on measurements of tissue volumes within a number of predefined regions of interest (ROIs); for example, see Bartzokis et al. (

Voxel-based morphometry (VBM) is a complementary technique that measures local brain volumes in a normalized space and thus does not suffer from these limitations (Ashburner and Friston,

In many disease studies,

In practice, VBM frequently fails to find a significant longitudinal trend. Possible causes are that (1) the chosen statistical method is not sophisticated enough to extract longitudinal information; (2) visit-to-visit variation is substantial relative to the longitudinal signal; or (3) heterogeneous longitudinal patterns exist within the disease population.

The obvious way to overcome such limitations is to combine the VBM analysis with more sophisticated statistical methods such as linear mixed models. However, in the latter two cases, hypothesis-driven VBM analyses cannot further exploit the data. It is then of interest to uncover the underlying structure of variation in the longitudinal data. Further, we want to quantify the longitudinal and cross-sectional variability, and the association between each subject and their spatial patterns.

Thus, our main goal is to introduce a new statistical framework for longitudinal VBM analysis. To achieve the goal, we consider a data-driven analysis to provide a more complete statistical framework to analyze high-dimensional longitudinal brain images. A framework to allow for this conceptual partition of variability is longitudinal functional principal component analysis (LFPCA; Greven et al.,

In this paper, we focus on LFPCA as a useful tool for longitudinal voxel-based analyses, particularly to quantify cross-sectional and longitudinal variability in the data. The simulation study illustrates the application of LFPCA to a simplified imaging setting. It demonstrates that LFPCA effectively separates longitudinal, cross-sectional, and other variations. Notably, the simulation study shows that LFPCA can separate registration errors from baseline and longitudinal components of interest.

Forty-eight MS patients (aged 42±12 years at baseline) were enrolled in a longitudinal study of brain volume change. The study population included 33 female and 16 male patients; 28 patients with relapsing-remitting MS (RRMS), 13 patients with secondary progressive MS (SPMS), 5 patients with primary progressive MS (PPMS) and 2 patients with clinically isolated syndrome (CIS). One hundred forty-eight T1 images were acquired, with three images per subject for 44 subjects and four images per subject for 4 subjects. The average time interval between scans was 368 days (±27). All images were spatially normalized via registration of T1 maps into the mean template, generated using Advanced Normalization Tools (Avants et al.,

High-resolution 3D magnetization-prepared rapid acquisition of gradient echoes (MPRAGE) images (acquired resolution: 1.1 × 1.1 × 1.1 mm; TR: ~10 ms; TE: 6 ms; TI: 835 ms; flip angle: 8°; SENSE factor: 2; averages: 1) were acquired on a 3.0 T MRI scanner (Intera, Philips Medical Systems).

In the processing, the follow-up images are affinely registered to their baselines via FMRIB's Linear Image Registration Tool (Jenkinson et al.,

In this section, we provide a description of the original LFPCA approach developed by Greven et al. (

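Consider a longitudinal brain imaging study with subjects labeled by index $i = 1, \ldots, I$, where subject $i$ is observed at times $T_{ij}$ for $j = 1, \ldots, J_i$. Each image is unfolded into a $p \times 1$ vector $Y_{ij}(v)$, $v = 1, \ldots, p$. The LFPCA model is

$$Y_{ij}(v) = \eta(v, T_{ij}) + X_{i,0}(v) + X_{i,1}(v)\,T_{ij} + W_{ij}(v), \qquad (2.1)$$

where $\eta(v, T_{ij})$ is a fixed main effect, and $X_{i,0}(v)$, $X_{i,1}(v)$, and $W_{ij}(v)$ are the subject-specific random imaging intercept, the subject-specific random imaging slope, and the subject-visit-specific imaging deviation, respectively. The processes $X_i(v) = (X_{i,0}(v), X_{i,1}(v))$ and $W_{ij}(v)$ are assumed to be zero-mean and uncorrelated with each other.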

While this is a natural and relatively simple model for longitudinally observed data, the scale of the problem requires aggressive dimensionality reduction. LFPCA reduces dimensionality by projecting onto the subspaces that explain the principal directions of variation in the data. In model (2.1), there are two sources of variation: subject-to-subject, captured by $X_i$, and visit-to-visit within a subject, captured by $W_{ij}$. The model assumption on $X_i$ and $W_{ij}$ in (2.1) allows us to partition the variation of the data, and LFPCA models the latent processes $X_i$ and $W_{ij}$ using a Karhunen-Loeve (K-L) expansion (Karhunen,

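The K-L expansion decomposes the two latent processes as

$$X_i(v) = \sum_{k=1}^{\infty} \xi_{ik}\,\phi_k^{X}(v), \qquad W_{ij}(v) = \sum_{l=1}^{\infty} \zeta_{ijl}\,\phi_l^{W}(v),$$

where $\phi_k^{X}(v) = (\phi_k^{X,0}(v), \phi_k^{X,1}(v))$ and $\phi_l^{W}(v)$ are the eigenfunctions of the covariance operators of $X_i$ and $W_{ij}$, respectively.

LFPCA truncates K-L representations and represents observed data through a linear mixed-effects model:

$$Y_{ij}(v) = \eta(v, T_{ij}) + \sum_{k=1}^{N_X} \xi_{ik}\,\phi_k^{X,0}(v) + T_{ij} \sum_{k=1}^{N_X} \xi_{ik}\,\phi_k^{X,1}(v) + \sum_{l=1}^{N_W} \zeta_{ijl}\,\phi_l^{W}(v), \qquad (2.2)$$

$$(\xi_{ik_1}, \xi_{ik_2}) \sim (0, 0; \lambda_{k_1}^{X}, \lambda_{k_2}^{X}, 0), \qquad (\zeta_{ijl_1}, \zeta_{ijl_2}) \sim (0, 0; \lambda_{l_1}^{W}, \lambda_{l_2}^{W}, 0),$$

where "$\sim (\mu_1, \mu_2; \sigma_1^2, \sigma_2^2, \rho)$" denotes a pair of variables with mean $(\mu_1, \mu_2)$, variance $(\sigma_1^2, \sigma_2^2)$, and correlation $\rho$. The variances are ordered so that $\lambda_{k_1} \geq \lambda_{k_2}$ if $k_1 \leq k_2$. Since $X_i(v)$ and $W_{ij}(v)$ are uncorrelated, the scores $\{\xi_{ik}\}$ and $\{\zeta_{ijl}\}$ are also uncorrelated. The numbers of retained components $N_X$ and $N_W$ are expected to be small in most applications.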

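For the unfolded vector, (2.2) can be rewritten as

$$Y_{ij} = \eta_{ij} + \Phi_{X,0}\,\xi_i + T_{ij}\,\Phi_{X,1}\,\xi_i + \Phi_W\,\zeta_{ij},$$

where $Y_{ij}$ and $\eta_{ij}$ are $p \times 1$ vectors, $\Phi_{X,0}$ and $\Phi_{X,1}$ are the $p \times N_X$ matrices whose columns are the baseline and longitudinal parts of the subject-specific eigenvectors, $\Phi_W$ is the $p \times N_W$ matrix of subject-visit-specific eigenvectors, and $\xi_i$ and $\zeta_{ij}$ are the corresponding vectors of scores.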

In brain imaging data analysis, LFPCA can separate biological signals from non-biological artifacts. For example, registration errors due to structural differences between subjects can be captured by the baseline subject-specific components, and visit-specific registration errors by the subject-visit-specific components $\Phi_W$. This will be illustrated via an extensive simulation experiment in Section 3.1.
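As a rough illustration of the truncated model (2.2), the following sketch generates images as the sum of a subject-specific intercept, a subject-specific slope multiplied by the visit time, and a subject-visit-specific deviation. All sizes, variances, and variable names are hypothetical choices made for this example, the fixed effect is set to zero, and the eigenvectors are random orthonormal vectors rather than anything estimated from data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
p, n_subj, n_visits = 30, 20, 3      # voxels, subjects, visits per subject
n_x, n_w = 2, 2                      # N_X and N_W: retained components

# Subject-specific eigenvectors: a 2p x N_X matrix with orthonormal columns,
# split into a baseline part (first p rows) and a longitudinal part (last p rows).
phi_x, _ = np.linalg.qr(rng.standard_normal((2 * p, n_x)))
phi_x0, phi_x1 = phi_x[:p], phi_x[p:]
# Subject-visit-specific eigenvectors: p x N_W with orthonormal columns.
phi_w, _ = np.linalg.qr(rng.standard_normal((p, n_w)))

lam_x = np.array([4.0, 2.0])         # decreasing score variances for X (assumed)
lam_w = np.array([1.0, 0.5])         # decreasing score variances for W (assumed)

times = np.tile(np.arange(n_visits, dtype=float), (n_subj, 1))        # T_ij
xi = rng.standard_normal((n_subj, n_x)) * np.sqrt(lam_x)              # xi_i
zeta = rng.standard_normal((n_subj, n_visits, n_w)) * np.sqrt(lam_w)  # zeta_ij

# Y_ij = Phi_{X,0} xi_i + T_ij Phi_{X,1} xi_i + Phi_W zeta_ij (fixed effect = 0).
intercept = xi @ phi_x0.T            # (n_subj, p)
slope = xi @ phi_x1.T                # (n_subj, p)
deviation = zeta @ phi_w.T           # (n_subj, n_visits, p)
Y = intercept[:, None, :] + times[:, :, None] * slope[:, None, :] + deviation
```

Each `Y[i, j]` is one unfolded image. Because the same score vector `xi[i]` multiplies both the baseline and the longitudinal part, baseline and longitudinal patterns are coupled within each subject-specific component.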

The fixed effect η(_{ij}) can be estimated in a number of ways (Greven et al., _{ij}) is estimated by the sample mean

Zipunnikov et al. (

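where $\delta_{i,j} = 1$ if $i = j$ and $\delta_{i,j} = 0$ otherwise. Model (2.4) can be rewritten in terms of unfolded vectors as $E(Y^{v}_{ij_1j_2}) = K^{v} f_{ij_1j_2}$, where $Y^{v}_{ij_1j_2} = \mathrm{vec}(Y_{ij_1} Y_{ij_2}^{\top})$ is a $p^2 \times 1$ vector, $K^{v} = [\mathrm{vec}(K^{00}), \mathrm{vec}(K^{01}), \mathrm{vec}(K^{10}), \mathrm{vec}(K^{11}), \mathrm{vec}(K^{W})]$ is a $p^2 \times 5$ matrix with $K^{kl}$ the covariance between $X_{i,k}$ and $X_{i,l}$, and $f_{ij_1j_2} = (1, T_{ij_2}, T_{ij_1}, T_{ij_1}T_{ij_2}, \delta_{j_1,j_2})^{\top}$. The matrix $K^{v}$ can be unbiasedly estimated by using ordinary least squares (OLS):

$$\hat{K}^{v} = Y^{v} F^{\top} (F F^{\top})^{-1},$$

where the columns of $Y^{v}$ and $F$ collect $Y^{v}_{ij_1j_2}$ and $f_{ij_1j_2}$ over all subjects and all pairs of visits.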
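The moment-based OLS step can be sketched in a small-dimensional setting as follows. This is an illustration only: the function name `estimate_covariances`, the array layout, and the covariate order `(1, T_ij2, T_ij1, T_ij1*T_ij2, delta)` are assumptions made here, the images are assumed to be demeaned already, and forming $p^2$-dimensional products directly is feasible only for small $p$.

```python
import numpy as np

def estimate_covariances(Y, T):
    """OLS moment estimates of K00, K01, K10, K11 and KW.

    Y : (I, J, p) array of demeaned images; T : (I, J) array of visit times.
    Returns a list of five p x p arrays.
    """
    n_subj, n_visits, p = Y.shape
    responses, covariates = [], []
    for i in range(n_subj):
        for j1 in range(n_visits):
            for j2 in range(n_visits):
                # Response: unfolded outer product of two images of subject i.
                responses.append(np.outer(Y[i, j1], Y[i, j2]).ravel())
                # Covariates multiplying K00, K01, K10, K11 and KW.
                covariates.append([1.0, T[i, j2], T[i, j1],
                                   T[i, j1] * T[i, j2], float(j1 == j2)])
    Yv = np.array(responses).T                     # p^2 x m
    F = np.array(covariates).T                     # 5 x m
    Kv = np.linalg.solve(F @ F.T, F @ Yv.T).T      # p^2 x 5, equals Yv F'(F F')^{-1}
    return [Kv[:, c].reshape(p, p) for c in range(5)]

# Hypothetical usage with random data (3 visits are needed for F F' to be invertible
# when all subjects share the same visit times).
rng = np.random.default_rng(0)
Y_demo = rng.standard_normal((8, 3, 4))
T_demo = np.tile(np.array([0.0, 1.0, 2.0]), (8, 1))
K00, K01, K10, K11, KW = estimate_covariances(Y_demo, T_demo)
```

The subject-specific covariance operator can then be assembled from the blocks `K00`, `K01`, `K10`, and `K11`, while `KW` plays the role of the subject-visit-specific covariance operator.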

The covariance operators ^{X} and ^{W} are 2_{i} dimensional matrix

By multiplying with ^{⊤} to the left, we have

We estimate ^{⊤} in Equation (2.5) reduces the model to its low-dimensional form (2.6), without losing the original correlation structure of the data. Once inference is conducted in model (2.6), then quantities of interest from model (2.5) can be estimated by pre-multiplying Equation (2.6) by
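The idea of reducing the model to a low-dimensional form by a left multiplication, and mapping results back by pre-multiplying again, can be sketched as follows. This sketch assumes that the projection matrix is the matrix `V` of left singular vectors from a thin singular value decomposition of the data matrix; that choice, the sizes, and all variable names are assumptions made for illustration, since the definition is not recoverable from the text here.

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 5000, 12                       # many voxels, few images (hypothetical sizes)
Y = rng.standard_normal((p, n))       # columns are unfolded, demeaned images

# Thin SVD: Y = V diag(s) U^T, with V of size p x n and orthonormal columns.
V, s, Ut = np.linalg.svd(Y, full_matrices=False)

# Left multiplication by V^T maps each image to n coordinates.
Y_low = V.T @ Y                       # n x n

# Inner products between images, and hence their correlation structure,
# are unchanged by the projection.
gram_full = Y.T @ Y
gram_low = Y_low.T @ Y_low

# An eigenanalysis carried out in the low-dimensional space ...
evals, evecs_low = np.linalg.eigh(Y_low @ Y_low.T / n)
# ... is mapped back to image space by pre-multiplying with V.
evecs_full = V @ evecs_low            # p x n, orthonormal columns
```

The eigenanalysis involves only an `n x n` matrix, so its cost depends on the number of images rather than on the number of voxels.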

Principal scores ξ_{i} and ζ_{ij} are estimated via Best Linear Unbiased Predictions (BLUPs) as follows. The stacked vector of _{Ji} is a _{i} × 1 vector of ones. Then the scores can be estimated as

The subject-specific principal component scores ξ_{i} are composite scores computed for each linear trajectory from the eigenvectors of the subject-specific PCs. These scores can be used as predictors or outcomes in subsequent regression analyses to evaluate relationships between high-dimensional longitudinal trajectories and other variables of interest. We can also apply cluster analysis to the scores to uncover latent structure in the sample.
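To make the score estimation step concrete, the sketch below recovers the scores of one subject by least squares on the stacked images. This is a simplified stand-in and not the BLUP expression used in the paper: it assumes that the eigenvector matrices are known, that the images are demeaned, that there is no noise beyond the model terms, and that the stacked design matrix has full column rank, in which case least squares recovers the scores exactly. The function name `estimate_scores` and its arguments are hypothetical.

```python
import numpy as np

def estimate_scores(Yi, Ti, phi_x0, phi_x1, phi_w):
    """Least-squares scores for one subject.

    Yi : (J, p) demeaned images of the subject; Ti : (J,) visit times.
    phi_x0, phi_x1 : (p, N_X) baseline and longitudinal eigenvector parts.
    phi_w : (p, N_W) subject-visit-specific eigenvectors.
    Returns (xi, zeta) with shapes (N_X,) and (J, N_W).
    """
    n_visits = Yi.shape[0]
    n_x, n_w = phi_x0.shape[1], phi_w.shape[1]
    # Design for xi: visit j contributes phi_x0 + T_ij * phi_x1.
    Bx = np.vstack([phi_x0 + t * phi_x1 for t in Ti])      # (J*p, N_X)
    # Design for zeta: block diagonal, one copy of phi_w per visit.
    Bw = np.kron(np.eye(n_visits), phi_w)                  # (J*p, J*N_W)
    B = np.hstack([Bx, Bw])
    # Stack the J images into one long vector, visit by visit.
    omega, *_ = np.linalg.lstsq(B, Yi.reshape(-1), rcond=None)
    return omega[:n_x], omega[n_x:].reshape(n_visits, n_w)
```

The returned `xi` is one row of a subjects-by-components score matrix, which is the object that would enter a subsequent regression or cluster analysis.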

First, we applied traditional VBM analysis using a linear mixed model to find a longitudinal trend. Many previous longitudinal studies have applied pairwise comparisons between two time points (Driemeyer et al., ^{0}(^{1}(_{ij}(

In this section, we present a simulation study to test the performance of LFPCA in RAVENS-VBM analysis. We investigate whether LFPCA can distinguish subject-specific signals from noise, particularly registration errors, which often dominate signals in VBM analyses. We also examine whether it can identify cross-sectional and longitudinal variation when both exist.

We design a simulation study to mimic longitudinal analysis of RAVENS images. For the purpose of illustration, we use 2D images with 200 × 200 = 40,000 pixels. We generate images from 50 subjects (

Each image mimics four canonical brain structures: background (B), white matter (W), ventricles (V), and gray matter (G). Those four components are simplified and shown as a background, a big square, a small square inside the big square, and a rectangle at the bottom, respectively. Registration errors are introduced via random rigid shifts of simulated structures as described below.
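A phantom of this kind can be sketched as follows. The coordinates and sizes of the structures, the range of the shifts, the growth of the small square across visits, and the function name `make_phantom` are all hypothetical choices for illustration, not the settings of the simulation study. In particular, the sketch shifts the whole phantom by an integer number of pixels to stand in for a registration error.

```python
import numpy as np

def make_phantom(shift=(0, 0), ventricle_half_width=15, size=200):
    """2D phantom with labels 0 = background (B), 1 = white matter (W),
    2 = ventricles (V), 3 = gray matter (G)."""
    img = np.zeros((size, size), dtype=np.int8)
    dy, dx = shift
    h = ventricle_half_width
    img[40 + dy:140 + dy, 50 + dx:150 + dx] = 1                   # big square
    img[90 - h + dy:90 + h + dy, 100 - h + dx:100 + h + dx] = 2   # small square inside
    img[155 + dy:175 + dy, 50 + dx:150 + dx] = 3                  # rectangle at the bottom
    return img

rng = np.random.default_rng(0)
images = []
for subject in range(3):
    for visit in range(3):
        # Random rigid shift of up to 3 pixels in each direction.
        shift = tuple(rng.integers(-3, 4, size=2))
        # Assumed longitudinal change: the small square grows with each visit.
        images.append(make_phantom(shift, ventricle_half_width=15 + visit))

# Each row is one unfolded image of 200 x 200 = 40,000 pixels.
Y = np.stack([im.ravel() for im in images])
```

Stacking the unfolded images row by row gives the kind of image-by-pixel matrix on which a longitudinal decomposition would operate.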

In Figure

Figure _{X}). The baseline components (


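One useful feature of LFPCA is that the contributions of the longitudinal and baseline components within each subject-specific component can be quantified on a [0, 1] scale. A subject-specific eigenvector is the stacked vector of baseline and longitudinal components: $\phi_k^{X} = ((\phi_k^{X,0})^{\top}, (\phi_k^{X,1})^{\top})^{\top}$. Because each eigenvector has unit norm, $\|\phi_k^{X,0}\|^2 + \|\phi_k^{X,1}\|^2 = 1$, so the two squared norms give the proportions of the $k$th component attributable to baseline and longitudinal variation, respectively.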
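The [0, 1] quantification can be sketched in a few lines, assuming the subject-specific eigenvectors are stored as columns of a matrix whose first half of rows is the baseline part and whose second half is the longitudinal part. The function name and the storage layout are assumptions for this example. The shares are normalized by the total squared norm so that they also sum to one when a column is not exactly of unit norm.

```python
import numpy as np

def baseline_longitudinal_shares(phi_x):
    """Shares of each stacked eigenvector due to its baseline and longitudinal parts.

    phi_x : (2p, N_X) array; rows [:p] are baseline, rows [p:] are longitudinal.
    Returns (baseline_share, longitudinal_share), each of shape (N_X,).
    """
    p = phi_x.shape[0] // 2
    total = np.sum(phi_x ** 2, axis=0)
    baseline = np.sum(phi_x[:p] ** 2, axis=0) / total
    longitudinal = np.sum(phi_x[p:] ** 2, axis=0) / total
    return baseline, longitudinal

# Toy example with p = 2: one unit-norm eigenvector (0.6, 0, 0.8, 0).
phi_demo = np.array([[0.6], [0.0], [0.8], [0.0]])
base_share, long_share = baseline_longitudinal_shares(phi_demo)
```

In the toy example the baseline part has squared norm 0.36 and the longitudinal part 0.64, so the component would be described as predominantly longitudinal.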

An advantage of LFPCA is its ability to couple baseline and longitudinal variation. The longitudinal component is added to the baseline with the time used as a multiplicative weight. Figure

Figure: subject-specific components (Φ_{X}) at time 0 (baseline), 1, 2, and 3

To summarize, our simulation studies demonstrate the power and flexibility of LFPCA to address some of the key challenges of brain imaging. In particular, LFPCA estimated and separated longitudinal and cross-sectional variation in a complex imaging simulation design with registration errors. The main part of the analysis can be automated and performed robustly with no operator input. We also applied a classical VBM linear mixed-effects model to the simulated data. As expected, the linear mixed-effects model identified a linear trend in the ventricular area (V), but it did not find a significant trend in the other areas (W and G) because of small longitudinal changes in signal and high visit-to-visit variation.

In this section, we apply a standard VBM analysis to the MS cohort described in Section 2. This analysis focuses on the population mean of the longitudinal trend

 | | ^{a} | | ^{b} | | | ^{c} | | |
---|---|---|---|---|---|---|---|---|---|
GM | Atrophy | 112 | −6.34 | 139 | 86 | 67 | 139 | 90 | 68.9 |
 | | 79 | −6.36 | 118 | 158 | 35 | 116 | 158 | 35.0 |
 | | 40 | −5.84 | 122 | 166 | 41 | 121 | 166 | 41.0 |
 | | 28 | −6.13 | 136 | 148 | 80 | 136 | 149 | 79.8 |
 | | 15 | −5.36 | 143 | 167 | 89 | 143 | 166 | 89.3 |
VN | Enlargement | 111 | 5.59 | 161 | 151 | 70 | 157 | 152 | 72.4 |
WM | Enlargement | 154 | 6.30 | 118 | 150 | 76 | 117 | 150 | 75.6 |
 | | 100 | 5.75 | 127 | 111 | 85 | 128 | 111 | 84.8 |
 | Atrophy | 210 | −5.97 | 126 | 83 | 59 | 124 | 83.7 | 57.6 |
 | | 157 | −5.77 | 106 | 98 | 83 | 107 | 98 | 83.5 |
 | | 31 | −5.46 | 118 | 154 | 37 | 118 | 154 | 36.7 |

We present the LFPCA results for ventricular RAVENS images in Figure

Figure: (A) subject-specific components Φ_{X}, (B) subject-visit-specific components Φ_{W}

The first subject-specific LFPC explains 45% of the overall variation, almost completely due to the cross-sectional part. The longitudinal part explains 81% of the variation within the second subject-specific LFPC. Figure

Most of the subject-specific LFPCs are driven by cross-sectional variation, which possibly includes registration errors. The longitudinal changes are mainly captured by the second LFPC, which explains about 8% of the total variation. This explains why traditional VBM using linear mixed models did not find meaningful longitudinal patterns.

Figures

The significant correlation between the first subject-specific score and baseline VN volume (first row, fourth column) confirms that the first component represents baseline variation ($R^2$: 0.9684); that is, a subject with a positive score has larger ventricles at baseline. The scores are also significantly correlated with the subject's baseline age ($R^2$: 0.1402) and with three gray matter ROIs (thalamus, caudate, and putamen).
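An $R^2$ value of this kind can be computed from a vector of component scores and a covariate as sketched below. The sketch assumes that the reported values come from simple linear regressions of a covariate on a score, which is an assumption made here. The data are synthetic and the variable names are hypothetical, not values from the study.

```python
import numpy as np

def r_squared(x, y):
    """Coefficient of determination of the simple linear regression of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    return 1.0 - residuals.var() / y.var()

# Synthetic example: scores for 48 subjects and a covariate related to them.
rng = np.random.default_rng(0)
scores = rng.standard_normal(48)
covariate = 2.0 * scores + rng.standard_normal(48)
r2 = r_squared(scores, covariate)
```

For a simple linear regression, this quantity equals the squared Pearson correlation between the score and the covariate.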

The second subject-specific component scores, which represent longitudinal variation, are more strongly correlated with age ($R^2$: 0.2371) and EDSS ($R^2$: 0.2053) than the first component scores that represent cross-sectional variation. This indicates that the spatial patterns of longitudinal enlargement in ventricles are superior for modeling disease progression and age compared to simple ventricular volume measures.

We have applied a similar analysis to gray matter and white matter RAVENS images. Figure

As described above, LFPCA is a useful dimension reduction tool for high-dimensional longitudinal data. In this section, we illustrated how the LFPC scores can be used in correlation analyses. Further, LFPC scores can be used as predictors or outcomes in regression analyses, classification, or cluster analysis.

In this manuscript, we described and evaluated a coherent methodology for longitudinal volumetric imaging studies based on RAVENS or other methodologies. Our simulation studies demonstrate that LFPCA tightly links the analysis methodology with the morphometric image processing stream. We demonstrated that LFPCA can uncover interesting, yet subtle, directions of longitudinal variation in a case where independent voxel-level investigations fail. Of note, this study represents the first application of the high-dimensional variant of LFPCA to voxel-based morphometric analysis. Related work includes Zipunnikov et al. (

A key insight from the simulation studies is the ability of LFPCA to uncover interesting directions of variation in the presence of errors from registration to a template. Previously, registration errors were handled either by extremely aggressive smoothing during post-registration processing or by improved normalization algorithms. While improved algorithms are certainly a desirable goal, all normalization algorithms must be tuned and suffer from trade-offs (such as between bias and variance). Our results suggest the possibility of employing less aggressive normalization.

The performance of LFPCA depends on the number of subjects, the number of time points, and the time span over which data are collected. When designing imaging studies for LFPCA, obtaining both a large number of subjects and a large number of visits may be challenging. Simulation studies that we conducted while examining LFPCA showed that it performed well as long as there were either many subjects with a small number of visits or a small number of subjects with many visits. We recommend making the time span over which data are collected roughly similar across subjects, and long enough to observe longitudinal changes.

In our study of MS, we found that the majority of variation is concentrated in cross-sectional components. This will likely be true in any study of adults, as head size, brain size, and intracranial volume vary far more across subjects than they change through longitudinal decline, much as adult heights differ far more between individuals than they change over time. It would be of interest to apply LFPCA to developmental populations or to populations with severe progressive brain disorders long after disease onset.

The correlation between subject-specific LFPC scores of ventricles and EDSS indicates that EDSS is more strongly associated with longitudinal ventricular enlargement than with baseline ventricular size. This suggests that ventricular enlargement is a sensitive measure of disease progression. Some cross-sectional MS patient studies have reported that brain atrophy is related to irreversible clinical disability in MS and that ventricular enlargement may be a sensitive marker of this tissue loss, which is seen at all stages of MS (Turner et al.,

In the manuscript, we employed a registration strategy similar to Ashburner and Ridgway (

As demonstrated previously for longitudinal diffusion imaging analysis, and here for longitudinal voxel-based morphometry, LFPCA is a compelling alternative to linear mixed model analysis for exploring spatial patterns of anatomical variation within and across subjects. We emphasize that this approach is not limited to a specific brain modality. Besides neuroimaging, we look forward to seeing this method applied to many other exciting areas, including epigenetics. For example, genome-wide DNA methylation data collected at multiple time points could be analyzed to study mechanisms of epigenetic changes related to certain diseases (Martino et al.,

The LFPCA method described here is designed to model a linear trajectory over time. Given a relatively small number of visits (e.g., three visits on average), it is not feasible to model non-linear trends. However, if the data are collected at more than five time points, modeling non-linear trajectories becomes possible. We are currently developing an extension of LFPCA for non-linear trends modeled using spline functions. Its numerical stability and performance will be investigated in future work.

Research reported in this work was supported by the National Institutes of Health under award numbers R01NS070906, Z01NS003119, K01AG051348, R01HL12407 and R01NS060910. Support for this work included funding from the Department of Defense in the Center for Neuroscience and Regenerative Medicine.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.