Abstract

In the TDApplied package vignette “TDApplied Theory and Practice” simulated data is used to provide examples of package function. In this vignette we will demonstrate that TDApplied can carry out meaningful analyses of real (i.e. non-simulated) data which other software packages cannot. We analyzed data from a very well-known neurological dataset, and identified topological features of neurological computation in single and multiple subjects which were correlated with (i) the task the subjects were performing while the data was collected, and (ii) the behavior (i.e. reaction time) of the subjects during the task – these correlations suggest that the features are meaningful in the context of an neuroimaging analysis. Moreover, these features were identified, interpreted and analyzed with the bootstrap_persistence_diagram, vr_graphs and diagram_kpca functions, only the first of which has any implementation in another R package (TDA). TDApplied is therefore a powerful tool for applied topological analyses of data.

Introduction

A popular technology for studying neural function is called functional magnetic resonance imaging (fMRI), in which oxygenated blood-flow across the brain is detected via magnetic resonance over multiple time points; fMRI is a proxy measurement of neural activity. Spatial activity patterns, i.e. vectors of measured values across space in a single time point, are modulated by performing tasks. The study design is the sequence of temporal blocks of performing these tasks and each task type is called a condition, and a study design can evoke meaningful information about neural processing related to tasks. Collections of spatial activity patterns (for example the spatial patterns evoked by a particular task over multiple time points) have previously been analyzed with topological and geometric techniques which capture their global structural features (J. Shine 2019; Saggar et al. 2018, 2021; X. Liu, Chang, and Duyn 2013). However, these analyses are not designed to capture the spatially periodic features of fMRI data which we would expect to exist in abundance (Greve et al. 2013; T. Liu 2016; Caballero-Gaudes and Reynolds 2017). One persistent homology analysis of fMRI spatial pattern data found robust 0-dimensional topological features (i.e. clusters of time points with similar spatial patterns) whose persistence values negatively correlated with fluid intelligence (Anderson et al. 2018). Larger differences between spatial patterns at different time points generally corresponded to lower values of fluid intelligence, but higher-dimensional topological features such as loops were not considered in that analysis.

In this exploratory analysis, using the R package TDApplied, we utilized persistent homology to find, in one subject’s fMRI data in an emotion task, task-related signal in the form of a spatial loop. We then linked the loop back to the subject’s raw data to interpret what neurological features the loop represented. Finally we showed that topological features in 100 subject’s emotion task fMRI data were correlated with behavior (i.e. the subject’s response time to certain task blocks). While neuroimaging researchers would not consider a single loop within one subject “real” or “significant” without finding a similar loop across multiple subjects, the task of optimally matching loops between datasets is an open problem, so we leave the problem of finding group-level spatial loops to future work. For our analysis we will use data from the famous Human Connectome Project (M. Glasser 2016b) which contains extensively preprocessed neuroimaging data from roughly 1200 subjects. We focused on the HCP emotion task data, which alternated between two conditions - deciding which of two faces matched another target face in emotion, and deciding which of two shapes matched another target shape. We only analyzed the right-to-left phase encoding scan (this is a parameter of MRI imaging which determines the ordering of when image slices of the brain are obtained) as this was the phase encoding direction for the specific subject loop we analyzed. Also, all fMRI data was projected onto surface nodes – points on a mesh of the brain’s surface geometry – which are more comparable across subjects than standard 3D volumes (M. Glasser 2016b). The script used to perform the analysis can be found in the exec directory of TDApplied. Our analysis demonstrates the potential of using TDApplied for deriving interpretable and otherwise obscured insights from real datasets.

A task-related spatial loop

For HCP subject 103111 we calculated a persistence diagram by analyzing the time-point by time-point correlation matrix [ρ_i, j] of spatial patterns. A distance matrix was computed by the transformation $\rho_{i,j} \rightarrow \sqrt{2(1 - \rho_{i,j})}$ (see the appendix for details), and the bootstrap procedure (Fasy et al. 2014) found a single significant loop. We plotted the VR graph (Zomorodian 2010) of the loop representative with ϵ being the birth value (with nodes labelled by their time points), and the VR graph of the whole dataset at this ϵ (with nodes in the representative cycle colored red):

The VR graphs of (left) just the representative cycle time points, and (right) all time points, with both epsilon scales at the loop birth value.

Two things are apparent from these plots – the first is that the most persistent loop is the first task block (based on the time points in that block), and the second is that the dataset mainly forms two major loops in a figure eight. The secondary loop was the second most persistent loop in the diagram, and was comprised of (almost) all other time points. We found that physiology (i.e. breathing, measured in the HCP dataset as the pressure exerted by the subject’s abdomen on the sensor belt, and averaged for each graph node) did not account for the two-loop structure, whereas task structure (measured by time-since-last-block – i.e. the length of time since the most recent task block started) did. In these graphs, red represents high values, pink middle-high, white middle, light blue middle-low, blue low and black missing.

VR graph of all time points, colored by (left) mean respiration and (right) time-since-last-block.

The secondary loop seemed task-related when we colored its VR graph by time-since-last-block (i.e. the length of time since the most recent task block started), with clusters of similarly-colored time points (i.e. nodes) and smooth gradients at various points around the loop:

VR graph of the secondary loop, colored by time-since-last-block.

Linking the secondary loop to raw data

We next linked the secondary loop back to the fMRI data from which it came. From the 2D layout of its VR graph we computed a θ variable (angle around the loop, between −π and π) and an r variable (distance to the origin). Using linear models of the form A_i = β₁cos (θ) + β₂sin (θ) + β₃ and A_i = β₄r + β₅ for the activity A of surface node i, we found that out of the total 91282 surface nodes, the activity of 1475 had a significant relationship with either cos (θ) or sin (θ), and the activity of 53 had a significant relationship with r (significance thresholding was done at the Bonferroni level of approximately 0.05/(2 * 91282) = 2.74 * 10⁻⁷). These two sets of nodes only shared one common node.

Here are surface plots of the nodes for the left hemisphere, generated using python, responding to θ and r:

Surface nodes whose activity was significantly correlated with (left) theta and (right) r.

A large cluster of surface nodes responding to θ was found in the left hemisphere (and not in the right hemisphere), which appeared to belong to the brain region Pol1 in the insular cortex. Interestingly, Pol1 was not found to show significant differences in activity between the faces and shapes conditions (M. Glasser 2016a), despite the implication of the insular cortex in emotional processing (Gogolla 2017). This finding suggests that spatial loops of fMRI data may contain complementary information to typical task-based analyses of fMRI data.

A t-test then found a significant difference in mean r values between shape and face blocks:

Boxplot of r values in shape and face blocks.

Linking topology to behavior

The main goal of using a summary statistic of neurological data, such as persistence diagrams of spatial activity patterns, is to correlate the statistic with behavior. In the HCP dataset, mean reaction time (in milliseconds) was recorded for all subjects and all tasks, so we checked whether topology was predictive of reaction time for the emotion task. We selected 100 subjects at random, computed their emotion right-to-left phase encoding data persistence diagrams, and computed a one dimensional kernel PCA embedding. We then plotted the relationship between the first embedding dimension and reaction time to shape blocks, which had a correlation of about 0.31:

A 2D scatterplot, whose x-axis is the 1D PCA embedding coordinates of the 100 subject’s emotion persistence diagrams and whose y-axis is the 100 subject’s mean response times in the shape blocks trials.

We then computed statistical significance by Fisher-transforming the sample correlation of 0.31 to obtain a test-statistic of about 0.32, and compared this statistic to a normal null distribution with mean 0 and standard error $1/\sqrt{100-3} \approx 0.1$ (Fisher et al. 1936). The resulting p-value, for a two-sided null hypothesis, would have been less than 0.002, suggesting a significant positive correlation between the topology of neural activity spatial patterns and subject behavior.

Conclusion

We analyzed loops of spatial pattern correlations in HCP emotion fMRI data using TDApplied, and were able to visualize and interpret significant loops of an individual subject, relating them to the study design. Such a result implies a complex structure for the representation of the conditions, a complexity that cannot be captured by linear models assuming clustered distributions. Moreover, our topology embedding, enabled with this software and approach, allowed us to discover a relationship between fMRI patterns and reaction time – effectively, capturing a complex brain-behavior relationship. These results point towards the potential usage of 1-dimensional topology in finding task and/or behavior relationships with fMRI or other neuroimaging modalities, again implying that the shape of neuroimaging data may not be completely captured by traditional analysis methods. Moreover, our analysis hinged on the usage of several functions which are (largely) unique to TDApplied. These results show that TDApplied is the most flexible and useful tool for carrying out meaningful analyses of data.

Appendix: Converting Correlations to Distances

In our analysis of HCP data we converted correlation values to correlation distance values using the formula $\rho \rightarrow \sqrt{2*(1-\rho)}$. To see why this transformation does produce distance values, let X and Y be two vectors (of the same length n) with 0 mean and unit variance. Then $\rho(X,Y) = \frac{Cov(E,Y)}{\sigma_{x}\sigma_{y}} = \frac{E[(X-0)(Y-0)]}{(1)(1)} = E[XY] = \frac{\sum_{i = 1}^{n}X_i Y_i}{n}$. The Euclidean distance of X and Y is therefore $d(X,Y) = \sqrt{\sum_{i=1}^{n}(X_i-Y_i)^2} = \sqrt{\sum_{i=1}^{n}X_i^2 + \sum_{i=1}^{n}Y_i^2 -2\sum_{i=1}^{n}X_i Y_i} = \sqrt{n + n - 2\rho(X,Y)} = \sqrt{2n(1-\rho(X,Y))} \propto \sqrt{2(1-\rho(X,Y))}$. Therefore, since a scaled distance metric (by a positive number like $\sqrt{n}$) is also a distance metric (this is very easy to verify), the transformation $\rho \rightarrow \sqrt{2*(1-\rho)}$ indeed gives distance values. Note that in (Anderson et al. 2018) a proportional transformation was used $\rho \rightarrow \sqrt{1-\rho} \propto \sqrt{2(1-\rho)}$ and therefore our results are consistent with those found in that work.

References

Anderson, Keri L., Jeffrey S. Anderson, Sourabh Palande, and Bei Wang. 2018. “Topological Data Analysis of Functional MRI Connectivity in Time and Space Domains.” In Connectomics in NeuroImaging, edited by Guorong Wu, Islem Rekik, Markus D. Schirmer, Ai Wern Chung, and Brent Munsell, 67–77. Cham: Springer International Publishing.

Caballero-Gaudes, Cesar, and Richard C. Reynolds. 2017. “Methods for Cleaning the BOLD fMRI Signal.” NeuroImage 154: 128–49. https://doi.org/https://doi.org/10.1016/j.neuroimage.2016.12.018.

Fasy, Brittany, Fabrizio Lecci, Alessandro Rinaldo, Larry Wasserman, Sivaraman Balakrishnan, and Aarti Singh. 2014. “Confidence Sets for Persistence Diagrams.” The Annals of Statistics 42 (March): 2301–39.

Fisher, Ronald Aylmer et al. 1936. “Statistical Methods for Research Workers.” Statistical Methods for Research Workers., no. 6th Ed.

Gogolla, Nadine. 2017. “The Insular Cortex.” Current Biology 27 (12): R580–86. https://doi.org/https://doi.org/10.1016/j.cub.2017.05.010.

Greve, D., G. Brown, B. Mueller, G. Glover, T. Liu, and Function Biomedical Research Network. 2013. “A Survey of the Sources of Noise in fMRI.” Psychometrika 78: 396–416.

J. Shine, et al. 2019. “Human Cognition Involves the Dynamic Integration of Neural Activity and Neuromodulatory Systems.” Nature Neuroscience 22.

Liu, T. 2016. “Noise Contributions to the fMRI Signal: An Overview.” NeuroImage 143: 141–51.

Liu, Xiao, Catie Chang, and Jeff Duyn. 2013. “Decomposition of Spontaneous Brain Activity into Distinct fMRI Co-Activation Patterns.” Frontiers in Systems Neuroscience 7: 101. https://doi.org/10.3389/fnsys.2013.00101.

M. Glasser, et al. 2016a. “A Multi-Modal Parcellation of Human Cerebral Cortex.” Nature 536: 171–78.

———. 2016b. “The Human Connectome Project’s Neuroimaging Approach.” Nature Neuroscience 19: 1175–87.

Saggar, Manish, James M. Shine, Raphaël Liégeois, Nico U. F. Dosenbach, and Damien Fair. 2021. “Precision Dynamical Mapping Using Topological Data Analysis Reveals a Unique Hub-Like Transition State at Rest.” bioRxiv. https://doi.org/10.1101/2021.08.05.455149.

Saggar, Manish, Olaf Sporns, Javier Gonzalez-Castillo, Peter A. Bandettini, Gunnar E. Carlsson, Gary H. Glover, and Allan L. Reiss. 2018. “Towards a New Approach to Reveal Dynamical Organization of the Brain Using Topological Data Analysis.” Nature Communications 9.

Zomorodian, Afra. 2010. “The Tidy Set: A Minimal Simplicial Set for Computing Homology of Clique Complexes.” In Proceedings of the Twenty-Sixth Annual Symposium on Computational Geometry, 257–66. SoCG ’10. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/1810959.1811004.

Human Connectome Project Analysis