Prediction is one of the big challenges in microbiome research. In our lakes, we’d really like to be able to warn people when a cyanobacterial bloom is going to happen, and it’s not too far-fetched to imagine that one day clinicians may look for early indicators of disease in the human microbiome. But even when we describe changes in environmental variables to the best of our abilities, we don’t get as much predictive power as you’d expect from that information. It appears that how the current microbial community is organized may contribute to where the community is headed.
Lab member Cristina came up with a metric called “cohesion” to measure connectedness within communities, which can be used to predict community change. Now, the idea of connectedness is not new – networks are frequently used to investigate interactions between microbes. However, there are a couple of inherent issues with network analysis for microbial communities. One is the “hairball” effect – there are just too many taxa to analyze at once, and you end up with a tangled mess of potential interactions. The other is relative abundance. Sequencing normalizes all samples to roughly the same depth, which makes comparisons easier, but it means we can’t count the absolute numbers of each taxon in the original sample. For example, if the absolute abundances of three taxa A, B, and C were 100, 100, and 100, and you can sequence 30 reads, your sequence data may look like A = 10, B = 10, and C = 10. But suppose those taxa go up or down in absolute abundance – say to A = 100, B = 200, and C = 100, or to A = 50, B = 100, and C = 50. Either way, you would get the same result in relative abundance: A = 7.5, B = 15, and C = 7.5. From the sequence data alone, you can’t tell whether B increased or A and C decreased. You can see how this can be an issue if you’re trying to determine ecological causes of change!
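To make the arithmetic in that three-taxon example concrete, here is a minimal Python sketch. The function name `expected_reads` is made up for illustration, and it assumes reads are split exactly in proportion to absolute abundance, which real sequencing only approximates.

```python
def expected_reads(absolute_counts, depth):
    """Expected reads per taxon if reads are split in proportion to abundance.

    `expected_reads` is a hypothetical helper for this example only.
    """
    total = sum(absolute_counts.values())
    return {taxon: depth * count / total for taxon, count in absolute_counts.items()}


# The three scenarios from the example, each sequenced to 30 reads.
before = expected_reads({"A": 100, "B": 100, "C": 100}, depth=30)
b_doubles = expected_reads({"A": 100, "B": 200, "C": 100}, depth=30)
a_and_c_halve = expected_reads({"A": 50, "B": 100, "C": 50}, depth=30)

print(before)         # A, B, and C each get 10 reads
print(b_doubles)      # A = 7.5, B = 15, C = 7.5
print(a_and_c_halve)  # identical to b_doubles
```

Two very different ecological stories (B blooming, or A and C crashing) give exactly the same sequence data.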
Cristina’s cohesion metric calculates the correlations (both negative and positive) between all pairs of taxa across a set of samples and incorporates the abundance of each taxon to assign values for positive and negative cohesion to each sample. Because you only get two numbers per sample, this avoids the “hairball” problem of most connectivity analyses. Cristina also includes something called a “null model,” which tells you what you should expect to see if you calculate cohesion on relative abundance data with no true connectivity, so that you can compare the null model to your observed relative abundance data. And the results speak for themselves. Cristina’s negative cohesion metric predicts how quickly the Lake Mendota phytoplankton community will change, explaining 46.5% of the variation. That may not seem like a lot, but Cristina shows that the 16 available environmental variables combined explain only 22.9% of the variation – and cohesion requires no additional data!
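For readers who think in code, here is a rough Python sketch of the general idea. It is not Cristina’s implementation. It leaves out the null model correction entirely, and the specific choices here (Pearson correlations, averaging each taxon’s positive and negative correlations separately, then taking an abundance-weighted sum per sample) are assumptions made for illustration. The function name `cohesion` and the toy data are hypothetical too.

```python
import numpy as np


def cohesion(rel_abund):
    """Simplified cohesion sketch (no null model; see assumptions above).

    rel_abund: samples x taxa array of relative abundances.
    Returns (positive_cohesion, negative_cohesion), one value per sample.
    """
    rel_abund = np.asarray(rel_abund, dtype=float)

    # Pairwise correlations between taxa (columns) across all samples.
    corr = np.corrcoef(rel_abund, rowvar=False)
    np.fill_diagonal(corr, 0.0)  # ignore each taxon's correlation with itself

    is_pos = corr > 0
    is_neg = corr < 0

    # Per-taxon average of its positive (or negative) correlations;
    # a taxon with none of that sign gets 0.
    pos_conn = np.where(is_pos, corr, 0.0).sum(axis=1) / np.maximum(is_pos.sum(axis=1), 1)
    neg_conn = np.where(is_neg, corr, 0.0).sum(axis=1) / np.maximum(is_neg.sum(axis=1), 1)

    # Weight each taxon's connectedness by its abundance in each sample.
    return rel_abund @ pos_conn, rel_abund @ neg_conn


# Toy data: 4 samples x 3 taxa, each row sums to 1.
samples = np.array([
    [0.6, 0.3, 0.1],
    [0.5, 0.3, 0.2],
    [0.3, 0.4, 0.3],
    [0.2, 0.5, 0.3],
])
pos_cohesion, neg_cohesion = cohesion(samples)
print(pos_cohesion)  # one non-negative number per sample
print(neg_cohesion)  # one non-positive number per sample
```

However many taxa you start with, each sample ends up with just two numbers, which is what keeps the hairball away.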
The fact that the cohesion metric for negative relationships works as a predictive variable for community change shows that communities with more negative interactions (where one taxon increases in abundance while another decreases) are more likely to change in the near future. Combined with environmental variables, this increases our predictive power, and tells us something about how and why microbial communities change. To read more, check out Cristina’s full paper here.
-Alex