Analysis of genes and species in metagenomic data
Abstract
As new ultra-high-throughput DNA sequencing methods continue to develop, the
large amounts of metagenomic data that they generate open
the way for completely novel approaches to the study of microbial ecosystems. In this
project we have investigated how relationships and connections between microbial genes
and the species that carry them can be found based entirely on their occurrence in two
separate datasets accumulated from the same samples using metagenomic analysis. The
dataset used in the study consists of the abundances of genes and species derived from DNA
isolated from microbial communities in biofilms formed in seawater treated with different
concentrations of the antimicrobial agent triclosan. Genes and species that responded
in the same way to changes in the concentration of triclosan were grouped together for
further analysis. The method used for creating the different constellations consisted
of a first step where genes and species were clustered based on their abundance in the
samples. In the next step, genes with strong correlations to each cluster of species and
species with strong correlations to each cluster of genes were identified. These constellations
of species and genes were robust: they appeared neither to change when the parameters of
the analysis were varied nor to depend on whether the clustering was
based on genes or on species. The constellations were also homogeneous
with respect to species and gene functionality (the same genes clustering with the same
species), which we interpret as meaning that the likelihood of identifying a tangible connection between
them is high. Clearly, concrete conclusions regarding the species
and the genes they carry cannot be made using the methods we present here, but several
interesting patterns have emerged that would bear further scrutiny. For example,
genes involved in the horizontal transfer of DNA between species do not appear to
persist in the presence of triclosan, whereas genes associated with the bacterial immune system were
strongly associated with bacteria that were able to establish themselves in the presence of
triclosan. Few analyses have been done in which information about the species present
in a studied niche or ecosystem and the genes that they collectively contain is combined,
and there is much new information to be derived from such studies. Statistical
approaches to the analysis of species and their collective genome have the potential to
give new insights into previously unknown associations and to develop hypotheses that
can be further tested experimentally.
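
The two-step procedure described above (cluster by abundance, then attach strongly correlated items from the other dataset) can be illustrated with a minimal sketch. The abstract does not state which clustering algorithm, correlation measure, or thresholds were used, so everything here is an assumption: Pearson correlation, a simple threshold-based grouping, the 0.9 cut-off, the function names, and the toy abundance values are all hypothetical and are not taken from the study.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / sqrt(sxx * syy)

def cluster_profiles(profiles, threshold=0.9):
    """Step 1 (assumed method): group items that are linked, directly or
    through other items, by a correlation of at least `threshold`."""
    names = list(profiles)
    seen, clusters = set(), []
    for start in names:
        if start in seen:
            continue
        seen.add(start)
        stack, members = [start], []
        while stack:
            cur = stack.pop()
            members.append(cur)
            for other in names:
                if other not in seen and pearson(profiles[cur], profiles[other]) >= threshold:
                    seen.add(other)
                    stack.append(other)
        clusters.append(sorted(members))
    return clusters

def mean_profile(profiles, members):
    """Average abundance profile of a cluster, sample by sample."""
    n = len(members)
    return [sum(vals) / n for vals in zip(*(profiles[m] for m in members))]

def associate(members, clustered, candidates, threshold=0.9):
    """Step 2: candidates (e.g. genes) whose profile correlates strongly
    with the mean profile of one cluster (e.g. of species)."""
    centre = mean_profile(clustered, members)
    return sorted(name for name, prof in candidates.items()
                  if pearson(centre, prof) >= threshold)

# Hypothetical abundances in four samples with increasing triclosan concentration.
species = {
    "species_A": [10, 8, 4, 1],
    "species_B": [20, 15, 9, 3],
    "species_C": [1, 3, 7, 12],
}
genes = {
    "gene_x": [5, 4, 2, 1],
    "gene_y": [0, 2, 5, 9],
    "gene_z": [3, 3, 3, 3],
}

species_clusters = cluster_profiles(species)
constellations = {tuple(c): associate(c, species, genes) for c in species_clusters}
```

Swapping the roles of `species` and `genes` in the last two lines gives the reverse direction (clusters of genes with associated species), which is how the robustness check mentioned in the abstract could be approached in such a sketch.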
Degree
Student essay
Date
2019-06-14
Author
Bäckström Lebens, Sofia
Eriksson, Emma