Algorithm Identifies Multiple Gene–Environment Relationships

Researchers at the German Cancer Research Center (DKFZ), EMBL's European Bioinformatics Institute (EMBL-EBI) and the Wellcome Sanger Institute have developed a new computational method that makes it possible to analyze hundreds of environmental factors at once when identifying genotype–environment interactions. The method will enhance understanding of the relationship between genotype and environmental factors.

[Image: genotype and environmental factors. © Spencer Phillips, EMBL European Bioinformatics Institute]

The scientists produced an algorithm and a bioinformatics method that can be applied to large cohorts of human genome and lifestyle data to identify the impact environmental factors (such as diet, physical activity or living conditions) have on genotype–phenotype relationships. Applying this method allows scientists to identify areas of the genome that affect human traits in different ways, depending on lifestyle or other environmental factors.

Hundreds of factors

While our genome is unchanged throughout our lifetime, human traits such as height or weight are influenced by lifestyle and environmental factors. “What we are doing in this study is going beyond the classical genome-to-phenotype approach by accounting for environmental factors in a comprehensive manner,” explains Oliver Stegle, a group leader at EMBL-EBI and, since summer 2018, head of the Computational Genomics and Systems Genetics division at the DKFZ. “Our approach allows us to simultaneously use hundreds of environmental factors, measured in human cohorts, to enhance the analysis of genotype–phenotype relationships. Previously such analyses required a narrow hypothesis, choosing a specific environmental factor such as physical activity, and testing for interactions with genetic variables to understand the impact on phenotypes. Now we can analyze everything in one go, meaning we can find and identify interplays between genomes, environment and phenotype in a comprehensive manner.”

By applying the new structured linear mixed model to body mass index in the UK Biobank (a database holding genome and lifestyle data for 500,000 people), the researchers were able to identify both previously known and novel gene–environment (GxE) signals that are simultaneously driven by multiple environmental factors.
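
To make the idea of testing many environmental factors "in one go" more concrete, here is a minimal Python sketch on simulated data. It is not the structured linear mixed model used in the study: as a simplification, it swaps in an ordinary least-squares regression with one genotype-by-environment product term per environmental factor, and a joint F-test of all those terms together. The function names (simulate_cohort, joint_interaction_f), the cohort size, the allele frequency and the effect sizes are all hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_cohort(n=2000, n_env=5, interaction=0.3, seed=0):
    """Simulate a toy cohort (all numbers are illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    # Genotype at one variant, coded as 0, 1 or 2 copies of an allele.
    g = rng.binomial(2, 0.3, size=n).astype(float)
    # Standardized environmental factors (e.g. diet, activity), one per column.
    E = rng.standard_normal((n, n_env))
    # The variant's effect on the trait differs per person: it depends on
    # the first two environmental factors when `interaction` is non-zero.
    per_person_effect = 0.2 + interaction * (E[:, 0] - E[:, 1])
    y = g * per_person_effect + E @ np.full(n_env, 0.1) + rng.standard_normal(n)
    return g, E, y


def residual_sum_of_squares(X, y):
    """Fit y ~ X by least squares and return the residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def joint_interaction_f(g, E, y):
    """F statistic for all genotype-by-environment terms tested jointly."""
    n, k = E.shape
    # Baseline model: intercept, genotype and environmental main effects.
    base = np.column_stack([np.ones(n), g, E])
    # Full model: baseline plus one g * E_j product term per environment.
    full = np.column_stack([base, g[:, None] * E])
    rss_base = residual_sum_of_squares(base, y)
    rss_full = residual_sum_of_squares(full, y)
    df_resid = n - full.shape[1]
    return ((rss_base - rss_full) / k) / (rss_full / df_resid)


if __name__ == "__main__":
    g, E, y = simulate_cohort(interaction=0.3)
    print("joint GxE F statistic:", joint_interaction_f(g, E, y))
```

A large F statistic suggests that the variant's effect varies with at least one of the environmental factors, without having to pick a single factor in advance. In this simple fixed-effect version, every additional environment adds a parameter, so the test becomes unwieldy with hundreds of factors; handling that many environments jointly is the setting the researchers' mixed-model approach addresses.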

“Characterizing gene–environment interactions using multiple environments is important,” says Paolo Casale, a postdoctoral researcher at Microsoft Research New England and an alumnus of EMBL-EBI. “These analyses can provide a finer characterization of high-risk groups for certain diseases and help to identify the most relevant environmental factors.”

“This is an exciting new way of thinking about genotype–environment interactions, paving the way to explore the importance of these interactions rather than looking at genotype alone,” adds Rachel Moore, a PhD student at EMBL-EBI and the Wellcome Sanger Institute.

Giving genomes context

Going forward, this method will offer a more comprehensive way of incorporating environment into genetics studies than previously possible and will also increase the number of discoveries of variants whose function depends on environment or lifestyle.

Inês Barroso, senior group leader at the Wellcome Sanger Institute, adds: “We hope this method stimulates research which incorporates environmental factors, by taking our genome into context and generating understanding of how the function of the genome we are born with is modulated by our life, habits, environment and social interactions.”

Rachel Moore, Francesco Paolo Casale, Marc Jan Bonder, Danilo Horta, BIOS consortium, Lude Franke, Inês Barroso, Oliver Stegle: A linear mixed model approach to study multivariate gene-environment interactions. Nature Genetics 2018. DOI: 10.1038/s41588-018-0271-0

Source: Deutsches Krebsforschungszentrum, DKFZ