Periodic Reporting for period 1 - IMforFUTURE (Innovative training in methods for future data)

Summary

IMforFUTURE (Innovative Training in Methods for Future Data) is an innovative multidisciplinary and intersectoral research training programme that addresses current shortcomings in omics research. We aim to open a new research horizon in the integration of genetic, glycomic, and epigenomic datasets into systems biology by developing innovative methods for generating omics measurements and for the integrative analysis of omics data. Our multidisciplinary approach, which runs from data measurement through data analysis to the interpretation of results, provides a unique training environment for our early-stage researchers (ESRs), who will need to act as a bridge between several diverse disciplines in order to contribute significantly to the analysis of future datasets in relation to human health.

IMforFUTURE provides access to unique cohorts, for which we develop new techniques for high-throughput and in-depth glycomic and glycoproteomic data generation. In addition, we will measure epigenetic data to establish the relationship between glycomics and epigenetics. The glycomic and epigenetic datasets will be integrated with the genetic, gene expression and microbiome datasets that are already available in these cohorts.

IMforFUTURE focuses on understanding the biological mechanisms underlying ageing, the single biggest risk factor for most diseases. For ageing, IMforFUTURE provides access to omics datasets measured in centenarians; a Down Syndrome cohort will be used as a model for accelerated ageing. Other outcomes (bone mineral density, rheumatoid arthritis, chronic widespread pain, etc.) are considered as well. For these datasets, our novel methods will address measurement error and the integration of multiple omics datasets. Ignoring measurement error may lead to biased parameter estimates or to a loss of the statistical efficiency needed to detect relationships. Although it is evident that multiple datasets should be analysed jointly, the available methods for integrated analysis have strong limitations. Novel methods, and their application, are required to understand the biological mechanisms underlying human health.

Work performed

Training.
Our network comprises ESRs in chemistry, statistics, biology and epidemiology. To train our ESRs, we have delivered a set of modules covering the basics of relevant lab techniques, statistical and machine learning techniques, epidemiology and biology. The ESRs also followed a set of modules on complementary skills, and we developed a module on interdisciplinary thinking. Secondments to partner institutions are key to developing the skills ESRs need to perform high-impact research in a multidisciplinary environment and to become scientists able to bridge disciplines.

New glycomic and epigenetic datasets.
We have developed high-throughput and in-depth methods to generate glycomic and glycoproteomic datasets. For high-throughput measurements, we focus on the glycosylation of the acute-phase liver protein alpha-1 antitrypsin. Currently we are able to detect two out of three glycosylation sites. Further, we have already delivered an algorithm to compute derived traits from glycan data. Derived traits are likely to be more clinically relevant than the measured abundances of individual glycans. We have also developed a method for performing in-depth N-glycoproteomic analysis. Epigenetic datasets are generated to investigate the regulation of the HNF1A gene and its functional role in protein glycosylation and to understand its role in autoimmune diseases such as type 2 diabetes.
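As an illustration of what a derived trait can look like, the sketch below sums the relative abundances of all glycan peaks that share a structural feature. It is not the algorithm delivered by the project: the peak names (GP1 to GP4), the feature annotation and the function names are hypothetical and chosen only for this example.

```python
from typing import Dict, Set

# Hypothetical annotation: the structural features carried by each glycan peak.
PEAK_FEATURES: Dict[str, Set[str]] = {
    "GP1": {"fucosylated"},
    "GP2": {"fucosylated", "sialylated"},
    "GP3": {"sialylated"},
    "GP4": set(),
}


def normalise(abundances: Dict[str, float]) -> Dict[str, float]:
    """Convert raw peak abundances to percentages of the total signal."""
    total = sum(abundances.values())
    if total <= 0:
        raise ValueError("total abundance must be positive")
    return {peak: 100.0 * value / total for peak, value in abundances.items()}


def derived_traits(
    abundances: Dict[str, float],
    peak_features: Dict[str, Set[str]] = PEAK_FEATURES,
) -> Dict[str, float]:
    """Sum the relative abundances of all peaks that share a feature."""
    relative = normalise(abundances)
    features = sorted(set().union(*peak_features.values()))
    return {
        feature: sum(
            relative.get(peak, 0.0)
            for peak, annotated in peak_features.items()
            if feature in annotated
        )
        for feature in features
    }


# Made-up abundances for one sample (arbitrary units).
sample = {"GP1": 10.0, "GP2": 30.0, "GP3": 40.0, "GP4": 20.0}
traits = derived_traits(sample)
```

With these made-up values, the fucosylated trait is the sum of GP1 and GP2 and the sialylated trait is the sum of GP2 and GP3, each expressed as a percentage of the total signal.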

Genetics of glycomics.
We have access to genetic markers and glycomics measurements for around 5000 subjects in five different cohorts. The glycomic datasets were, however, measured with two different technologies (UPLC and LCMS), which are not fully compatible and need to be mapped to each other via correlation analysis. This work yields a large combined dataset with the potential to identify genetic markers relevant to glycomics.
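One simple way to map traits between two platforms by correlation is sketched below. It assumes that a set of samples has been measured on both platforms and that values are listed in the same sample order; that assumption, the trait names and the data are hypothetical, and the project's actual mapping procedure may differ.

```python
from math import sqrt
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation between two equally long lists of measurements."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant trait")
    return sxy / sqrt(sxx * syy)


def map_traits(
    uplc: Dict[str, List[float]], lcms: Dict[str, List[float]]
) -> Dict[str, Tuple[str, float]]:
    """For each UPLC trait, pick the LCMS trait with the highest correlation."""
    mapping = {}
    for u_name, u_values in uplc.items():
        scores = {l_name: pearson(u_values, l_values)
                  for l_name, l_values in lcms.items()}
        best = max(scores, key=scores.get)
        mapping[u_name] = (best, scores[best])
    return mapping


# Made-up measurements on four samples seen by both platforms (same order).
uplc = {"GP_A": [1.0, 2.0, 3.0, 4.0], "GP_B": [4.0, 1.0, 3.0, 2.0]}
lcms = {"H5N4F1": [8.1, 2.2, 5.9, 4.0], "H5N4S1": [2.1, 3.9, 6.2, 8.0]}
mapping = map_traits(uplc, lcms)
```

In practice a match would only be accepted above some correlation threshold, and one chromatographic peak may correspond to several mass-spectrometry compositions, so a one-to-one best match is a simplification.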

New methods for measurement error and multiple omics.
Human microbiome data are subject to extra variation due to missing covariates and measurement error. Ignoring such variation is well known to yield biased parameter estimates. To model the extra variation, we proposed negative binomial models with normally distributed random coefficients. For the integration of multiple omics datasets, the orthogonal two-way partial least squares algorithm is available; it decomposes each dataset into a common part, a data-specific part and a residual part. The common parts consist of variables that represent the same biological mechanisms across the datasets. A sparse version of the algorithm was required to identify the important variables in this common part.
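For reference, the two model classes mentioned above can be written in generic form as follows. The notation is an assumption made for illustration (i indexes subjects, j indexes repeated or grouped observations, and the symbols are not taken from the project's own publications), so this is a sketch of the standard formulations rather than the exact models developed in the network.

```latex
% Negative binomial model with normally distributed random coefficients
\begin{align*}
  y_{ij} \mid b_i &\sim \mathrm{NB}(\mu_{ij}, \phi), \qquad
  \operatorname{Var}(y_{ij} \mid b_i) = \mu_{ij} + \mu_{ij}^{2} / \phi, \\
  \log \mu_{ij} &= x_{ij}^{\top} \beta + z_{ij}^{\top} b_i, \qquad
  b_i \sim \mathcal{N}(0, \Sigma).
\end{align*}

% Two-way orthogonal partial least squares decomposition of two datasets X and Y
\begin{align*}
  X &= T W^{\top} + T_{\perp} P_{\perp}^{\top} + E, \\
  Y &= U C^{\top} + U_{\perp} Q_{\perp}^{\top} + F.
\end{align*}
```

Here T W^T and U C^T are the common (joint) parts, the terms with the subscript ⊥ are the data-specific parts, and E and F are the residuals. A sparse variant would constrain many entries of the loading matrices W and C to be exactly zero, so that only the important variables remain in the common part.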

Ageing and chronic widespread pain (CWP).
Calendar age and biological age differ, and a central aim of IMforFUTURE is to define biological markers for ageing. Since glycans change with ageing, they can be used to represent biological age. A genome-wide association study of biological ageing based on glycans (GlycoAge) has been performed. In addition, we performed a genome-wide association study of CWP.

Public Engagement.
All ESRs are currently planning public engagement activities in a primary school and at the University of Split. We will also produce a flyer about our project which will be distributed by our ESRs in Split. The materials of these activities will be used for analogous activities in other locations.

Final results

We are on track to deliver novel measurement techniques for data generation, datasets, methods for data analysis and results from data analyses for several human diseases and ageing. Finalizing our innovations and disseminating these results will be our work for the next period. New developments will include network methods for data integration, models for time-to-event analysis and multi-omics data analysis for bone mineral density. The novel methods applied to the consortium dataset are expected to provide new insight into the mechanisms underlying ageing. These results will be presented and discussed at a workshop organised by the ESRs, to which world leaders in the field will be invited.

The single most important outcome will be a cohort of multi-disciplinary scientists with unique skills and a toolbox to bring omics research to the next level.

Website & more info

More info: https://imforfuture.eu.