How Big Data Analysis Is Changing Biomedical Science

May 4, 2015

Researchers from the Simons Center for Data Analysis (SCDA) have led a multi-year study that breaks new ground in understanding how genes work together to carry out their functions in 144 different human tissues and cell types. The original research paper was published in Nature Genetics on April 27, 2015.

Biomedical Science

Computational Analysis

The connection with computers comes from combining computer science with statistical methods to solve the puzzles posed by big genomic data collections. The research team worked under the leadership of Olga Troyanskaya, the deputy director for genomics at SCDA. The team collected and integrated data from 38,000 genome experiments, drawn from approximately 14,000 scientific publications. This data contained not only information about the cell’s ribonucleic acid (RNA) and protein functions, but also information from individuals who had been diagnosed with a variety of diseases.

The researchers used integrative computational analysis to first isolate the functional genetic interconnections contained in the datasets for the different tissue types. They then combined these tissue-specific functional signals with the DNA-based genome-wide association studies (GWAS) for each relevant disease. This allowed the researchers to identify statistical associations between genes and diseases that would otherwise have gone undetected.

The resulting technique, named network-guided association study (NetWAS), integrates quantitative genetics with functional genomics to increase the power of GWAS and to identify the genes underlying some of the most complex human diseases. Because the technique is completely data driven, NetWAS avoids a bias towards better-studied genes and pathways, which permits the discovery of associations in new and unique ways. All of this was achieved by combining deep biological insight with up-to-date computational methods and applying them to a large collection of datasets.
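To make the idea of network-guided reprioritization more concrete, here is a minimal sketch in Python. It is not the published NetWAS implementation: it uses a deliberately simple stand-in in which each gene is scored by its average connection strength, in a tissue network, to genes that are nominally significant in a GWAS. The gene names, edge weights, p-values, the 0.01 threshold, and the function name are all hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical tissue-specific network: edge weights between gene pairs (0 to 1).
network = {
    frozenset(("GENE_A", "GENE_B")): 0.9,
    frozenset(("GENE_A", "GENE_C")): 0.8,
    frozenset(("GENE_B", "GENE_C")): 0.7,
    frozenset(("GENE_C", "GENE_D")): 0.1,
    frozenset(("GENE_A", "GENE_D")): 0.2,
}

# Hypothetical gene-level GWAS p-values for one disease.
gwas_pvalues = {"GENE_A": 0.001, "GENE_B": 0.004, "GENE_C": 0.20, "GENE_D": 0.03}


def network_guided_rerank(network, gwas_pvalues, threshold=0.01):
    """Rank genes by mean edge weight to nominally significant GWAS genes.

    This is a simplified illustration, not the method from the paper.
    """
    # Genes below the (assumed) nominal threshold act as seeds.
    seeds = {gene for gene, p in gwas_pvalues.items() if p < threshold}
    scores = {}
    for gene in gwas_pvalues:
        # Missing edges count as weight 0.0; a gene is never compared to itself.
        weights = [
            network.get(frozenset((gene, seed)), 0.0)
            for seed in seeds
            if seed != gene
        ]
        scores[gene] = mean(weights) if weights else 0.0
    # Highest network support first.
    return sorted(scores, key=scores.get, reverse=True)


ranked = network_guided_rerank(network, gwas_pvalues)
print(ranked)
```

In this toy data, GENE_C has a weak GWAS p-value but is tightly connected to the two seed genes, so it ranks above GENE_D, whose p-value is stronger but whose network support is poor. That is the kind of association a p-value ranking alone would miss.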

Gene Networks Identified

The result of this combined computational research was the identification of 144 functional gene interaction networks. These included networks for organs and tissues as diverse as the liver, the kidney, and the whole brain. The paper went on to describe the functional gene disruptions that take place in diseases such as diabetes, hypertension, and obesity.

These functional gene interaction networks could not have been established for human tissues without big data capabilities. Many of the human cell types that are important to diseases cannot be studied through direct, traditional experimentation, so being able to work with these datasets is crucial to discovering new information about those cell types. For example, podocytes, the kidney cells that perform the kidney’s filtering function, can’t be isolated for lab studies, nor can the function of their genes be identified by genome-scale experiments.

We need to find a way to understand how the key proteins in these cells interact with each other if we are to understand and treat diseases such as chronic kidney disease. This new approach mined big data collections to build an accurate map of how the genetic circuits function together in podocytes, and also in many other disease-relevant organs, tissues and cell types.

Targeted Research

This research and its findings have important implications for a better understanding of normal gene function, and they also improve our understanding of drug development and usage. Causal and target genes can be better identified for treatment, and previously unexpected drug interactions and disruptions can be anticipated more accurately. Biomedical researchers can use the uncovered networks and pathways to understand the action of certain drugs, and their side effects, in the context of specific disease-relevant tissues. The networks are useful both for understanding how therapies of various types work and for developing new therapies.

The Future Of Biomedical Research

The researchers have adopted an open source model by creating an online resource that lets other scientists use the NetWAS system and access the tissue-specific networks. The team behind the research has built an interactive server called the Genome-Scale Integrated Analysis of Networks in Tissues (GIANT). It allows users to explore the networks, compare how genetic circuits vary across different tissues, and analyze data from genetic studies to find genes that may cause disease.
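As a rough illustration of what comparing genetic circuits across tissues can mean in practice, the sketch below measures how much a gene’s strongest partners overlap between two tissue networks. It does not use the GIANT server or any real API. The networks, gene names, and function names are hypothetical, and the Jaccard overlap is just one simple comparison one might choose.

```python
def top_partners(network, gene, k=2):
    """Return the k genes most strongly connected to `gene` in one network."""
    partners = {}
    for pair, weight in network.items():
        if gene in pair:
            (other,) = pair - {gene}  # the other member of the gene pair
            partners[other] = weight
    return sorted(partners, key=partners.get, reverse=True)[:k]


def partner_overlap(net_a, net_b, gene, k=2):
    """Jaccard overlap of a gene's top-k partners in two tissue networks."""
    a = set(top_partners(net_a, gene, k))
    b = set(top_partners(net_b, gene, k))
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical edge weights for the same gene in two tissues.
kidney = {
    frozenset(("GENE_X", "PARTNER_1")): 0.90,
    frozenset(("GENE_X", "PARTNER_2")): 0.80,
    frozenset(("GENE_X", "PARTNER_3")): 0.10,
}
brain = {
    frozenset(("GENE_X", "PARTNER_1")): 0.85,
    frozenset(("GENE_X", "PARTNER_2")): 0.05,
    frozenset(("GENE_X", "PARTNER_3")): 0.70,
}

overlap = partner_overlap(kidney, brain, "GENE_X")
print(top_partners(kidney, "GENE_X"), top_partners(brain, "GENE_X"), overlap)
```

A low overlap suggests that the same gene works with different partners in the two tissues, which is the kind of tissue-specific behavior the networks are meant to expose.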

Researchers who are studying Parkinson’s disease, for example, can use the server to study how pathways in the brain are affected and to identify new genes and pathways involved in the disease. It’s a very exciting time in biomedical research, and it has all been made possible by the processing power of computers and their capacity for analyzing big data.


