Bioinformatics Applied to Gene Expression

Despite many advances, human biology is still largely a black box. It is true that decades of intense study have given researchers a detailed understanding of particular characteristics, or phenotypes, of cells and tissues that have been altered by disease. And in recent years, dramatic progress in gene sequencing has made it possible for scientists to correlate these phenotypes with specific genes. But the complex network of interactions lying between the two is largely unknown. This is the realm of biomedical data science: the use of quantitative methods and large datasets to trace basic physiological processes.


Assistant Professor of Biomedical Engineering Mete Civelek uses big data analytics to understand the molecular pathways of disease and develops personalized medicine approaches to cardiovascular and metabolic disorders.

“Connecting the dots between genes and disease is incredibly important,” Assistant Professor of Biomedical Engineering Mete Civelek said. “If we can chart these pathways, we have a roadmap for developing therapies that can interrupt disease processes.”

Civelek and his lab at the University of Virginia combine a variety of statistical, computational and experimental methods to track the cascade of biological events that ultimately yields a particular disease phenotype.

In the case of diabetes, they used bioinformatic approaches to predict that a variant of a gene in a specific region of chromosome 7 decreases the production of a protein, KLF14. This protein, in turn, regulates the function of fat cells. People with low KLF14 levels have fewer fat cells and, as a result, the remaining fat cells are larger than usual. This leads to an increased risk of insulin resistance and, in women, a shift of fat storage from the hips to the abdomen, another risk factor. Civelek is tracking down the specific genes that are regulated by KLF14 and documenting the chain of events that produces changes in the adipose tissue.

He’s pursuing a similar investigation focusing on smooth muscle tissue collected during heart transplants. Changes in smooth muscle cells, which are found in the middle layer of artery walls, play a key role in the development of hypertension and coronary artery disease. Civelek and his colleagues are using computational methods to identify the genes that cause these changes, to unravel the pathways that lead to atherosclerosis and the accumulation of plaque along artery walls, and to pinpoint regulators of this process that could be targets for new drugs.

“Once again, we are trying to connect the genetic variations to gene expression variations to protein variations to phenotypic variations,” he said. “We are trying to understand how information encoded in your DNA flows through this system to increase susceptibility to disease.”

Civelek’s investigations begin by linking a gene or area of the human genome to a disease. To be statistically meaningful, establishing this relationship requires information from large groups of people.

One dataset that Civelek uses for his work on type 2 diabetes is from the Metabolic Syndrome in Men study, which has collected data from more than 10,000 Finnish men. The researchers sequenced the genomes of these participants and took adipose tissue biopsies from 800 of them to measure their gene expression levels. In the process, researchers have identified a number of genes associated with diabetes.
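For readers curious what linking a genetic variant to a gene expression level can look like in practice, here is a minimal sketch. It is not the lab's actual pipeline. It uses simulated data only, and the function name `eqtl_association`, the effect size and the noise level are all assumptions made for illustration. The idea is to regress an expression measurement on the number of copies of a variant each person carries (0, 1 or 2) and see whether the slope is clearly different from zero.

```python
import numpy as np


def eqtl_association(dosage, expression):
    """Fit expression = intercept + slope * dosage by least squares.

    Returns (slope, intercept, r), where r is the Pearson correlation
    between allele dosage and expression.
    """
    dosage = np.asarray(dosage, dtype=float)
    expression = np.asarray(expression, dtype=float)
    slope, intercept = np.polyfit(dosage, expression, 1)
    r = np.corrcoef(dosage, expression)[0, 1]
    return slope, intercept, r


# Simulated cohort (hypothetical values, not real study data).
rng = np.random.default_rng(0)
n = 800                                # sample size chosen to echo the biopsy count above
dosage = rng.integers(0, 3, size=n)    # 0, 1 or 2 copies of a hypothetical variant
true_effect = -0.5                     # assumed: each copy lowers expression by 0.5 units
expression = 10.0 + true_effect * dosage + rng.normal(0.0, 1.0, size=n)

slope, intercept, r = eqtl_association(dosage, expression)
print(f"estimated effect per allele copy: {slope:.2f} (correlation r = {r:.2f})")
```

With only a few dozen simulated people, the estimated slope swings widely from one random draw to the next, which is one way to see why large cohorts matter for this kind of analysis.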

To develop a well-rounded sample base that can be used for studying heart disease, Civelek faced a more difficult challenge: Researchers cannot biopsy coronary arteries in healthy living patients. To overcome this obstacle, he has collaborated with a team of heart transplant surgeons at the University of California, Los Angeles. Surgeons routinely trim the aorta of a donated heart to create a better fit with the patient’s own blood vessels. Instead of discarding this tissue, they save it for research. Civelek has samples from more than 150 donors.

“This is an incredibly valuable source of information,” Civelek said, “especially as it represents an ethnically diverse population.”