LO7: Health bioinformatics
Bioinformatics includes the integration of computers, software tools, and databases in an effort to address biological questions. Bioinformatics approaches are often used for major initiatives that generate large data sets. Two important large-scale activities that use bioinformatics are genomics and proteomics. Genomics refers to the analysis of genomes. A genome can be thought of as the complete set of DNA sequences that codes for the hereditary material that is passed on from generation to generation. These DNA sequences include all of the genes (the functional and physical units of heredity passed from parent to offspring) and transcripts (the RNA copies that are the initial step in decoding the genetic information) included within the genome. Thus, genomics refers to the sequencing and analysis of all of these genomic entities, including genes and transcripts, in an organism. Proteomics, on the other hand, refers to the analysis of the complete set of proteins, or proteome. In addition to genomics and proteomics, there are many more areas of biology where bioinformatics is being applied (e.g., metabolomics and transcriptomics). Each of these important areas in bioinformatics aims to understand complex biological systems.
Many scientists today refer to the next wave in bioinformatics as systems biology, an approach to tackle new and complex biological questions. Systems biology involves the integration of genomics, proteomics, and bioinformatics information to create a whole system view of a biological entity.
For instance, how a signaling pathway works in a cell can be addressed through systems biology. The genes involved in the pathway, how they interact, and how modifications change the outcomes downstream, can all be modeled using systems biology. Any system where the information can be represented digitally offers a potential application for bioinformatics. Thus, bioinformatics can be applied from single cells to whole ecosystems. By understanding the complete “parts lists” in a genome, scientists are gaining a better understanding of complex biological systems. Understanding the interactions that occur between all of these parts in a genome or proteome represents the next level of complexity in the system. Through these approaches, bioinformatics has the potential to offer key insights into our understanding and modeling of how specific human diseases or healthy states manifest themselves.
Translational bioinformatics
Translational bioinformatics, a field in the study of health informatics that emerged after the first human genome mapping, focuses on the convergence of molecular bioinformatics, biostatistics, statistical genetics, and clinical informatics. The field is evolving at a tremendously fast pace, and many related areas have been proposed. Among them, pharmacogenomics is a branch of genomics concerned with variation among individuals in drug response due to genetic differences. This area is important for designing precision medicine in the future. Though a relatively young field, translational bioinformatics has become an important discipline in the era of personalized and precision medicine.
Figure 1. Translational Bioinformatics.
A 2014 review article grouped recent themes in translational bioinformatics (TBI) into four major categories:
- clinical “big data”, or the use of electronic health record (EHR) data for discovery (genomic and otherwise);
- genomics and pharmacogenomics in routine clinical care;
- omics for drug discovery and repurposing; and
- personal genomic testing, including a number of ethical, legal, and social issues that arise from such services.
The importance of translational bioinformatics may be best understood through what it is teaching us, including things not previously knowable. For example, it is identifying flawed science, improving estimates of the relative pathogenicity of human genetic variants, yielding new insights into the underlying genetic mechanisms of disease, and identifying promising new drug indications by curating large volumes of scientific literature. While sequencing an exome for a clinical diagnosis can be a routine task, interpreting the data to make an actual diagnosis or treatment plan is much more complex. Of the many thousands of variants identified, many will have to be evaluated for their clinical utility. For a simple Mendelian disorder, this may be straightforward, as only a single variant may need to be identified and considered. But for more complex diseases (e.g., cancers, diabetes, or neurodegenerative diseases), multiple variants will need to be identified. It is only by asking the correct questions about the patient and the disease, and by employing the right computational tools, that correct answers can be achieved.
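The narrowing step described above, from thousands of variants to a handful worth clinical review, can be sketched in a few lines. This is only an illustration under stated assumptions: the `Variant` fields, the `prioritize` helper, the 1% frequency cutoff, the impact labels, and the gene names are all hypothetical placeholders, not a clinical pipeline or real findings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    population_frequency: float  # fraction of a reference population carrying the allele
    impact: str                  # hypothetical label: "high", "moderate", or "low"


def prioritize(variants, candidate_genes, max_frequency=0.01):
    """Keep rare, potentially damaging variants that fall in candidate genes."""
    kept = [
        v for v in variants
        if v.gene in candidate_genes
        and v.population_frequency <= max_frequency
        and v.impact in {"high", "moderate"}
    ]
    # Rarest first, as one simple way to order the remaining candidates for review.
    return sorted(kept, key=lambda v: v.population_frequency)


# Toy input; genes and numbers are placeholders, not real data.
variants = [
    Variant("GENE_A", 0.20, "high"),        # too common
    Variant("GENE_A", 0.0005, "high"),      # kept
    Variant("GENE_B", 0.002, "low"),        # predicted impact too low
    Variant("GENE_C", 0.001, "moderate"),   # kept
    Variant("GENE_D", 0.0001, "high"),      # not in the candidate gene list
]
shortlist = prioritize(variants, candidate_genes={"GENE_A", "GENE_B", "GENE_C"})
for v in shortlist:
    print(v.gene, v.population_frequency, v.impact)
```

For a Mendelian disorder the shortlist may hold a single variant; for complex diseases the candidate gene set and the thresholds would need to reflect the clinical question being asked.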
New discoveries resulting from the Human Genome Project are now frequently applied to develop improved diagnostics, prognostics, and therapies for complex diseases, an approach known as “translational genomics”. In particular, the sequencing cost per genome has fallen markedly over the last decade, according to data from the National Human Genome Research Institute (NHGRI) of the National Institutes of Health (NIH), as shown in Figure 2. This gives rise to new opportunities for personalized treatment and risk stratification.
Figure 2. a) Number of research studies sequencing DNA or genomes (source: PubMed, Web of Science, Scopus, IEEE, ACM). b) Sequencing cost per human-sized genome (source: National Human Genome Research Institute, NHGRI). Total volume of genomic data per year reported by completed studies for c) eukaryotes and d) prokaryotes in 1e2 GB (source: National Center for Biotechnology Information) (Andreu-Perez, Poon, et al. 2015).
On the other hand, research in bioinformatics has broadened from solely sequencing the genome of an individual to also measuring epigenomic data (i.e., above the genome), which include processes that alter gene expression other than changes of primary DNA sequences, such as DNA methylation and histone modifications. Information technologies for acquiring and analyzing biological molecules other than the genome, for example, transcriptome (the total mRNA in a cell or organism), proteome (the set of all expressed proteins in a cell, tissue, or organism), and metabolome (the total quantitative collection of low molecular weight compounds, metabolites, present in a cell or organism that participate in metabolic reactions) are also needed for future advances in the field. To summarize, OMICS aims at collectively characterizing and quantifying groups of biological molecules that translate into the structure, function, and dynamics of an organism. The OMICS profile of each individual should eventually be linked up with phenotypes obtained from clinical observations, medical images, and physiological signals (see Figure 3).
Figure 3. Outline of the “OMICS” approach for studying disease mechanisms. OMICS aims at collectively characterizing and quantifying groups of biological molecules that translate into the structure, function, and dynamics of an organism. The OMICS profile of each individual, including the genome, transcriptome, proteome, and metabolome, should be eventually linked up with phenotypes obtained from clinical observations, medical images, and physiological signals. Different acquisition technologies are required to collect data at each biological level. Interaction within each level and across different levels as well as with the environment, including nutrition, food, drugs, traditional Chinese medicine, and gut microbiome presents grand challenges in future bioinformatics research.
Figure 4. Practical model for the design and execution of translational informatics projects, illustrating major phases and exemplary input or output resources and data sets (Payne et al. 2009).
Genomics in clinical care (Translational Genomics)
While genetics focuses on DNA coding for single functional genes, genomics is the study of the entirety of our DNA, recognizing the crucial regulatory role of non-coding DNA and the complex interactions between multiple genes and the environment. Genomics is fundamental to precision medicine which, through its four components of predictive, preventive, personalized, and participatory medicine, aims to promote wellness as well as to more precisely treat disease. Currently, there is a great amount of genomic discovery research occurring that includes new genomic variants, biomarkers and other basic science discoveries. Thus, many foresee rapid advances in genetic testing and genome sequencing over the next decade, with inevitable implementation into clinical practice.
General practitioners (GPs) will play an important role within a genomic medicine service, both in supporting patients through diagnostic and treatment processes and in using knowledge of genomics for disease prevention. Also, decreasing costs and increased availability of genetic testing and genome sequencing mean many physicians will consider using these services over the next few years, with some projecting that sequencing will become fully integrated into standard medical care within 10 years.
A tumour’s genomic signature may be used to make a precise diagnosis, enabling more accurate prognosis and better tailored treatment. Examples include Herceptin® (trastuzumab) in breast cancer treatment and BRAF inhibitors in malignant melanoma. Treatment can also be based on germline genomic information; PARP inhibitors are more efficacious in the treatment of ovarian cancer in individuals who carry a BRCA gene mutation.
Although comprehensive genotyping is still relatively recent, it has high potential for genetic stratification in patient screening, for instance for factors that genotyping can reveal, such as high-risk DNA mutations, milk and gluten intolerance, and mucoviscidosis (cystic fibrosis). Genetics combined with phenotypic information provided by EHR may help to provide greater insights into low penetrant alleles. For example, it is well known that mutations of fibrillin 1 (FBN1) cause Marfan syndrome (MFS). Nevertheless, the etiology of the disease leads to marked clinical variability among MFS patients, both within the same family and across different families. Combining genetic tests of FBN1 and a series of related genes (TGFBR1, TGFBR2, TGFB2, MYH11, MYLK1, SMAD3, and ACTA2) will help to screen out patients who are more likely to develop aortic aneurysms that lead to dissections. Further studies on these high-risk patients based on morphological images of the aorta may provide insight into the rate of disease development.
Another potential area for translational genomics is to study the gene networks of different syndromes of the same person in order to better understand how these syndromes are interrelated. For example, this has been used to study different genes on chromosome 21 (HSA21) and their role in Down’s Syndrome (DS), as well as to understand the underlying reason why nearly half of DS patients exhibit an overprotection against cardiac abnormalities related to the connective tissue. One hypothesis is based on the recent evidence that there is an overall upregulation of FBN1 in DS (which is normally down regulated in MFS). The construction of genetic networks will, therefore, provide a clearer picture of how these syndromes are related. By understanding the gene networks of the related syndromes, it may be possible to provide specific gene therapy for the related diseases.
Another example took place at Stanford’s Lucile Packard Children’s Hospital, where a newborn presented with a condition known as long QT syndrome. In this specific case, the manifestation was unusually severe: the baby’s heart stopped multiple times in the hours after birth. Long QT syndrome can be caused by mutations in a number of different genes. It is necessary to know which gene harbors the mutation in order to know how to treat the condition. In this case, whole-genome sequencing (WGS) was performed, enabling identification of a previously studied mutation as well as a novel copy number variation in the TTN gene that would not have been detectable through targeted genotyping alone. Moreover, next-generation sequencing (NGS) enabled the answer to be obtained in a matter of hours to days instead of weeks.
Pharmacogenomics
Pharmacogenomics can be defined as the study of how genetic factors affect a person’s response to drugs. This relatively new field combines pharmacology (the science of drugs) and genomics (the study of genes and their functions) to develop effective, safe medications and doses that will be tailored to a person’s genetic makeup.
Many drugs that are currently available are “one size fits all,” but they don't work the same way for everyone. It can be difficult to predict who will benefit from a medication, who will not respond at all, and who will experience negative side effects (called adverse drug reactions). Adverse drug reactions are a significant cause of hospitalizations and deaths. Once a patient takes a drug, the drug must travel through the body to its target(s), act on its target(s), and then leave the body. The first and last of these processes are facilitated by pharmacokinetic (PK) genes, which may affect a drug in the “ADME” processes: absorption into and distribution through the body, metabolism (either to an active form or broken down into an inactive form), and excretion. With the knowledge gained from the Human Genome Project, researchers are learning how inherited differences in genes affect the body’s response to medications. These genetic differences will be used to predict whether a medication will be effective for a particular person and to help prevent adverse drug reactions.
Pharmacogenomics focuses on the identification of genome variants that influence drug effects, typically via alterations in a drug’s pharmacokinetics or via modulation of a drug’s pharmacodynamics (e.g., modifying a drug’s target or perturbing biological pathways that alter sensitivity to the drug’s pharmacological effects). For diseases other than cancer and infectious diseases, the genome variations of interest are primarily in the germline DNA, either inherited from parents or de novo germline sequence changes that alter the function of gene products. In cancer, both inherited genome variations and somatically acquired genome variants can influence response to anticancer agents.
Whole genome sequencing by NGS is important to the study of complex diseases such as cancer. It has been a long-standing problem in cancer treatment that drugs often have heterogeneous treatment responses even for the same type of cancer, and some drugs only show profound sensitivity in a small number of patients. Large-scale personal genomics and pharmacogenomics datasets have now been generated to uncover unique signaling patterns of individual patients and discover drugs that target these unique patterns. These include cancer cell line databases covering either nonspecific cancer cell types or a specific cancer cell type such as breast cancer. The Cancer Genome Atlas Project of the NIH has tested the personal genomic profiles of over 10,000 individuals across over 20 types of cancer and uncovered new cancer subtypes based on those profiles. Distinct genomic aberrations among patients are believed to be responsible for the variability of drug response. Such large-scale datasets can be used to enable drug repositioning, predict drug combinations, and delineate mechanisms of action. They are becoming an important component in drug development. It is, therefore, possible to design precision medicine for individual patients based on their genomic profiles.
Pharmacogenomics has gone beyond studying individuals’ drug response based on genome characteristics (e.g., copy number variations and somatic mutations) and now incorporates additional transcriptomic and metabolic features such as gene expression, considering both factors that influence the concentration of a drug reaching its targets and factors associated with the drug targets themselves. Since the gene expression profiles of cell lines are known to vary considerably during prolonged culture under different culture conditions and techniques, the use of gene expression from cell lines for prediction of drug response in the patient is currently controversial. A recent algorithm for predicting in vivo drug response from the patient’s baseline gene expression profile achieved 60% to 80% predictive accuracy for different cases. Other research studied drug response using immunodeficient mice xenografted with human tumors, which have the advantage of potentially capturing both genetic and nongenetic factors that affect cancer growth and therapy tolerance.
The field of pharmacogenomics is still in its infancy. Its use is currently quite limited, but new approaches are under study in clinical trials. In the future, pharmacogenomics will allow the development of tailored drugs to treat a wide range of health problems, including cardiovascular disease, Alzheimer disease, cancer, HIV/AIDS, and asthma.
Omics for drug discovery and repurposing
The cost of generating new therapeutics has risen dramatically over the past 60 years, with each new drug costing about 80-fold more in 2010 than 1960 in inflation-adjusted terms. Also, much has been said about the protracted process involved in getting a drug through the FDA approval pipeline. Estimates are that the process can take on average 12 years between lead identification and FDA approval. As a result, many are investigating high-throughput and computational approaches to drug discovery and repurposing. Recent efforts have focused on the use of the omics data, especially genomics, to discover new drug targets and search for new uses for existing drugs, referred to as drug repositioning.
Pharmacogenomics can impact how the pharmaceutical industry develops drugs, as early as the drug discovery process itself (Figure 5). First, cheminformatics and pathway analysis can aid in the discovery of suitable gene targets, followed by small molecules as ‘‘leads’’ for potential drugs. Additionally, discovery of pharmacogenomic variants for the design of clinical trials can allow for safer, more successful passage of drugs through the pharmaceutical pipeline. As mentioned previously, cheminformatics methods can be used to identify novel drug-protein interactions. While these predicted interactions can be used to discover new small molecules for therapeutic purposes, any new drug must still go through the significant regulatory hurdles of safety and efficacy testing.
Figure 5. Drug discovery. Pharmacogenomics can be used at multiple steps along the drug discovery pipeline to minimize costs, as well as increase throughput and safety. First, association and expression methods can be used to identify potential gene targets for a given disease. Cheminformatics can then be used to narrow the number of targets to be tested biochemically, as well as to identify potential polypharmacological factors that could contribute to adverse events. After initial trials, pharmacogenomics can identify variants that may potentially affect dosing and efficacy. This information can then be used in designing a larger Phase III clinical trial, excluding “non-responders” and targeting the drug towards those more likely to respond favorably.
In addition to the Human Genome Project, several large-scale biological databases launched recently will further facilitate the study of disease mechanisms and progressions, particularly at the system level as outlined in Figure 3. The Research Collaboratory for Structural Bioinformatics Protein Data Bank is a worldwide archive of structural data of biological macromolecules, providing access to the 3-D structures of biological macromolecules, as well as integration with external biological resources, such as gene and drug databases. ProteomicsDB is another example, encompassing mass spectrometry of the human proteome acquired from human tissues, cell lines, and body fluids to facilitate the identification of organ-specific proteins and translated long intergenic noncoding RNAs, with due consideration of time-dependent expression patterns of proteins.
Parallel to these developments, the Human Metabolome Database consists of more than 40,000 annotated metabolite entries in the latest version, released in 2013. It provides both experimental metabolite concentration data and analyses through mass spectrometry and Nuclear Magnetic Resonance (NMR) spectrometry. Such databases are believed to greatly facilitate the translation of information into knowledge for transforming clinical practice, particularly for metabolic-related diseases, such as diabetes and coronary artery diseases. In fact, metabolomics has emerged as an important research area that includes not only endogenous metabolites of the human body but also chemical and biochemical molecules that can interact with the human body. Specifically, ongoing efforts are directed at fingerprinting metabolites from food and nutrition products, drugs, and traditional Chinese medicine, as well as molecules produced by the gut bacterial microbiota. These will eventually help us to better understand the interaction between the host, pathogen, and environment.
The availability of the genomic, proteomic, and metabolic databases allows a better understanding of the development of complex diseases such as cancer. They also allow the search of new biomarkers using different pattern mining and clustering techniques. The clusters can be either partitional (hard) or hierarchical (tree-like nested structure). Using multicore CPU, GPU, and field-programmable gate arrays with parallel processing techniques can further accelerate these methods.
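The distinction between partitional and hierarchical clustering can be made concrete with a small standard-library sketch. It is illustrative only: the two-feature `profiles` are made-up numbers standing in for per-sample omics measurements, k-means is used as one common partitional method and single-linkage agglomeration as one common hierarchical method, and neither is claimed to be the technique used in any study cited here.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def kmeans(points, k, iterations=20):
    """Partitional (hard) clustering: every point gets exactly one of k labels."""
    centroids = list(points[:k])  # deterministic start: the first k points
    labels = []
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: distance(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = centroid(members)
    return labels


def single_linkage(points):
    """Hierarchical clustering: repeatedly merge the two closest clusters.

    Returns the merge history (left members, right members, distance),
    which encodes the nested, tree-like structure.
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(distance(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges


# Hypothetical two-feature profiles for six samples (placeholder values).
profiles = [(1.0, 1.1), (5.0, 5.2), (0.9, 1.0), (5.1, 4.9), (1.3, 0.9), (4.8, 5.0)]
labels = kmeans(profiles, k=2)
merges = single_linkage(profiles)
print(labels)
for left, right, d in merges:
    print(left, "+", right, "at distance", round(d, 3))
```

The naive pairwise search in `single_linkage` grows quickly with the number of samples, which is one reason the parallel hardware mentioned above becomes attractive for real omics datasets.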
In two linked papers, Dudley et al. and Sirota et al. created disease signatures from microarray data in Gene Expression Omnibus and compared these to gene expression data from Connectivity Map to identify potentially novel therapeutics for lung cancer and inflammatory bowel disease. A similar study using this method noted that tricyclic antidepressants might have efficacy against small cell lung cancer (but not non-small cell lung cancer).
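The idea behind comparing a disease signature with drug-induced expression profiles can be sketched as follows. This is a simplified illustration, not the scoring used in the cited papers: a plain Pearson correlation stands in for whatever similarity statistic those studies applied, and the gene names, drug names, and expression values are hypothetical. A strongly negative score means the drug profile tends to reverse the disease signature, which is what makes it a repurposing candidate.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def reversal_ranking(disease_signature, drug_profiles):
    """Rank drugs from most signature-reversing (most negative score) upward."""
    ranking = []
    for name, profile in drug_profiles.items():
        shared = sorted(set(disease_signature) & set(profile))
        score = pearson([disease_signature[g] for g in shared],
                        [profile[g] for g in shared])
        ranking.append((name, score))
    return sorted(ranking, key=lambda item: item[1])


# Hypothetical log fold changes (disease vs. healthy, treated vs. untreated).
disease_signature = {"G1": 2.0, "G2": 1.5, "G3": -1.0, "G4": -2.0}
drug_profiles = {
    "drug_X": {"G1": -1.8, "G2": -1.2, "G3": 0.9, "G4": 2.1},   # opposes the disease
    "drug_Y": {"G1": 1.9, "G2": 1.4, "G3": -0.8, "G4": -1.7},   # mimics the disease
    "drug_Z": {"G1": 0.1, "G2": -0.2, "G3": 0.3, "G4": -0.1},   # largely unrelated
}
ranking = reversal_ranking(disease_signature, drug_profiles)
for name, score in ranking:
    print(name, round(score, 3))
```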
Drug repurposing refers to taking an existing, already marketed, FDA-approved compound and using it to treat a disease or condition other than the one for which it was originally intended. In the past, inspiration for this type of “off-label use” has been largely serendipitous. For example, Viagra was initially aimed at treating heart disease, and turned out to be useful for erectile dysfunction. By using a pre-approved compound, early phase clinical trials can be avoided, which can save significant time and money.
Disease-gene association data may also predict drug targets. Sanseau et al. evaluated existing GWAS hits and found that genes related to GWAS hits are significantly more likely to be targetable by small molecules or by biologic agents than other genomic regions, and that 15.6% of GWAS genes are existing drug targets (compared to 5.7% of the general genome). In support of this hypothesis, Okada et al. performed a multi-ethnic GWAS of 103,638 cases and controls for rheumatoid arthritis (RA) and noted 101 total RA risk loci; these loci identified 18 of 27 current RA drug target genes, and identified three approved cancer medications that may be active against RA. Khatri et al. analyzed eight existing organ transplant rejection datasets and found a common module of 11 genes overexpressed in all rejected organs. Using these genes, they identified two existing non-immunosuppressant drugs that could be repurposed to regulate these genes and demonstrated enhanced effect in a mouse model. Resources such as the drug-gene interaction database (DGI), which integrates data from 13 databases, and PharmGKB may facilitate translation of genomic study results to potential therapeutics. See the Table below for a listing of TBI resources.
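The size of the effects quoted from Sanseau et al. and Okada et al. is easier to judge as ratios. The short calculation below uses only the figures given in the text; the helper name `fold_enrichment` is introduced here purely for illustration.

```python
def fold_enrichment(rate_in_group, background_rate):
    """How many times more frequent a property is in a group than in the background."""
    return rate_in_group / background_rate


# 15.6% of GWAS genes vs. 5.7% of the general genome are existing drug targets.
gwas_vs_genome = fold_enrichment(15.6, 5.7)

# 18 of the 27 current RA drug target genes were identified by the RA risk loci.
ra_targets_recovered = 18 / 27

print(round(gwas_vs_genome, 2))        # about 2.74-fold
print(round(ra_targets_recovered, 3))  # about 0.667, i.e. two thirds
```

In other words, by these figures a gene implicated by GWAS is nearly three times as likely to be an existing drug target as a gene drawn from the genome at large.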
Finally, an increasing collection of available computational and experimental methods that leverage molecular and clinical data enable diverse drug repositioning strategies. Integration of translational bioinformatics resources, statistical methods, chemoinformatics tools and experimental techniques (including medicinal chemistry techniques) can enable the rapid application of drug repositioning on an increasingly broad scale. Efficient tools are now available for systematic drug-repositioning methods using large repositories of compounds with biological activities. Medicinal chemists along with other translational researchers can play a key role in various aspects of drug repositioning.
Table 1. Public resources available for Translational Bioinformatics.
Name | URL | Comments |
---|---|---|
Pharmacogenomic Biomarkers in Drug Labels | http://www.fda.gov/drugs/scienceresearch/researchareas/pharmacogenetics/ucm083378.htm | Lists FDA-approved drugs with pharmacogenomic information in their drug labels. |
PharmGKB | http://www.pharmgkb.org | PharmGKB is a curated resource about the impact of genetic variation on drug response for clinicians and researchers. |
Clinical Pharmacogenetics Implementation Consortium (CPIC) | http://www.pharmgkb.org/page/cpic | Provides a list of the published guidelines for drug-gene interactions produced by CPIC. |
Phenotype Knowledgebase | http://phekb.org | Online collaborative repository for building, validating, and sharing electronic phenotype algorithms and their performance characteristics. |
NHGRI Catalog of GWAS studies | http://www.genome.gov/26525384 | Curated list of GWAS studies, their phenotypes, and key results. |
Catalog of PheWAS results | http://phewascatalog.org | Searchable, downloadable catalog of EHR PheWAS results. |
Drug-Gene Interaction database | http://dgidb.genome.wustl.edu | Provides a search interface into drug-gene interactions from data derived from 13 resources. |
My Cancer Genome | http://www.mycancergenome.org | Provides up-to-date data regarding cancer mutations, treatments, and relevant clinical trials. |
ClinVar | http://www.ncbi.nlm.nih.gov/clinvar/ | Provides up-to-date relationships among human variations and phenotypes along with supporting evidence. |
SHARPn | http://phenotypeportal.org | Collection of computable phenotype algorithms generated by SHARPn. |
Personalized genomic testing
Personalized medicine has become important as a means to help patients receive the best possible outcomes while reducing adverse effects and high direct medical costs if a treatment will not benefit the patient.
Genetic and genomic tests each have a place in personalized medicine. Genetic tests typically focus on a specific, known gene, while genomic tests, such as whole-genome sequencing (WGS), focus on the expression and interaction of groups of genes. Genetic tests concentrate on the presence or absence of mutations, or overexpression, of individual genes, while genomic tests provide gene signature profiles based on expression levels of specific component genes. Examples of genetic tests include BRCA-1 and -2 in breast cancer, EGFR in non-small cell lung cancer, and BRAF in melanoma. Examples of genomic tests include the Oncotype DX assays in breast, colon, and prostate cancers, and the 70-gene assay in breast cancer. Since WGS was first developed, advances in technology have made the test easier, quicker, and less expensive. It is now easy enough, in fact, that it could become a routine test offered to healthy patients during primary care visits. However, it can be difficult to determine what the results of WGS mean.
What is genetic testing?
Genetic testing is the analysis of human DNA, RNA, or proteins to detect gene variants, changes in chromosomes, or proteins associated with certain diseases or conditions; non-diagnostic uses include paternity testing and forensics. The results of a genetic test can confirm or rule out a suspected genetic condition or help determine a person’s chance of developing or passing on a genetic disorder. More than 1,000 genetic tests are currently in use, and more are being developed.
Genetic testing methodology varies:
- Molecular genetic tests study single genes or short lengths of DNA to identify variations or mutations that lead to a genetic disorder.
- Chromosomal genetic tests analyze whole chromosomes or long lengths of DNA to detect large genetic changes such as an extra copy of a chromosome.
- Biochemical genetic tests study the amount or activity level of proteins; abnormalities in either can indicate changes to the DNA that result in a genetic disorder.
Figure 6 summarizes the various applications of genetic testing available today. Genetic testing is voluntary, and it has benefits as well as limitations and risks. Thus, the decision about whether to be tested is a personal and complex one. A geneticist or genetic counselor can help by providing information about the pros and cons of the test and discussing the social and emotional aspects of testing.
The last decade has seen an unprecedented pace of advancement in our ability to sequence the genome. As the cost of sequencing decreases, the opportunity to move from targeted sequencing to whole exome sequencing (the analysis of all of a person’s genes) and then to whole genome sequencing, which analyzes a person’s entire genetic code, becomes more accessible, particularly for researchers.
Figure 6. Available types of genetic testing.
Most medical genetic test results will directly change your medical care and those changes are based on evidence gathered through clinical trials and other medical practice. Medical genetic tests may be used to:
- Diagnose a genetic disease.
- Assess the chance of having a child with certain genetic conditions.
- Predict if a person may be more likely to have side effects or an abnormal response to a certain drug.
- Find an increased risk for a common disease.
For genomic assays to be a viable tool, they must be accurate and clinically meaningful. As Table 2 shows, genomic assays need to have analytic validity, clinical validity, and clinical utility. Analytic validity is the test’s ability to accurately and reliably measure the genotype (or analyte) of interest in the clinical laboratory and in specimens representative of the population of interest. Regarding clinical validation, a major goal is to identify and quantify potential sources of biologic variation in the analysis of a given sample. Clinical utility is a test’s ability to benefit patients by improving treatment decisions.
Table 2. Evidence Requirements for Genomic Assays: |
---|
Analytical validity: Ability to accurately and reproducibly measure analyte (or genotype). Does it detect what it is supposed to detect? |
Clinical validity: Ability to accurately and reliably predict phenotype, clinical disease, or predisposition to disease. Does it detect information that is known to be associated with a specific disease? |
Clinical utility: Evidence that guides patient management and affects decision making, resulting in added value and improved outcomes. How useful is the information to improve health outcomes? |
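One common way to put numbers on the question "does it detect what it is supposed to detect?" is to compare assay calls against a reference standard and compute sensitivity, specificity, and predictive values. These metrics are not named in the table above; they are offered here as a standard illustration, and the counts below are hypothetical.

```python
def assay_metrics(true_pos, false_pos, false_neg, true_neg):
    """Summary accuracy measures from a 2x2 table of assay result vs. reference standard."""
    return {
        # Of those who truly have the condition/genotype, the fraction the assay flags.
        "sensitivity": true_pos / (true_pos + false_neg),
        # Of those who truly do not, the fraction the assay clears.
        "specificity": true_neg / (true_neg + false_pos),
        # Of positive results, the fraction that are correct.
        "positive_predictive_value": true_pos / (true_pos + false_pos),
        # Of negative results, the fraction that are correct.
        "negative_predictive_value": true_neg / (true_neg + false_neg),
    }


# Hypothetical validation cohort of 1,000 people, 100 of whom have the condition.
metrics = assay_metrics(true_pos=90, false_pos=45, false_neg=10, true_neg=855)
for name, value in metrics.items():
    print(name, round(value, 3))
```

In this made-up cohort the assay is 90% sensitive and 95% specific, yet only about two thirds of positive results are correct, because the condition is uncommon. That gap is part of why clinical utility has to be assessed separately from accuracy.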
The rapid evolution of genomic sequencing technologies has decreased the cost of genetic analysis to the extent that it seems plausible that genome-scale sequencing could have widespread availability in health care across all stages of life - from preconception to adult medicine (Figure 7). Challenges to fully embracing genomics in a clinical setting remain, but some approaches are starting to overcome these barriers, such as community-driven data sharing to improve the accuracy and efficiency of applying genomics to patient care.
Early analyses comparing genomes of different individuals confirmed the remarkable similarities of sequence (99% identical), but soon gave way to expectations that the millions of nucleotide differences among different individuals would enable clinicians not only to recognize each individual’s biologic uniqueness, but also to translate this knowledge into more precise understanding of physiology, more refined diagnoses, better disease risk assessment, earlier detection and monitoring, and treatments tailored to the individual patient; i.e., personalized (or individualized or precision) medicine.
Figure 7. The use of genomics throughout an individual’s lifespan. Case studies of the use of genomics to inform patient care at different stages of life. (Rehm 2017)
Value of genomics in personalized medicine
Although DNA diagnostic testing was in use before 2000, it is the exponential increase in our capacity to perform nucleotide sequencing that has been largely responsible for the relatively recent emphasis on personalized medicine. Completion of the HapMap project allowed for the selection of genome-wide single nucleotide variants (SNVs) that tag common variants throughout the genome. This enabled genome-wide association studies (GWASs) for the discovery of loci associated with clinical phenotypes. Advances in next-generation sequencing (NGS) have reduced the cost and time required for whole exome sequencing (WES) or whole genome sequencing (WGS), and our capacity for storing, transferring, and analysing huge amounts of sequence data continues to improve. These advances have also enabled millions of people to have their individual genomic sequence analysed, primarily within research studies or clinical care. There is widespread recognition that access to an individual’s genomic sequence and other ‘omics’ data can enable a more detailed understanding of our health and disease risks, and inform a more precise approach to patient care, a strategy now commonly called ‘precision medicine’.
With genomic data now increasingly used to guide the individual care of patients, our health care systems are evolving, although several challenges remain. This section considers how genomics is guiding health care for the individual, providing illustrative examples of how individuals are taking advantage of personal genomic information, ranging from advanced diagnostics and tumor profiling to genomic risk assessments. These examples are interwoven with the day-to-day challenges still facing the integration of genomics into clinical practice, as well as with strategies that are being developed to overcome these barriers and make genomics part of ever more aspects of everyday patient care.
Trends in Personal Genomic Testing to Guide Health Care
The year 2008 saw the founding of several companies that offered direct-to-consumer (DTC) genetic testing, reporting on a variety of genes for both health and recreational purposes. DTC genetic testing through sites such as 23andMe (Mountain View, CA) has provided an avenue for patients to pursue genetic testing without a doctor’s order. Individuals received test results and personalized information on their genetic ancestry, disease risk, and response to selected medications.
DTC genetic testing raises a number of interesting ethical, legal, and social issues. For several years, there was an open question as to whether or not these tests should be subject to government regulation. In November 2013, the US FDA ordered 23andMe to stop advertising and offering its health-related information services. The FDA considered these tests to be “medical devices” and as such to require formal testing and FDA approval for each test. In February 2015, it was announced that the FDA had approved 23andMe’s application for a test for Bloom syndrome (http://www.fda.gov/NewsEvents/Newsroom/PressAnnouncements/UCM435003), and in October 2015 it was announced that the company would once again be offering health information in the form of carrier status for 36 genes. Note that a 23andMe customer is able to download his or her raw genomic data and to use other websites to interpret the results, including Promethease, Geneticgenie, openSNP, and Interpretome for health-related associations.
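To make the idea of working with downloaded raw genomic data concrete, the sketch below reads a genotype file into a lookup table keyed by variant identifier. It assumes a tab-separated text layout with four columns (rsid, chromosome, position, genotype), comment lines starting with `#`, and `--` marking a no-call; this layout is an assumption for illustration and should be checked against the actual download. The function name and the sample records are hypothetical and do not represent real variants.

```python
import io
from typing import Dict, Iterable, NamedTuple


class Genotype(NamedTuple):
    chromosome: str
    position: int
    genotype: str


def parse_raw_genotypes(lines: Iterable[str]) -> Dict[str, Genotype]:
    """Parse tab-separated genotype lines into a dict keyed by rsid.

    Assumed layout (not verified against any provider's specification):
    rsid <TAB> chromosome <TAB> position <TAB> genotype
    """
    calls: Dict[str, Genotype] = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and header/comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # ignore malformed rows rather than guessing
        rsid, chromosome, position, genotype = fields
        if genotype == "--":
            continue  # assumed no-call marker
        calls[rsid] = Genotype(chromosome, int(position), genotype)
    return calls


# Made-up sample records, for illustration only.
SAMPLE = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t1000\tAA\n"
    "rs0000002\t2\t2000\tAG\n"
    "rs0000003\tX\t3000\t--\n"
)

calls = parse_raw_genotypes(io.StringIO(SAMPLE))
for rsid, call in calls.items():
    print(rsid, call.chromosome, call.position, call.genotype)
```

A lookup table of this kind is the starting point for third-party interpretation: each identifier can be matched against published variant annotations, although any health-related interpretation of such matches carries the validity and utility caveats discussed earlier.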
A more positive example of where genetic testing is helping patients is a case presented at the American Neurological Association conference in 2014. A patient had a history of Alzheimer’s disease on her mother’s side of the family. She did not know if she was a carrier, nor did she want to know. But she wanted to ensure that she did not pass that mutation to her future children. Preimplantation genetic diagnosis (PGD) testing enabled her doctors to select embryos that did not have that Alzheimer’s disease gene mutation. The patient herself was never tested, nor was she informed how many (if any) of the embryos contained the mutation.
Table 3. Examples of personal genetic profiling tests for disease susceptibility.
Company | Example product | Details |
---|---|---|
23andMe | Health Edition | “Find out if you carry inheritable markers for diseases such as breast cancer, cystic fibrosis, and Tay-Sachs...Learn your genetic risk for type 2 diabetes, Parkinson's disease, and other conditions.” |
deCODEme | Complete Scan | “Calculate your genetic risk for 51 conditions...” |
Genetic Health | Premium Male | “These are our most comprehensive test and includes all the other tests in our range... Evaluates the risk of prostate cancer as well as the risk for thrombosis, osteoporosis, metabolic imbalances of detoxification and chronic inflammation. It also evaluates the risk profile of the most common cardiovascular diseases...” |
Graceful Earth | Alzheimer’s genome test | “Check your future susceptibility BEFORE symptoms occur... Pre-emptive insight into one's genetic predisposition can empower and allow for pro-active prevention.” |
Navigenics | Health Compass | “Knowing your genetic predispositions for important health conditions and medication reactions can help motivate you to take steps towards a healthier life. By gaining insight into these risks, you can plan for what's important.” |
Universal newborn screening (NBS) is an extraordinarily successful public health program, preventing morbidity and mortality through early diagnosis and management of conditions including rare inborn errors of metabolism. Conditions such as phenylketonuria are not clinically evident at birth but lead to significant irreversible harm or death if not treated promptly. NBS has saved countless lives and vastly improved the quality of children’s lives by allowing timely therapeutic interventions, and technological advances such as tandem mass spectrometry (MS/MS) have played a significant role in the expansion of NBS. The capacity of genome-scale sequencing for disease gene discovery is increasingly being applied as a diagnostic test in children with suspected monogenic disorders.
The ability to analyze many or all genes in the genome simultaneously provides new opportunities for genomic medicine. The clinical utility of sequencing is recognized for certain diseases and in an increasing number of medical specialties, with genetic and genomic medicine offering the promise of improved diagnostics and treatments, and with patients asking physicians about the applicability of these technologies to their own care. However, some experts caution that the roadmap for translating genetics and genomics into routine clinical practice is unclear.
Computational health informatics
Computational health informatics (CHI) is an emerging research topic within and beyond the medical industry. It is a multidisciplinary field drawing on biomedical science, medicine, nursing, information technology, computer science, and statistics. CHI is a branch of computer science that addresses how computational methods relate to the provision of health care. Using Information and Communication Technologies (ICTs), health informatics collects and analyzes information from all healthcare domains to predict patients’ health status. The major goal of health informatics research is to improve the quality of care provided to patients, or Health Care Output (HCO). The healthcare industry has experienced rapid growth of medical and healthcare data in recent years. Figure 8 depicts the growth of both healthcare data and digital healthcare data. The healthcare data analytics market was projected to grow 8 to 10 times as fast as the overall economy until 2017.
Figure 8. Healthcare data growth. (Fang et al. 2016)
The rapid growth of new technologies has led to a significant increase in digital health data in recent years. Medical discoveries and new technologies such as novel sensors, mobile apps, capturing devices, and wearable technology have contributed additional data sources. As a result, the healthcare industry produces a huge amount of digital data from sources such as Electronic Health Records (EHR, including electronic medical records) and personal health records (PHR, one subset of EHR including medical history, laboratory results, and medications). Based on reports, digital healthcare data worldwide was estimated at 500 petabytes (1 petabyte = 10^15 bytes) in 2012 and is expected to reach 25 exabytes in 2020, as shown in Figure 8.
Digital health data is not only enormous in volume, but also too complex in structure for traditional software and hardware to handle. Some of the factors contributing to the failure of traditional systems in handling these datasets include:
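The scale of this projected growth is easier to appreciate with a short calculation. The sketch below takes the two estimates quoted above (500 petabytes in 2012 and 25 exabytes in 2020), assumes decimal units (1 exabyte = 1000 petabytes), and derives the overall fold change and the implied compound annual growth rate. The function names are illustrative, and the result is only as reliable as the underlying estimates.

```python
PB_PER_EB = 1000  # decimal (SI) units assumed: 1 exabyte = 1000 petabytes


def fold_change(start: float, end: float) -> float:
    """Ratio of the final volume to the initial volume."""
    return end / start


def compound_annual_growth_rate(start: float, end: float, years: int) -> float:
    """Constant yearly growth rate that turns `start` into `end` over `years`."""
    return (end / start) ** (1 / years) - 1


start_pb = 500            # estimated volume in 2012, in petabytes
end_pb = 25 * PB_PER_EB   # projected volume in 2020, converted to petabytes
years = 2020 - 2012

fold = fold_change(start_pb, end_pb)
growth = compound_annual_growth_rate(start_pb, end_pb, years)

print(f"fold change over {years} years: {fold:.0f}x")
print(f"implied annual growth rate: {growth:.1%}")
```

Under these assumptions the estimates imply a 50-fold increase over eight years, which corresponds to the data volume growing by well over half each year.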
- The vast variety of structured and unstructured data such as medical records, hand-written doctor notes, medical diagnostic images (MRI, CT), and radiographic films.
- Existence of noisy, heterogeneous, complex, diverse, longitudinal, and large datasets in healthcare informatics.
- The difficulty of capturing, storing, analyzing, and visualizing such large and complex datasets.
- The need to increase storage capacity and computational and processing power.
- The need to improve the quality of care and the security and sharing of patients’ data while reducing healthcare costs.
Hence, solutions are needed to manage and analyze such complex, diverse, and huge datasets within reasonable time and storage limits. Big data analytics, the set of techniques for analyzing datasets that are large and complex, plays a vital role in managing huge volumes of healthcare data and improving the quality of healthcare offered to patients. In addition, it holds promise for decreasing the cost of care, improving treatments, moving toward more personalized medicine, and helping doctors and physicians to make personalized decisions.
Finally, the major benefits of big data analytics in healthcare are as follows:
- It makes use of the huge volume of data and provides timely and effective treatment to patients.
- It provides personalized care to patients.
- It will benefit all the components of a medical system (i.e., provider, payer, patient, and management).
References
- Altman, R.B., 2012. Translational Bioinformatics: Linking the Molecular World to the Clinical World. Clinical Pharmacology & Therapeutics, 91(6), pp.994–1000. Available at: http://doi.wiley.com/10.1038/clpt.2012.49.
- Andreu-Perez, J., Poon, C.C.Y., et al., 2015. Big data for health. IEEE journal of biomedical and health informatics, 19(4), pp.1193–208. Available at: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=7154395.
- Andreu-Perez, J., Leff, D.R., et al., 2015. From Wearable Sensors to Smart Implants--Toward Pervasive and Personalized Healthcare. IEEE transactions on bio-medical engineering, 62(12), pp.2750–62. Available at: http://www.ncbi.nlm.nih.gov/pubmed/25879838.
- Anhøj, J., 2003. Generic Design of Web-Based Clinical Databases. Journal of Medical Internet Research, 5(4), p.e27. Available at: http://www.jmir.org/2003/4/e27/.
- Aronson, S.J. & Rehm, H.L., 2015. Building the foundation for genomics in precision medicine. Nature, 526(7573), pp.336–42. Available at: http://www.ncbi.nlm.nih.gov/pubmed/26469044.
- Bain, J.R. et al., 2009. Metabolomics applied to diabetes research: moving from information to knowledge. Diabetes, 58(11), pp.2429–43. Available at: http://www.ncbi.nlm.nih.gov/pubmed/19875619.
- Ban, T.A., 2006. The role of serendipity in drug discovery. Dialogues in clinical neuroscience, 8(3), pp.335–44. Available at: http://www.ncbi.nlm.nih.gov/pubmed/17117615.
- Baro, E. et al., 2015. Toward a Literature-Driven Definition of Big Data in Healthcare. BioMed Research International, 2015(1), pp.1–9. Available at: http://www.ncbi.nlm.nih.gov/pubmed/6137488.
- Barretina, J. et al., 2012. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature, 483(7391), pp.603–7. Available at: http://www.nature.com/nature/journal/v483/n7391/full/nature11003.html%3FWT.ec_id%3DNATURE-20120329.
- Baskar, S. & Aziz, P.F., 2015. Genotype-phenotype correlation in long QT syndrome. Global Cardiology Science and Practice, 2015(2), p.26. Available at: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=4614326&tool=pmcentrez&rendertype=abstract.
- Bates, D.W. et al., 2014. Big Data In Health Care: Using Analytics To Identify And Manage High-Risk And High-Cost Patients. Health Affairs, 33(7), pp.1123–1131. Available at: http://content.healthaffairs.org/cgi/doi/10.1377/hlthaff.2014.0041.
- Benson, G., 2015. Editorial: Nucleic Acids Research annual Web Server Issue in 2015. Nucleic Acids Research, 43(Web Server issue), pp.W1–W2. Available at: https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gkv581.
- Berg, J.S. et al., 2017. Newborn Sequencing in Genomic Medicine and Public Health. Pediatrics, 139(2). Available at: http://www.ncbi.nlm.nih.gov/pubmed/28096516.
- Bhatia, D., 2015. Medical Informatics. Delhi: PHI Learning Private Limited.
- Boland, M.R. et al., 2013. Discovering medical conditions associated with periodontitis using linked electronic health records. Journal of clinical periodontology, 40(5), pp.474–82. Available at: http://www.ncbi.nlm.nih.gov/pubmed/23495669.
- Cancer Genome Atlas Network, 2012. Comprehensive molecular portraits of human breast tumours. Nature, 490(7418), pp.61–70. Available at: http://www.ncbi.nlm.nih.gov/pubmed/23000897.
- Cars, T. et al., 2013. Extraction of electronic health record data in a hospital setting: comparison of automatic and semi-automatic methods using anti-TNF therapy as model. Basic & clinical pharmacology & toxicology, 112(6), pp.392–400. Available at: http://www.ncbi.nlm.nih.gov/pubmed/23374887.
- Chen, J. et al., 2013. Translational Biomedical Informatics in the Cloud: Present and Future. BioMed Research International, 2013, pp.1–8. Available at: http://www.hindawi.com/journals/bmri/2013/658925/.
- Choi, I.Y. et al., 2013. Perspectives on clinical informatics: integrating large-scale clinical, genomic, and health information for clinical care. Genomics & informatics, 11(4), pp.186–90. Available at: http://dx.doi.org/10.5808/GI.2013.11.4.186.
- Christakis, N.A. & Fowler, J.H., 2008. The collective dynamics of smoking in a large social network. The New England journal of medicine, 358(21), pp.2249–58. Available at: http://www.ncbi.nlm.nih.gov/pubmed/18499567.
- Christensen, K.D. et al., 2016. Are physicians prepared for whole genome sequencing? a qualitative analysis. Clinical genetics, 89(2), pp.228–34. Available at: http://www.ncbi.nlm.nih.gov/pubmed/26080898.
- Collins, F.S. & Varmus, H., 2015. A New Initiative on Precision Medicine. New England Journal of Medicine, 372(9), pp.793–795.
- Collymore, D.C. et al., 2016. Genomic testing in oncology to improve clinical outcomes while optimizing utilization: the evolution of diagnostic testing. The American journal of managed care, 22(2 Suppl), pp.s20-5. Available at: http://www.ncbi.nlm.nih.gov/pubmed/26978033.
- Costello, J.C. et al., 2014. A community effort to assess and improve drug sensitivity prediction algorithms. Nature biotechnology, 32(12), pp.1202–12. Available at: http://www.ncbi.nlm.nih.gov/pubmed/24880487.
- Danciu, I. et al., 2014. Secondary use of clinical data: the Vanderbilt approach. Journal of biomedical informatics, 52(1), pp.28–35. Available at: http://dx.doi.org/10.1016/j.jbi.2014.02.003.
- Delaney, S.K. et al., 2016. Toward clinical genomics in everyday medicine: perspectives and recommendations. Expert review of molecular diagnostics, 16(5), pp.521–32. Available at: https://www.tandfonline.com/doi/full/10.1586/14737159.2016.1146593.
- Denny, J.C., 2014. Surveying Recent Themes in Translational Bioinformatics: Big Data in EHRs, Omics for Drugs, and Personal Genomics. IMIA Yearbook, 9(1), pp.199–205. Available at: http://www.ncbi.nlm.nih.gov/pubmed/25123743.
- Dudley, J.T. et al., 2011. Computational repositioning of the anticonvulsant topiramate for inflammatory bowel disease. Science translational medicine, 3(96), p.96ra76. Available at: http://www.ncbi.nlm.nih.gov/pubmed/21849664.
- Dugas, M., 2015. Clinical Research Informatics: Recent Advances and Future Directions. Yearbook of medical informatics, 10(1), pp.174–7. Available at: http://www.ncbi.nlm.nih.gov/pubmed/26293865; http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=PMC4587057.
- Eichstaedt, J.C. et al., 2015. Psychological language on Twitter predicts county-level heart disease mortality. Psychological science, 26(2), pp.159–69. Available at: http://www.ncbi.nlm.nih.gov/pubmed/25605707.
- Embi, P.J., 2013. Clinical research informatics: survey of recent advances and trends in a maturing field. Yearbook of medical informatics, 8, pp.178–84. Available at: http://www.ncbi.nlm.nih.gov/pubmed/23974569.
- Embi, P.J. & Payne, P.R.O., 2009. Clinical research informatics: challenges, opportunities and definition for an emerging domain. Journal of the American Medical Informatics Association : JAMIA, 16(3), pp.316–27. Available at: http://dx.doi.org/10.1197/jamia.M3005.
- Eriksson, R. et al., 2014. Dose-specific adverse drug reaction identification in electronic patient records: temporal data mining in an inpatient psychiatric population. Drug safety, 37(4), pp.237–47. Available at: http://www.ncbi.nlm.nih.gov/pubmed/24634163.
- Fang, R. et al., 2016. Computational Health Informatics in the Big Data Age. ACM Computing Surveys, 49(1), pp.1–36. Available at: http://dl.acm.org/citation.cfm?doid=2911992.2932707.
- Fernández-Suárez, X.M. & Galperin, M.Y., 2013. The 2013 nucleic acids research database issue and the online molecular biology database collection. Nucleic Acids Research, 41(D1), pp.1–7.
- Forrest, G.N. et al., 2014. Use of electronic health records and clinical decision support systems for antimicrobial stewardship. Clinical infectious diseases, 59(Suppl 3), pp.S122-33. Available at: http://www.ncbi.nlm.nih.gov/pubmed/25261539.
- Friedl, M.A. et al., 2010. MODIS Collection 5 global land cover: Algorithm refinements and characterization of new datasets. Remote Sensing of Environment, 114(1), pp.168–182. Available at: http://dx.doi.org/10.1016/j.rse.2009.08.016.
- Friedman, C. et al., 2004. Automated Encoding of Clinical Documents Based on Natural Language Processing. Journal of the American Medical Informatics Association, 11(5), pp.392–402. Available at: https://academic.oup.com/jamia/article-lookup/doi/10.1197/jamia.M1552.
- Friedman, C.P., 2009. A “fundamental theorem” of biomedical informatics. Journal of the American Medical Informatics Association : JAMIA, 16(2), pp.169–70. Available at: http://dx.doi.org/10.1197/jamia.M3092.
- Galperin, M.Y. & Fernandez-Suarez, X.M., 2012. The 2012 Nucleic Acids Research Database Issue and the online Molecular Biology Database Collection. Nucleic Acids Research, 40(D1), pp.D1–D8. Available at: https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gks1297.
- Gao, W. et al., 2016. Fully integrated wearable sensor arrays for multiplexed in situ perspiration analysis. Nature, 529(7587), pp.509–514. Available at: http://www.ncbi.nlm.nih.gov/pubmed/26819044.
- Garnett, M.J. et al., 2012. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature, 483(7391), pp.570–5. Available at: http://www.nature.com/doifinder/10.1038/nature11005.
- Geeleher, P., Cox, N.J. & Huang, R., 2014. Clinical drug response can be predicted using baseline gene expression levels and in vitro drug sensitivity in cell lines. Genome Biology, 15(3), p.R47. Available at: http://genomebiology.biomedcentral.com/articles/10.1186/gb-2014-15-3-r47.
- Griffith, M. et al., 2013. DGIdb: mining the druggable genome. Nature methods, 10(12), pp.1209–10. Available at: http://www.ncbi.nlm.nih.gov/pubmed/24122041.
- Hagar, Y. et al., 2014. Survival analysis with electronic health record data: Experiments with chronic kidney disease. Statistical Analysis and Data Mining: The ASA Data Science Journal, 7(5), pp.385–403. Available at: http://doi.wiley.com/10.1002/sam.11236.
- Haghi, M., Thurow, K. & Stoll, R., 2017. Wearable Devices in Medical Internet of Things: Scientific Research and Commercially Available Devices. Healthcare Informatics Research, 23(1), p.4. Available at: https://synapse.koreamed.org/DOIx.php?id=10.4258/hir.2017.23.1.4.
- Hall, M.J. et al., 2015. Understanding patient and provider perceptions and expectations of genomic medicine. Journal of Surgical Oncology, 111(1), pp.9–17. Available at: http://doi.wiley.com/10.1002/jso.23712.
- den Hartog, A.W. et al., 2015. The risk for type B aortic dissection in Marfan syndrome. Journal of the American College of Cardiology, 65(3), pp.246–54. Available at: http://www.ncbi.nlm.nih.gov/pubmed/25614422.
- Hayes, D.F., Khoury, M.J. & Ransohoff, D., 2012. Why Hasn’t Genomic Testing Changed the Landscape in Clinical Oncology? American Society of Clinical Oncology educational book. American Society of Clinical Oncology. Meeting, 1, pp.e52-5. Available at: http://www.ncbi.nlm.nih.gov/pubmed/24451831.
- Hayward, J. et al., 2017. Genomics in routine clinical care: what does this mean for primary care? British Journal of General Practice, 67(655), pp.58–59. Available at: http://bjgp.org/lookup/doi/10.3399/bjgp17X688945.
- He, K.Y., Ge, D. & He, M.M., 2017. Big Data Analytics for Genomic Medicine. International journal of molecular sciences, 18(2), p.412. Available at: http://www.mdpi.com/1422-0067/18/2/412.
- Herland, M., Khoshgoftaar, T.M. & Wald, R., 2014. A review of data mining using big data in health informatics. Journal Of Big Data, 1(1), p.2. Available at: http://www.journalofbigdata.com/content/1/1/2.
- Hersh, W., 2009. A stimulus to define informatics and health information technology. BMC medical informatics and decision making, 9(1), p.24. Available at: http://www.biomedcentral.com/1472-6947/9/24.
- Hijmans, R.J. et al., 2005. Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology, 25(15), pp.1965–1978. Available at: http://doi.wiley.com/10.1002/joc.1276.
- Hoyt, R.E., Sutton, M. & Yoshihashi, A., 2009. Medical Informatics Practical Guide for the Healthcare Professional.
- Huser, V. & Cimino, J.J., 2013. Desiderata for healthcare integrated data repositories based on architectural comparison of three public repositories. AMIA ... Annual Symposium proceedings. AMIA Symposium, 2013(1), pp.648–56. Available at: http://www.ncbi.nlm.nih.gov/pubmed/24551366; http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=PMC3900207.
- Hutchinson, S. et al., 2003. Allelic variation in normal human FBN1 expression in a family with Marfan syndrome: a potential modifier of phenotype? Human molecular genetics, 12(18), pp.2269–76. Available at: http://www.ncbi.nlm.nih.gov/pubmed/12915484.
- Iyer, G. et al., 2012. Genome sequencing identifies a basis for everolimus sensitivity. Science (New York, N.Y.), 338(6104), p.221. Available at: http://www.ncbi.nlm.nih.gov/pubmed/22923433.
- Jennings, L. et al., 2009. Recommended principles and practices for validating clinical molecular pathology tests. Archives of pathology & laboratory medicine, 133(5), pp.743–55. Available at: http://www.ncbi.nlm.nih.gov/pubmed/19415949.
- Jensen, P.B., Jensen, L.J. & Brunak, S., 2012. Mining electronic health records: towards better research applications and clinical care. Nature reviews. Genetics, 13(6), pp.395–405. Available at: http://www.nature.com/doifinder/10.1038/nrg3208.
- Kahn, M.G. et al., 2012. A pragmatic framework for single-site and multisite data quality assessment in electronic health record-based clinical research. Medical care, 50 Suppl(0), pp.S21-9. Available at: http://www.ncbi.nlm.nih.gov/pubmed/22692254.
- Kahn, M.G. & Weng, C., 2012. Clinical research informatics: a conceptual perspective. Journal of the American Medical Informatics Association, 19(e1), pp.e36–e42. Available at: http://www.scopus.com/inward/record.url?eid=2-s2.0-84863552740&partnerID=tZOtx3y1.
- Kamesh, D.B.K., Neelima, V. & Ramya Priya, R., 2015. A review of data mining using big data in health informatics. International Journal of Scientific and Research Publications, 5(3), pp.1–7. Available at: http://www.ijsrp.org/research-paper-0315/ijsrp-p3913.pdf.
- Karczewski, K.J., Daneshjou, R. & Altman, R.B., 2012. Chapter 7: Pharmacogenomics F. Lewitter & M. Kann, eds. PLoS Computational Biology, 8(12), p.e1002817. Available at: http://dx.plos.org/10.1371/journal.pcbi.1002817.
- Khatri, P. et al., 2013. A common rejection module (CRM) for acute rejection across multiple organs identifies novel therapeutics for organ transplantation. The Journal of experimental medicine, 210(11), pp.2205–21. Available at: http://www.jem.org/lookup/doi/10.1084/jem.20122709.
- Kouskoumvekaki, I., Shublaq, N. & Brunak, S., 2014. Facilitating the use of large-scale biological data and tools in the era of translational bioinformatics. Briefings in bioinformatics, 15(6), pp.942–52. Available at: https://academic.oup.com/bib/article-lookup/doi/10.1093/bib/bbt055.
- Kovats, R.S. & Hajat, S., 2008. Heat stress and public health: a critical review. Annual review of public health, 29(1), pp.41–55. Available at: http://www.annualreviews.org/doi/10.1146/annurev.publhealth.29.020907.090843.
- Kreso, A. et al., 2013. Variable clonal repopulation dynamics influence chemotherapy response in colorectal cancer. Science (New York, N.Y.), 339(6119), pp.543–8. Available at: http://www.sciencemag.org/cgi/doi/10.1126/science.1227670.
- Ku Jena, R. et al., 2009. Soft Computing Methodologies in Bioinformatics. European Journal of Scientific Research, 26(2), pp.189–203.
- Laakko, T. et al., 2008. Mobile health and wellness application framework. Methods of information in medicine, 47(3), pp.217–22. Available at: http://www.schattauer.de/index.php?id=1214&doi=10.3414/ME9113.
- Lamb, J. et al., 2006. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science (New York, N.Y.), 313(5795), pp.1929–35. Available at: http://www.sciencemag.org/cgi/doi/10.1126/science.1132939.
- Larsen, M.E. et al., 2015. We Feel: Mapping Emotion on Twitter. IEEE Journal of Biomedical and Health Informatics, 19(4), pp.1246–1252. Available at: http://ieeexplore.ieee.org/document/7042256/.
- Lee, J., Kuo, Y.-F. & Goodwin, J.S., 2013. The effect of electronic medical record adoption on outcomes in US hospitals. BMC health services research, 13(1), p.39. Available at: http://www.ncbi.nlm.nih.gov/pubmed/23375071; http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=PMC3568047.
- Lobitz, B. et al., 2000. Climate and infectious disease: use of remote sensing for detection of Vibrio cholerae by indirect measurement. Proceedings of the National Academy of Sciences of the United States of America, 97(4), pp.1438–43. Available at: http://www.ncbi.nlm.nih.gov/pubmed/10677480.
- Londin, E.R. & Barash, C.I., 2015. What is translational bioinformatics? Applied & Translational Genomics, 6, pp.1–2. Available at: http://linkinghub.elsevier.com/retrieve/pii/S2212066115000174.
- Luber, G. & McGeehin, M., 2008. Climate change and extreme heat events. American journal of preventive medicine, 35(5), pp.429–35. Available at: http://www.ncbi.nlm.nih.gov/pubmed/18929969.
- Lussier, Y.A. & Liu, Y., 2007. Computational approaches to phenotyping: high-throughput phenomics. Proceedings of the American Thoracic Society, 4(1), pp.18–25. Available at: http://pats.atsjournals.org/cgi/doi/10.1513/pats.200607-142JG.
- MacKenzie, S.L. et al., 2012. Practices and perspectives on building integrated data repositories: results from a 2010 CTSA survey. Journal of the American Medical Informatics Association, 19(e1), pp.e119–e124. Available at: https://academic.oup.com/jamia/article-lookup/doi/10.1136/amiajnl-2011-000508.
- Marcos, M. et al., 2013. Interoperability of clinical decision-support systems and electronic health records using archetypes: a case study in clinical trial eligibility. Journal of biomedical informatics, 46(4), pp.676–89. Available at: http://dx.doi.org/10.1016/j.jbi.2013.05.004.
- McCauley, M.P. et al., 2017. Genetics and Genomics in Clinical Practice: The Views of Wisconsin Physicians. WMJ, 116(2), pp.69–75.
- McMurry, A.J. et al., 2013. SHRINE: enabling nationally scalable multi-site disease studies. PloS one, 8(3), p.e55811. Available at: http://www.ncbi.nlm.nih.gov/pubmed/23533569.
- Moltchanov, S. et al., 2015. On the feasibility of measuring urban air pollution by wireless distributed sensor networks. The Science of the total environment, 502, pp.537–47. Available at: http://dx.doi.org/10.1016/j.scitotenv.2014.09.059.
- Murphy, S.N. et al., 2010. Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2). Journal of the American Medical Informatics Association : JAMIA, 17(2), pp.124–30. Available at: https://academic.oup.com/jamia/article-lookup/doi/10.1136/jamia.2009.000893.
- Nadkarni, P.M. & Brandt, C., 1998. Data Extraction and Ad Hoc Query of an Entity--Attribute--Value Database. Journal of the American Medical Informatics Association, 5(6), pp.511–527. Available at: https://academic.oup.com/jamia/article-lookup/doi/10.1136/jamia.1998.0050511.
- Nagalla, S. & Bray, P.F., 2016. Personalized medicine in thrombosis: back to the future. Blood, 127(22), pp.2665–2671. Available at: http://linkinghub.elsevier.com/retrieve/pii/S2468171717300029.
- Nunes, M. et al., 2015. Evaluating patient-derived colorectal cancer xenografts as preclinical models by comparison with patient clinical data. Cancer research, 75(8), pp.1560–6. Available at: http://www.ncbi.nlm.nih.gov/pubmed/25712343.
- Okada, Y. et al., 2014. Genetics of rheumatoid arthritis contributes to biology and drug discovery. Nature, 506(7488), pp.376–81. Available at: http://www.ncbi.nlm.nih.gov/pubmed/24390342.
- Oyelade, J. et al., 2015. Bioinformatics, Healthcare Informatics and Analytics: An Imperative for Improved Healthcare System. International Journal of Applied Information Systems, 8(5), pp.1–6. Available at: http://research.ijais.org/volume8/number5/ijais15-451318.pdf.
- Pandey, A.S. & Divyasheesh, V., 2016. Applications of Bioinformatics in Medical Renovation and Research. International Journal of Advanced Research in Computer Science and Software Engineering, 6(3), pp.56–58.
- Payne, P.R.O., Embi, P.J. & Sen, C.K., 2009. Translational informatics: enabling high-throughput research paradigms. Physiological Genomics, 39(3), pp.131–140. Available at: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2789669&tool=pmcentrez&rendertype=abstract.
- Plunkett-Rondeau, J., Hyland, K. & Dasgupta, S., 2015. Training future physicians in the era of genomic medicine: trends in undergraduate medical genetics education. Genetics in medicine : official journal of the American College of Medical Genetics, 17(11), pp.927–34. Available at: http://www.nature.com/doifinder/10.1038/gim.2014.208.
- Poon, C.C.Y. & Zhang, Y.-T., 2008. Perspectives on high technologies for low-cost healthcare. IEEE engineering in medicine and biology magazine : the quarterly magazine of the Engineering in Medicine & Biology Society, 27(5), pp.42–7. Available at: http://www.ncbi.nlm.nih.gov/pubmed/18799389.
- Prahallad, A. et al., 2012. Unresponsiveness of colon cancer to BRAF(V600E) inhibition through feedback activation of EGFR. Nature, 483(7387), pp.100–3. Available at: http://www.nature.com/doifinder/10.1038/nature10868.
- Raghupathi, W. & Raghupathi, V., 2014. Big data analytics in healthcare: promise and potential. Health information science and systems, 2(1), p.3. Available at: http://www.hissjournal.com/content/2/1/3.
- Ram, S. et al., 2015. Predicting Asthma-Related Emergency Department Visits Using Big Data. IEEE Journal of Biomedical and Health Informatics, 19(4), pp.1216–1223. Available at: http://ieeexplore.ieee.org/document/7045443/.
- Ramachandran, A. et al., 2013. Effectiveness of mobile phone messaging in prevention of type 2 diabetes by lifestyle modification in men in India: a prospective, parallel-group, randomised controlled trial. The lancet. Diabetes & endocrinology, 1(3), pp.191–8. Available at: http://dx.doi.org/10.1016/S2213-8587(13)70067-6.
- Ramírez, M.R. et al., 2018. Big Data and Health “Clinical Records.” In Innovation in Medicine and Healthcare 2017. Springer International Publishing AG 2018, pp. 12–18. Available at: http://link.springer.com/10.1007/978-3-319-39687-3.
- Rehm, H.L., 2017. Evolving health care through personal genomics. Nature reviews. Genetics, 18(4), pp.259–267. Available at: http://www.nature.com/doifinder/10.1038/nrg.2016.162.
- Relling, M. V. & Evans, W.E., 2015. Pharmacogenomics in the clinic. Nature, 526(7573), pp.343–350. Available at: http://www.nature.com/doifinder/10.1038/nature15817.
- Richesson, R.L. & Andrews, J.E., 2012. Introduction to Clinical Research Informatics. In Health Informatics. Springer-Verlag London Limited 2012, pp. 3–16. Available at: http://link.springer.com/10.1007/978-1-84882-448-5_1.
- Rose, P.W. et al., 2011. The RCSB Protein Data Bank: redesigned web site and web services. Nucleic Acids Research, 39(Database), pp.D392–D401. Available at: https://academic.oup.com/nar/article-lookup/doi/10.1093/nar/gkq1021.
- Rose, P.W. et al., 2015. The RCSB Protein Data Bank: views of structural biology for basic and applied research and education. Nucleic acids research, 43(Database issue), pp.D345–56. Available at: http://www.ncbi.nlm.nih.gov/pubmed/25428375.
- Ross, M.K., Wei, W. & Ohno-Machado, L., 2014. “Big data” and the electronic health record. Yearbook of medical informatics, 9(1), pp.97–104. Available at: http://www.ncbi.nlm.nih.gov/pubmed/25123728 and http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=PMC4287068.
- Sáez, C. et al., 2017. A Standardized and Data Quality Assessed Maternal-Child Care Integrated Data Repository for Research and Monitoring of Best Practices: A Pilot Project in Spain. Studies in health technology and informatics, 235, pp.539–543. Available at: http://www.ncbi.nlm.nih.gov/pubmed/28423851.
- Safran, C. et al., 2007. Toward a National Framework for the Secondary Use of Health Data: An American Medical Informatics Association White Paper. Journal of the American Medical Informatics Association, 14(1), pp.1–9. Available at: http://jamia.oxfordjournals.org/content/14/1/1.full.
- Sanseau, P. et al., 2012. Use of genome-wide association studies for drug repositioning. Nature biotechnology, 30(4), pp.317–20. Available at: http://www.nature.com/doifinder/10.1038/nbt.2151.
- Scanfeld, D., Scanfeld, V. & Larson, E.L., 2010. Dissemination of health information through social networks: twitter and antibiotics. American journal of infection control, 38(3), pp.182–8. Available at: http://www.ncbi.nlm.nih.gov/pubmed/20347636.
- Schadt, E.E., 2012. The changing privacy landscape in the era of big data. Molecular Systems Biology, 8(612), pp.1–3. Available at: http://msb.embopress.org/cgi/doi/10.1038/msb.2012.47.
- Schaffer, J.D., Dimitrova, N. & Zhang, M., 2006. Bioinformatics. In Advances in Healthcare Technology. Chapter 26, pp. 421–438.
- Semenza, J.C. & Menne, B., 2009. Climate change and infectious diseases in Europe. The Lancet. Infectious diseases, 9(6), pp.365–75. Available at: http://dx.doi.org/10.1016/S1473-3099(09)70104-5.
- Shameer, K. et al., 2017. Translational bioinformatics in the era of real-time biomedical, health care and wellness data streams. Briefings in Bioinformatics, 18(1), pp.105–124. Available at: https://academic.oup.com/bib/article-lookup/doi/10.1093/bib/bbv118.
- Shameer, K., Readhead, B. & Dudley, J.T., 2015. Computational and experimental advances in drug repositioning for accelerated therapeutic stratification. Current topics in medicinal chemistry, 15(1), pp.5–20. Available at: http://www.ncbi.nlm.nih.gov/pubmed/25579574.
- Sheehan, J. et al., 2016. Improving the value of clinical research through the use of Common Data Elements. Clinical Trials, 13(6), pp.671–676. Available at: http://journals.sagepub.com/doi/10.1177/1740774516653238.
- Signorini, A., Segre, A.M. & Polgreen, P.M., 2011. The use of Twitter to track levels of disease activity and public concern in the U.S. during the influenza A H1N1 pandemic. PloS one, 6(5), p.e19467. Available at: http://www.ncbi.nlm.nih.gov/pubmed/21573238.
- Silva, B.M.C. et al., 2015. Mobile-health: A review of current state in 2015. Journal of biomedical informatics, 56, pp.265–72. Available at: http://dx.doi.org/10.1016/j.jbi.2015.06.003.
- Simon, R., 2005. Roadmap for developing and validating therapeutically relevant genomic classifiers. Journal of clinical oncology : official journal of the American Society of Clinical Oncology, 23(29), pp.7332–41. Available at: http://www.ncbi.nlm.nih.gov/pubmed/16145063.
- Sirota, M. et al., 2011. Discovery and preclinical validation of drug indications using compendia of public gene expression data. Science translational medicine, 3(96), p.96ra77. Available at: http://www.ncbi.nlm.nih.gov/pubmed/21849665.
- Sun, J. et al., 2014. Predicting changes in hypertension control using electronic health records from a chronic disease management program. Journal of the American Medical Informatics Association, 21(2), pp.337–344. Available at: https://academic.oup.com/jamia/article-lookup/doi/10.1136/amiajnl-2013-002033.
- Taglang, G. & Jackson, D.B., 2016. Use of “big data” in drug discovery and clinical trials. Gynecologic oncology, 141(1), pp.17–23. Available at: http://dx.doi.org/10.1016/j.ygyno.2016.02.022.
- Tenenbaum, J.D., 2016. Translational Bioinformatics: Past, Present, and Future. Genomics, proteomics & bioinformatics, 14(1), pp.31–41. Available at: http://dx.doi.org/10.1016/j.gpb.2016.01.003.
- Teutsch, S.M. et al., 2009. The Evaluation of Genomic Applications in Practice and Prevention (EGAPP) Initiative: methods of the EGAPP Working Group. Genetics in medicine : official journal of the American College of Medical Genetics, 11(1), pp.3–14. Available at: http://www.ncbi.nlm.nih.gov/pubmed/18813139.
- The Eurowinter Group, 1997. Cold exposure and winter mortality from ischaemic heart disease, cerebrovascular disease, respiratory disease, and all causes in warm and cold regions of Europe. Lancet (London, England), 349(9062), pp.1341–6. Available at: http://www.ncbi.nlm.nih.gov/pubmed/9149695.
- Toh, S. et al., 2011. Comparative-effectiveness research in distributed health data networks. Clinical pharmacology and therapeutics, 90(6), pp.883–7. Available at: http://doi.wiley.com/10.1038/clpt.2011.236.
- Toubiana, L. & Cuggia, M., 2014. Big Data and Smart Health Strategies: Findings from the Health Information Systems Perspective. IMIA Yearbook, 9(1), pp.125–127. Available at: http://www.schattauer.de/en/magazine/subject-areas/journals-a-z/imia-yearbook/archive/issue/1973/manuscript/22305.html.
- Vilardell, M., Civit, S. & Herwig, R., 2013. An integrative computational analysis provides evidence for FBN1-associated network deregulation in trisomy 21. Biology open, 2(8), pp.771–8. Available at: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3744068&tool=pmcentrez&rendertype=abstract.
- Vodopivec-Jamsek, V. et al., 2012. Mobile phone messaging for preventive health care. The Cochrane database of systematic reviews, 12(12), p.CD007457. Available at: http://www.ncbi.nlm.nih.gov/pubmed/23235643.
- Wade, T.D. et al., 2014. Using patient lists to add value to integrated data repositories. Journal of biomedical informatics, 52, pp.72–7. Available at: http://dx.doi.org/10.1016/j.jbi.2014.02.010.
- Wade, T.D., Hum, R.C. & Murphy, J.R., 2011. A Dimensional Bus model for integrating clinical and research data. Journal of the American Medical Informatics Association, 18(Supplement 1), pp.i96–i102. Available at: https://academic.oup.com/jamia/article-lookup/doi/10.1136/amiajnl-2011-000339.
- Walker, K.L. et al., 2014. Using the CER Hub to ensure data quality in a multi-institution smoking cessation study. Journal of the American Medical Informatics Association, 21(6), pp.1129–1135. Available at: http://jamia.oxfordjournals.org/cgi/doi/10.1136/amiajnl-2013-002629 and http://www.scopus.com/inward/record.url?eid=2-s2.0-84929044127&partnerID=40&md5=2c0c1e46853824a8779ebce39d9aabd8.
- Wang, X. & Liotta, L., 2011. Clinical bioinformatics: a new emerging science. Journal of clinical bioinformatics, 1(1), p.1. Available at: http://www.jclinbioinformatics.com/content/1/1/1.
- Weber, G.M. et al., 2011. Direct2Experts: a pilot national network to demonstrate interoperability among research-networking platforms. Journal of the American Medical Informatics Association, 18(Supplement 1), pp.i157–i160. Available at: https://academic.oup.com/jamia/article-lookup/doi/10.1136/amiajnl-2011-000200.
- Weiner, M.G. & Embi, P.J., 2009. Toward reuse of clinical data for research and quality improvement: the end of the beginning? Ann Intern Med, 151(5), pp.359–360.
- Weinstein, J.N. et al., 2013. The Cancer Genome Atlas Pan-Cancer analysis project. Nature genetics, 45(10), pp.1113–20. Available at: http://www.nature.com/ng/journal/v45/n10/abs/ng.2764.html.
- Weiskopf, N.G. & Weng, C., 2013. Methods and dimensions of electronic health record data quality assessment: enabling reuse for clinical research. Journal of the American Medical Informatics Association, 20(1), pp.144–151. Available at: https://academic.oup.com/jamia/article-lookup/doi/10.1136/amiajnl-2011-000681.
- Westfall, J.M., Mold, J. & Fagnan, L., 2007. Practice-based research--“Blue Highways” on the NIH roadmap. JAMA, 297(4), pp.403–6. Available at: http://jama.jamanetwork.com/article.aspx?doi=10.1001/jama.297.4.403.
- Wilhelm, M. et al., 2014. Mass-spectrometry-based draft of the human proteome. Nature, 509(7502), pp.582–7. Available at: http://www.nature.com/doifinder/10.1038/nature13319.
- Wishart, D.S. et al., 2013. HMDB 3.0--The Human Metabolome Database in 2013. Nucleic acids research, 41(Database issue), pp.D801–7. Available at: http://www.ncbi.nlm.nih.gov/pubmed/23161693.
- Wynden, R. et al., 2010. Ontology mapping and data discovery for the translational investigator. AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science, 2010, pp.66–70. Available at: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3041530&tool=pmcentrez&rendertype=abstract.
- Xu, H. et al., 2015. Validating drug repurposing signals using electronic health records: a case study of metformin associated with reduced cancer mortality. Journal of the American Medical Informatics Association : JAMIA, 22(1), pp.179–91. Available at: https://academic.oup.com/jamia/article-lookup/doi/10.1136/amiajnl-2014-002649.
- Zheng, Y.-L. et al., 2014. Unobtrusive Sensing and Wearable Devices for Health Informatics. IEEE Transactions on Biomedical Engineering, 61(5), pp.1538–1554. Available at: http://ieeexplore.ieee.org/document/6756983/.
- Zlotta, A.R., 2013. Words of wisdom: Re: Genome sequencing identifies a basis for everolimus sensitivity. European urology, 64(3), p.516. Available at: http://dx.doi.org/10.1016/j.eururo.2013.06.031.