Translational Bioinformatics / Integrative Proteomics Research
In high school, I started a research project under Dr. Atul Butte and Dr. Marina Sirota of the Butte Lab at the Stanford University Division of Systems Medicine to elucidate the relationship between protein abundance and gene expression in human tissue. The lab has since moved to the UCSF Institute for Computational Health Sciences.
My work involved high-throughput bioinformatics approaches to mass spectrometry proteomics and RNA sequencing (RNA-Seq) transcriptomics. I conducted a large-scale statistical meta-analysis in R to consolidate and interrogate large, disparate data sets, covering over 49,000,000 molecular data points.
I published this work in May 2016 in Scientific Reports, a journal of the Nature Publishing Group. The full text is available at http://www.nature.com/articles/srep24799 and the PubMed entry at http://www.ncbi.nlm.nih.gov/pubmed/27142790
Article Title: Cross-tissue Analysis of Gene and Protein Expression in Normal and Cancer Tissues
Authors: Idit Kosti, Nishant Jain, Dvir Aran, Atul J. Butte, Marina Sirota
Published: May 4th, 2016
Abstract:
The central dogma of molecular biology describes the translation of genetic information from mRNA to protein, but does not specify the quantitation or timing of this process across the genome. We have analyzed protein and gene expression in a diverse set of human tissues. To study concordance and discordance of gene and protein expression, we integrated mass spectrometry data from the Human Proteome Map project and RNA-Seq measurements from the Genotype-Tissue Expression project. We analyzed 16,561 genes and the corresponding proteins in 14 tissue types across nearly 200 samples. A comprehensive tissue- and gene-specific analysis revealed that across the 14 tissues, correlation between mRNA and protein expression was positive and ranged from 0.36 to 0.5. We also identified 1,012 genes whose RNA and protein expression was correlated across all the tissues and examined genes and proteins that were concordantly and discordantly expressed for each tissue of interest. We extended our analysis to look for genes and proteins that were differentially correlated in cancer compared to normal tissues, showing higher levels of correlation in normal tissues. Finally, we explored the implications of these findings in the context of biomarker and drug target discovery.
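As a rough illustration of the per-tissue comparison the abstract describes, the sketch below computes one mRNA-to-protein correlation per tissue across the genes measured in both data sets. It is not the code from the study: it is written in Python rather than R, it assumes Spearman rank correlation (the abstract does not say which coefficient was used), and the function names, gene labels, and abundance values are all made up for the example.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nan if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return float("nan")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def per_tissue_correlation(mrna, protein, min_genes=3):
    """Correlate mRNA and protein abundance within each tissue.

    Both arguments map tissue -> {gene: abundance}. Only tissues and genes
    present in both inputs are used (a hypothetical layout for this sketch).
    """
    correlations = {}
    for tissue in mrna.keys() & protein.keys():
        genes = sorted(mrna[tissue].keys() & protein[tissue].keys())
        if len(genes) < min_genes:
            continue  # too few shared genes to say anything
        correlations[tissue] = spearman(
            [mrna[tissue][g] for g in genes],
            [protein[tissue][g] for g in genes],
        )
    return correlations


# Toy inputs: invented numbers, not values from either data set.
mrna = {
    "liver": {"GENE_A": 10.0, "GENE_B": 50.0, "GENE_C": 5.0, "GENE_D": 80.0},
    "heart": {"GENE_A": 30.0, "GENE_B": 20.0, "GENE_C": 60.0, "GENE_D": 10.0},
}
protein = {
    "liver": {"GENE_A": 1.2, "GENE_B": 3.4, "GENE_C": 0.8, "GENE_D": 9.1},
    "heart": {"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 3.0, "GENE_D": 1.0},
}

for tissue, rho in sorted(per_tissue_correlation(mrna, protein).items()):
    print(tissue, round(rho, 2))  # heart 0.4, then liver 1.0
```

Working on ranks rather than raw values is a deliberate choice here: RNA-Seq and mass spectrometry report abundance on very different scales, and a rank-based coefficient only asks whether genes that are higher in one measurement also tend to be higher in the other.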
Nishant Jain, Monta Vista High School, Yale University