The Human RBPome: From Genes and Proteins to Human Disease
Abstract
RNA Binding Proteins (RBPs) play a central role in mediating post-transcriptional regulation of genes. However, comparatively little is known about RBPs and their regulatory mechanisms. In this study, we construct a repertoire of 1344 genes encoding RBPs identified from several experimental studies and present a comprehensive analysis of their characteristics at a global scale. The domain architecture of RBPs enabled us to classify them into three groups: Classical (29%), Non-classical (19%) and Unclassified (52%). The high proportion of proteins with unclassified domains indicates the presence of various uncharacterized motifs that can potentially bind RNA. In addition, the enrichment of various unconventional superfamilies suggests that RBPs could form an integral part of the cellular architecture. RBPs were found to be more disordered than non-RBPs (p<2.2e-16, Fisher's exact test), indicating a dynamic regulatory role for RBPs in cellular functioning. Evolutionary analysis across 62 species showed that RBPs are more highly conserved than non-RBPs (p<2.2e-16, Wilcoxon test), reflecting the conservation of biological processes such as mRNA splicing and ribosome biogenesis. Expression patterns of RBPs from the Human Proteome Map revealed that the majority (~60%) of RBPs are tissue-specific. Additionally, non-classical proteins were found to be more highly expressed than classical proteins (p<0.05, Wilcoxon test) in ~50% of the tissues. RBPs were also strongly associated with several neurological disorders, cancer and inflammatory diseases. Further, anatomical contexts such as B cells, T cells, fetal liver and fetal brain were found to be enriched, implying a prominent role for RBPs in mediating immune responses and in different developmental stages. These analyses are made accessible to researchers in the form of a database called the RNA Binding Protein Expression and Disease Dynamics database (READ DB).
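As a rough illustration of the kind of 2x2 comparison that underlies a result such as the disorder enrichment above, the following minimal sketch computes a two-sided Fisher's exact test in plain Python. The counts are toy values chosen for illustration only, and the function name and variable names are hypothetical; the abstract does not give the actual counts, the disorder predictor or the software used in the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely
    # than the observed one (small tolerance for float comparison).
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Toy counts (assumed, NOT taken from the study):
rbp_disordered, rbp_ordered = 8, 2
non_rbp_disordered, non_rbp_ordered = 1, 5

p = fisher_exact_two_sided(
    rbp_disordered, rbp_ordered, non_rbp_disordered, non_rbp_ordered
)
print(f"two-sided p-value: {p:.4f}")  # roughly 0.035 for these toy counts
```

With proteome-scale counts the same calculation can yield p-values far below the 2.2e-16 reporting threshold quoted above.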