#ComputationalMotifAnalysis
Computational investigation of the sequence context of arginine/glycine-rich motifs in the human proteome - BMC Genomics
Arginine-glycine (RG)-rich motifs are among the most prevalent RNA-binding elements within intrinsically disordered regions (IDRs) of proteins and play crucial roles in RNA metabolism, gene regulation, and the formation of membraneless organelles via liquid-liquid phase separation (LLPS). Despite their biological relevance and their implication in neurological disorders and cancer, the sequence features and context dependencies that define functional RG motifs remain poorly characterized, owing to their disordered nature and sequence variability. In this study, we present a computational framework to dissect the sequence and structural context of RG motifs across the human proteome. By contrasting a functionally defined positive dataset (enriched for RNA-binding and phase-separating proteins) with a negative dataset of RG motif proteins lacking these annotations, we identified distinct compositional and contextual signatures. RG motifs in the positive dataset are enriched in phenylalanine, tyrosine, aspartic acid, and asparagine, both within and around the motif, and show nonrandom spatial relationships with structured RNA-binding domains. Notably, phenylalanine and tyrosine exhibit divergent positional and functional profiles, suggesting distinct mechanistic roles. Our analysis highlights the potential of sequence-based approaches to uncover functional determinants in disordered protein regions and advances our understanding of the properties of RG motifs, offering a transferable framework for the study of other low-complexity motifs.
bmcgenomics.biomedcentral.com
October 8, 2025 at 9:01 AM
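The positive-versus-negative comparison described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the motif definition (two or more "RG" dipeptides separated by at most four residues), the flank width, the pseudocount, and all function names are assumptions made here for illustration only.

```python
import math
import re
from collections import Counter

# Assumed, simplified motif definition: two or more "RG" dipeptides,
# each separated by at most 4 residues. The paper's definition may differ.
RG_MOTIF = re.compile(r"RG(?:[A-Z]{0,4}?RG)+")


def find_rg_motifs(seq):
    """Return (start, end) spans of RG-rich stretches in a protein sequence."""
    return [m.span() for m in RG_MOTIF.finditer(seq)]


def context_counts(seq, spans, flank=10):
    """Count residues inside each motif plus `flank` residues on either side.

    Windows of neighbouring motifs may overlap, in which case shared
    residues are counted once per motif.
    """
    counts = Counter()
    for start, end in spans:
        counts.update(seq[max(0, start - flank):min(len(seq), end + flank)])
    return counts


def residue_fractions(seqs, flank=10):
    """Pool motif-context counts over many sequences and normalise to fractions."""
    counts = Counter()
    for seq in seqs:
        counts.update(context_counts(seq, find_rg_motifs(seq), flank))
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()} if total else {}


def log2_enrichment(pos_seqs, neg_seqs, residues="FYDN", flank=10, pseudo=1e-3):
    """log2 ratio of residue fractions (positive set over negative set).

    `pseudo` is an arbitrary pseudocount that avoids division by zero.
    """
    pos = residue_fractions(pos_seqs, flank)
    neg = residue_fractions(neg_seqs, flank)
    return {
        aa: math.log2((pos.get(aa, 0.0) + pseudo) / (neg.get(aa, 0.0) + pseudo))
        for aa in residues
    }


if __name__ == "__main__":
    # Toy sequences, not real proteins.
    positive = ["AAARGGRGGFRGYAAA"]
    negative = ["AAARGGRGGARGAAAA"]
    print(find_rg_motifs(positive[0]))
    print(log2_enrichment(positive, negative))
```

A positive value for a residue means it is more frequent in and around motifs of the positive set than of the negative set. A real analysis would also need a statistical test and a correction for multiple comparisons, both of which this sketch omits.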