Genome-wide association study between DSE polymorphism and Poly-A usage in the human population
Hiren Karathia
Sridhar Hannenhalli
Transcription & Polyadenylation (Poly-A)
Objectives
• Genome-wide estimation of alternate Poly-A (PA) usage on 3’UTR
• Genome-wide prediction and investigation of polymorphisms in DSE (Downstream Sequence Element) motifs
• Population-wide correlation study between PA usage and DSE polymorphisms
Annotation status of Poly-A sites on 3’UTR of Human Genome (hg19 – 2009)
37% - Multiple Poly-A points
Target of the analysis
RNA-Seq processing for Human Samples
Sample Fastq files
→ BWA → SAM file
→ Samtools → BAM file
→ Samtools → Merged BAM file
→ Samtools → Sorted BAM file
→ Picard tool → De-duplicated file
→ Samtools → Indexing the BAM
→ Bed tools → Calculate Coverage
→ Python script → Calculate Relative usage of PAs
Symbol | Group of Samples | Male | Female | DNA | RNA
BR | British in England and Scotland | 1 | 1
FI | Finnish in Finland | 1 | 1
UT | Utah residents with Northern and Western European ancestry | 1 | 1
YO | Yoruba in Ibadan, Nigeria | 1 | 1
Differential Expression of UTR
Cuffdiff tools
Python script
De-novo assembly
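The processing steps named on this slide can be sketched as a list of command lines. This is a minimal sketch, not the authors' script: the function name `build_pipeline`, all file names, the choice of `bwa mem`, the merge-before-sort order, and the use of Picard `MarkDuplicates` with `REMOVE_DUPLICATES=true` are assumptions, since the slide names only the tools. The function just builds the strings and executes nothing.

```python
from __future__ import annotations


def build_pipeline(sample: str, fastqs: list[str], reference: str, utr_bed: str) -> list[str]:
    """Return shell command strings for one sample. Nothing is executed here."""
    commands: list[str] = []
    lane_bams: list[str] = []
    for i, fastq in enumerate(fastqs, start=1):
        sam, bam = f"{sample}.{i}.sam", f"{sample}.{i}.bam"
        # Assumption: the slide only says "BWA"; the `mem` algorithm is a guess.
        commands.append(f"bwa mem {reference} {fastq} > {sam}")
        # Convert the SAM output to BAM.
        commands.append(f"samtools view -b -o {bam} {sam}")
        lane_bams.append(bam)

    merged = f"{sample}.merged.bam"
    sorted_bam = f"{sample}.sorted.bam"
    dedup = f"{sample}.dedup.bam"
    commands.append(f"samtools merge {merged} {' '.join(lane_bams)}")
    commands.append(f"samtools sort -o {sorted_bam} {merged}")
    # Assumption: "De-duplicated file" is read as MarkDuplicates with removal enabled.
    commands.append(
        f"java -jar picard.jar MarkDuplicates I={sorted_bam} O={dedup} "
        f"M={sample}.dup_metrics.txt REMOVE_DUPLICATES=true"
    )
    commands.append(f"samtools index {dedup}")
    # -d reports per-base depth for every interval in the 3'UTR BED file.
    commands.append(f"bedtools coverage -a {utr_bed} -b {dedup} -d > {sample}.utr_depth.txt")
    return commands


if __name__ == "__main__":
    for line in build_pipeline("BR1", ["BR1_a.fastq", "BR1_b.fastq"], "hg19.fa", "utr3.bed"):
        print(line)
```

Keeping the commands as data makes it easy to review the order per sample before running anything.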
Genome-wide estimation of alternate Poly-A
(PA) usage on 3’UTR
PA1 Coverage
PA2 Coverage
PA1 Junction
PA2 Junction
Complete UTR coverage
PA1 Usage = [Coverage (Stop codon – PA1 junction) / Distance] / [Coverage (complete 3’UTR) / Distance]
PA2 Usage = [Coverage (Stop codon – PA2 junction) / Distance] / [Coverage (complete 3’UTR) / Distance]
Stop Codon
Cleaved 3’UTR
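The usage ratio on this slide (length-normalised coverage from the stop codon to a PA junction, divided by length-normalised coverage of the complete 3’UTR) can be sketched as follows. The function name `pa_usage`, the per-base depth list, and the choice to return NaN for an uncovered UTR are illustrative assumptions.

```python
from __future__ import annotations

import math


def pa_usage(depth: list[float], pa_offset: int) -> float:
    """Usage of one PA site.

    depth     -- per-base read depth, starting at the stop codon and ending
                 at the end of the complete (longest) 3'UTR
    pa_offset -- number of bases from the stop codon to the PA junction
    """
    if not 0 < pa_offset <= len(depth):
        raise ValueError("pa_offset must lie inside the 3'UTR")
    # Coverage (stop codon to PA junction) / Distance
    segment_mean = sum(depth[:pa_offset]) / pa_offset
    # Coverage (complete 3'UTR) / Distance
    full_mean = sum(depth) / len(depth)
    if full_mean == 0:
        return math.nan  # assumption: an uncovered UTR has undefined usage
    return segment_mean / full_mean


if __name__ == "__main__":
    # Toy UTR: 100 well-covered bases up to PA1, then 100 weakly covered bases up to PA2.
    toy_depth = [10.0] * 100 + [2.0] * 100
    print(pa_usage(toy_depth, 100), pa_usage(toy_depth, 200))
```

With this definition the most distal PA site always has usage 1, and a proximal site that is used more often than the distal one has usage above 1.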
Prediction of DSE
Coding Strand of DNA
Sample A
RNA-Seq
Sample A
DNA-Seq
De-novo assembled 3’UTR fragment
Prediction of DSE motif
Template Strand of DNA
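The slides do not state how DSE motifs are predicted. As a purely illustrative sketch, the code below scans a window of genomic DNA downstream of a cleavage site for G/T-rich runs, since downstream elements are commonly described as GU-rich or U-rich. The function name `find_dse_candidates`, the regular expression, the 9 nt minimum length and the 100 nt window are all assumptions, and only the plus strand is handled.

```python
from __future__ import annotations

import re


def find_dse_candidates(
    genomic_seq: str, cleavage_index: int, window: int = 100, min_len: int = 9
) -> list[tuple[int, int, str]]:
    """Return (distance, length, motif) for each G/T-rich run downstream of the site.

    cleavage_index -- 0-based index of the first base after the cleavage site
    distance       -- 1-based offset of the motif start from the cleavage site
    """
    downstream = genomic_seq[cleavage_index:cleavage_index + window].upper()
    pattern = re.compile(r"[GT]{%d,}" % min_len)  # assumption: G/T runs stand in for a DSE
    return [(m.start() + 1, len(m.group()), m.group()) for m in pattern.finditer(downstream)]


if __name__ == "__main__":
    toy = "ACGT" * 5 + "AAA" + "GTGTTTGTGTT" + "ACAC"
    print(find_dse_candidates(toy, cleavage_index=20))
```

Running the same scan on each sample's own DNA sequence would give per-sample motif calls that can then be compared for polymorphism.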
Frequency of Poly-A usage in the samples
Correlation of different PA usage in a Human
Sample
PA1 – PA2
PA2 – PA3
r = -0.643; p = 0.0
r = -0.182; p = 1.06e-33
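The slide reports r and p values without naming the statistic. Assuming a Pearson correlation between the usage values of two PA sites across genes, r can be sketched in plain Python as below; the function name `pearson_r` and the toy numbers are illustrative and are not data from the study.

```python
from __future__ import annotations

import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant list")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Toy usage values for PA1 and PA2 over five hypothetical genes.
    pa1 = [1.6, 1.2, 1.9, 1.1, 1.4]
    pa2 = [0.7, 0.9, 0.5, 1.0, 0.8]
    print(round(pearson_r(pa1, pa2), 3))
```

If SciPy is available, `scipy.stats.pearsonr` returns the coefficient together with a p-value.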
Correlation of PA usage and corresponding
DSE polymorphism
Functional enrichment of genes associated with differential PA usage and polymorphic DSEs in the population
Thank you !!
Differential Expression of complete 3’UTR
Inter/Intra group correlation of a PA usage
r = 0.8; p = 0.0
r = 0.8; p = 0.0
r = 0.98; p = 0.0
PA1 usage
BR1 – BR2
FN1 – FN2
BR1 – FN1
Statistics of predicted DSE motifs
Sample | PA type | Mean(Motif Length) | Max(Motif Length) | Min(Motif Length) | Mean(Distance) | Max(Distance) | Min(Distance)
BR-1 | Single | 12 | 79 | 9 | 30 | 89 | 1
BR-1 | Multiple | 12 | 52 | 9 | 34 | 89 | 1
BR-2 | Single | 12 | 62 | 9 | 31 | 89 | 1
BR-2 | Multiple | 12 | 52 | 9 | 34 | 89 | 1
FN-1 | Single | 12 | 90 | 9 | 35 | 89 | 1
FN-1 | Multiple | 12 | 54 | 9 | 39 | 89 | 1
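Summary rows like the ones on this slide can be derived from a list of predicted motifs. The sketch below assumes each prediction is a (motif length, distance) pair and that means are rounded to whole numbers, as the integer values on the slide suggest; the function name `summarize_motifs` and the toy input are hypothetical.

```python
from __future__ import annotations

from statistics import mean


def summarize_motifs(motifs: list[tuple[int, int]]) -> dict[str, int]:
    """Mean/max/min of motif length and distance for one sample and PA type."""
    if not motifs:
        raise ValueError("no motifs to summarize")
    lengths = [length for length, _ in motifs]
    distances = [distance for _, distance in motifs]
    return {
        "mean_length": round(mean(lengths)),  # assumption: means are rounded
        "max_length": max(lengths),
        "min_length": min(lengths),
        "mean_distance": round(mean(distances)),
        "max_distance": max(distances),
        "min_distance": min(distances),
    }


if __name__ == "__main__":
    toy_predictions = [(9, 1), (12, 30), (15, 89)]  # toy values, not study data
    print(summarize_motifs(toy_predictions))
```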
Find Polymorphism in the DSEs
Find Correlation between the PA-usage and
DSE polymorphism
Pending
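One simple way to approach the first pending step is to compare the DSE sequence found at the same site in each sample and report the positions where they differ. This is only a sketch under strong assumptions: the sequences are already aligned and of equal length, and the function name `polymorphic_positions` and the sample sequences are hypothetical.

```python
from __future__ import annotations


def polymorphic_positions(dse_by_sample: dict[str, str]) -> list[int]:
    """Return 0-based positions at which the samples' DSE sequences differ."""
    sequences = [seq.upper() for seq in dse_by_sample.values()]
    if len(sequences) < 2:
        raise ValueError("need at least two samples to call a polymorphism")
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    # zip(*sequences) walks the alignment column by column.
    return [i for i, column in enumerate(zip(*sequences)) if len(set(column)) > 1]


if __name__ == "__main__":
    toy = {"BR1": "GTGTTTGTG", "BR2": "GTGTTTGTG", "FN1": "GTGTCTGTG"}
    print(polymorphic_positions(toy))
```

A site with a non-empty result could then be coded as polymorphic and compared against the PA usage of the corresponding gene across samples.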
Alternate Poly-A selection mechanism
Complete 3’UTR coverage
VS
Alternate 3’UTR coverage
Differential expression of complete 3’UTR usage
Differential expression of PA Usage
Poly Adenylation Usage on 3’UTR
PA1 Coverage
PA2 Coverage
PA1 Junction
PA2 Junction
Complete UTR coverage
Relative PA1 Usage = PA1 Coverage / Longest UTR Coverage
Relative PA2 Usage = PA2 Coverage / Longest UTR Coverage
Stop Codon
Intron
Cleaved 3’UTR
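The relative-usage ratio on this slide divides each PA site's coverage by the coverage of the longest UTR. A minimal sketch, assuming the coverage values have already been computed (for example as mean read depth); the function name `relative_pa_usage` and the numbers are illustrative.

```python
from __future__ import annotations


def relative_pa_usage(pa_coverage: dict[str, float], longest_utr_coverage: float) -> dict[str, float]:
    """Divide each PA site's coverage by the coverage of the longest 3'UTR."""
    if longest_utr_coverage <= 0:
        raise ValueError("longest UTR coverage must be positive")
    return {site: cov / longest_utr_coverage for site, cov in pa_coverage.items()}


if __name__ == "__main__":
    print(relative_pa_usage({"PA1": 50.0, "PA2": 20.0}, longest_utr_coverage=100.0))
```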
[Figure: read-strand diagram. Labels: + strand, - strand, Gene Strand, Template Strand, RNA Strand, DNA Strand, + Read, - Read]
Locations of annotated multiple PA locations on 3’UTR
PA1 Junction
PA2 Junction
Stop Codon
Cleaved 3’UTR
PA1 Junction
PA2 Junction
Stop Codon
PAs on same exon
PAs on multiple exons
r = 0.2578; p = 8.44e-111
Poly-A Location
Length of 3’UTR