After aligning RNA-seq or exome sequencing data, it is important to compute coverage, and there are several ways to do it. There is one additional interesting question, however: is there any coverage outside the exon regions? For RNA-seq, for example, this can reveal whether previously unknown transcripts are expressed. In this post I will describe how to compute out-of-region coverage from a BAM file.
To perform the out-of-region coverage check, two files are required:
- alignment data in SAM/BAM format. Sample: kidney.bam.
- gene annotations in BED format. Sample: transcripts.human.64.bed.
I will describe two methods for this task. Of course, I recommend the second one, which uses my favorite tool ;)
Using BEDtools
First, we need a file listing the coordinates of the regions that lie outside the genes in the annotation file. It can be created with the complementBed command:
$ ~/tools/BEDTools/bin/complementBed -i transcripts.human.64.bed -g hg19.fa.fai > transcripts.human.64.outside.bed
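To make it clearer what the complement step produces, here is a toy sketch in plain awk. It is not how BEDtools implements it, just an illustration of the idea: walk through the gene intervals and print the gaps between them. The file names and coordinates are made up, and the sketch assumes the intervals are sorted by chromosome and start. Unlike a real tool, it skips chromosomes that have no genes at all.

```shell
# Hypothetical toy inputs: chromosome sizes and sorted gene intervals.
tmpdir=$(mktemp -d)
printf 'chr1\t100\nchr2\t50\n' > "$tmpdir/genome.txt"
printf 'chr1\t10\t20\nchr1\t40\t60\nchr2\t0\t30\n' > "$tmpdir/genes.bed"

outside=$(awk -v OFS='\t' '
  # First file: remember the size of each chromosome.
  NR == FNR { size[$1] = $2 + 0; next }
  # A new chromosome starts: emit the tail of the previous one, then reset.
  $1 != chrom {
    if (chrom != "" && pos < size[chrom]) print chrom, pos, size[chrom]
    chrom = $1; pos = 0
  }
  # Gap between the end of the previous gene and the start of this one.
  $2 + 0 > pos { print chrom, pos, $2 }
  # Advance the cursor to the end of this gene.
  { if ($3 + 0 > pos) pos = $3 + 0 }
  # Tail of the last chromosome.
  END { if (chrom != "" && pos < size[chrom]) print chrom, pos, size[chrom] }
' "$tmpdir/genome.txt" "$tmpdir/genes.bed")

echo "$outside"
rm -rf "$tmpdir"
```

For the toy data this prints four regions: chr1 0-10, chr1 20-40, chr1 60-100 and chr2 30-50, i.e. everything not covered by a gene.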
After creating the out-of-region annotation file, the coverage can be computed:
$ ~/tools/BEDTools/bin/coverageBed -abam kidney.bam -b transcripts.human.64.outside.bed -d | awk '{c+=$5;len+=1}END{print "mappedBases=" c ,"RegionsSize=" len, "meanCoverage=" c/len}'
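If the awk part looks cryptic, here is a small sketch that runs the same aggregation on a hand-made table, so it can be tried without BEDtools or a BAM file. I assume the five-column layout that the -d option gives for a 3-column BED file (chromosome, start, end, position within the region, depth), which is why the depth is read from column 5. The depth values are invented for illustration.

```shell
# Toy per-base depth table: chrom, start, end, position in region, depth.
# Two regions (3 bases and 2 bases) with made-up depths.
depth_table=$(printf 'chr1\t0\t3\t1\t4\nchr1\t0\t3\t2\t6\nchr1\t0\t3\t3\t0\nchr2\t10\t12\t1\t2\nchr2\t10\t12\t2\t8\n')

# Same aggregation as above: sum the depth column, count one base per line,
# and divide to get the mean coverage over all out-of-region bases.
summary=$(printf '%s\n' "$depth_table" |
  awk '{c+=$5; len+=1} END {print "mappedBases=" c, "RegionsSize=" len, "meanCoverage=" c/len}')

echo "$summary"
# prints: mappedBases=20 RegionsSize=5 meanCoverage=4
```

Each input line is one base, so the line count is the total size of the regions, and bases with zero depth still count toward the mean.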
More details about BEDtools can be found here.
Using Qualimap
Qualimap's BAM QC mode can analyze a BAM file within the regions of an annotation file in BED, GFF, or GTF format. It also has an option to run the analysis on the "out of regions" block. The option is called "Analyze outside regions" (-os on the command line); more details are in the documentation. So, here's an example:
$ qualimap bamqc -bam kidney.bam -gff annotations/transcripts.human.64.bed -os --java-mem-size=4G
Qualimap can be downloaded from here.
Well, that's enough for now. Have fun ;)
P.S. Thanks a lot to Tristan Carland for reporting a bug in Qualimap's out-of-region coverage computation. The bug is fixed in version 2.1.