Thursday, March 8, 2012

Sum up a column in a tab delimited file

Suppose we have a tab-delimited file:

kokonech@ultor:~/playgrnd/densityAnalysis/density_repeats.64.LTRs.bed$ head 24h-i-input.bam.bed
1 83886031 83886750 ERVL-E-int 980 -1 N 187 LTRs 1 0
1 83886031 83886750 ERVL-E-int 980 -1 N 187 LTRs 2 0
1 83886031 83886750 ERVL-E-int 980 -1 N 187 LTRs 3 0
1 83886031 83886750 ERVL-E-int 980 -1 N 187 LTRs 4 0
1 83886031 83886750 ERVL-E-int 980 -1 N 187 LTRs 5 0
1 83886031 83886750 ERVL-E-int 980 -1 N 187 LTRs 6 0
1 83886031 83886750 ERVL-E-int 980 -1 N 187 LTRs 7 0
1 83886031 83886750 ERVL-E-int 980 -1 N 187 LTRs 8 0
1 83886031 83886750 ERVL-E-int 980 -1 N 187 LTRs 9 0
1 83886031 83886750 ERVL-E-int 980 -1 N 187 LTRs 10 0


We want to sum up a column, let's say the 11th.
A piece of cake using Awk:

awk -F'\t' '{a+=$11}END{printf "%i\n",a}' 24h-i-input.bam.bed
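
The same idea can be wrapped in a small reusable helper. This is only a sketch: the function name sum_col and the file sample.tsv are made up for illustration, and the sample data is not taken from the real BED file. The column number is passed in with -v, and -F'\t' tells awk to split strictly on tabs, which matters if a field is empty or contains spaces.

```shell
# Hypothetical helper: sum_col COLUMN FILE
# Sums the given column of a tab-delimited file and prints the total as an integer.
sum_col() {
    awk -F'\t' -v col="$1" '{ total += $col } END { printf "%d\n", total }' "$2"
}

# Made-up tab-delimited sample with five columns, for illustration only.
printf '1\t100\t200\tA\t3\n1\t300\t400\tB\t4\n1\t500\t600\tC\t5\n' > sample.tsv

# Sum the 5th column (3 + 4 + 5).
sum_col 5 sample.tsv
```

Note that "%d" (like "%i") truncates the result to an integer, so for a column of fractional values a format such as "%f" would be the safer choice.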