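In the `awk` below, I am trying to combine all lines with a matching `$4` into a single line, keeping `$5` (up to the `-`) and averaging all the values in `$7`. Why is `awk` complaining that the output is not being produced (i.e. `/home/cmccabe/Desktop/NGS/API/2-12-2015/bedtools/30x/${pref}_genes.txt`)? Thank you :).
Input (`/home/cmccabe/Desktop/NGS/API/2-12-2015/bedtools/30x/*30reads_perbase.txt`)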
chr1 955543 955763 chr1:955543-955763 AGRN-6|gc=75 1 15
chr1 955543 955763 chr1:955543-955763 AGRN-6|gc=75 2 16
chr1 955543 955763 chr1:955543-955763 AGRN-6|gc=75 3 16
chr1 955543 955763 chr1:955543-955763 AGRN-6|gc=75 4 14
chr1 976035 976270 chr1:976035-976270 AGRN-9|gc=74.5 1 28
chr1 976035 976270 chr1:976035-976270 AGRN-9|gc=74.5 2 27
chr1 976035 976270 chr1:976035-976270 AGRN-9|gc=74.5 3 27
Desired output
chr1:955543-955763 4 AGRN 15
chr1:976035-976270 3 AGRN 27
awk
for f in /home/cmccabe/Desktop/NGS/API/2-12-2015/30x/*30reads_perbase.txt ; do bname=`basename "$f"`; pref=${bname%%.txt}; awk '{k=$4 FS $5; a[k]+=$7; c[k]++}
END{for(k in a)
split(k,ks,FS);
print ks[1],c[k],ks[2],a[k]/c[k]}' "$f" > /home/cmccabe/Desktop/NGS/API/2-12-2015/30x/"${pref}"_genes.txt; done
Answer 0 (score: 2)
Use the substr and match functions when printing the variable:
cat | awk '{k=$4 FS $5; a[k]+=$7; c[k]++} END{for(k in a) split(k,ks,FS); print ks[1],c[k],substr(ks[2],1,match(ks[2],"-")-1),a[k]/c[k]}'
chr1:955543-955763 4 AGRN 15.25
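Only one line is printed above because, without braces, the `for` loop in `END` covers only the `split` call, so `print` runs once after the loop finishes. A minimal sketch of the same substr/match approach with the loop body in braces might look like the following. The temporary file, the array names `sum` and `cnt`, the `gene` variable, and the trailing `sort` (added because `for (k in ...)` order is unspecified) are assumptions for illustration, not part of the original answer.

```shell
# Sample rows copied from the question into a throwaway file (hypothetical path).
input=$(mktemp)
cat > "$input" <<'EOF'
chr1 955543 955763 chr1:955543-955763 AGRN-6|gc=75 1 15
chr1 955543 955763 chr1:955543-955763 AGRN-6|gc=75 2 16
chr1 955543 955763 chr1:955543-955763 AGRN-6|gc=75 3 16
chr1 955543 955763 chr1:955543-955763 AGRN-6|gc=75 4 14
chr1 976035 976270 chr1:976035-976270 AGRN-9|gc=74.5 1 28
chr1 976035 976270 chr1:976035-976270 AGRN-9|gc=74.5 2 27
chr1 976035 976270 chr1:976035-976270 AGRN-9|gc=74.5 3 27
EOF

result=$(awk '
{
    k = $4 FS $5      # group key: region ($4) plus annotation ($5)
    sum[k] += $7      # running total of column 7 for this key
    cnt[k]++          # number of rows seen for this key
}
END {
    for (k in sum) {  # braces keep split AND print inside the loop
        split(k, ks, FS)
        # awk strings are 1-based: take the text before the first "-"
        gene = substr(ks[2], 1, match(ks[2], "-") - 1)
        print ks[1], cnt[k], gene, sum[k] / cnt[k]
    }
}' "$input" | sort)

printf '%s\n' "$result"
rm -f "$input"
```

With the sample rows this prints one line per region, `chr1:955543-955763 4 AGRN 15.25` and `chr1:976035-976270 3 AGRN 27.3333`. If whole numbers are wanted as in the desired output, the average could be formatted with `printf` instead of `print`.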