awk to update a file if a value is within a range

Time: 2016-03-08 21:35:43

Tags: awk

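I have a file2 in which some of the values before the - in $5 are "unknown". What I am trying to do is use the text in $2 of file1 to update those "unknown" values in file2. $1 of file1 holds a range of numbers, and an "unknown" should be updated if $4 of file2 falls within that range. I am really not sure where to start, but maybe awk is a start, or maybe there is a better way. Thank you :).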

file1 (`$1` `$2`)

chr6:3224495-3227968 TUBB2B
chr16:89988417-90002505 TUBB3

file2

chr16   89985657    89986630    chr16:89985657-89986630 MC1R-2270|gc=63.5
chr16   89989779    89989898    chr16:89989779-89989898 unknown-2271|gc=73.9
chr16   89998969    89999097    chr16:89998969-89999097 unknown-2272|gc=57
chr16   89999866    89999996    chr16:89999866-89999996 unknown-2273|gc=55.4
chr16   90001127    90002222    chr16:90001127-90002222 unknown-2274|gc=63.9
chr17   1173848 1174575 chr17:1173848-1174575   BHLHA9-3|gc=78.7

Desired output (each unknown is updated to TUBB3 because the $4 value is within the range of $1 in file1)

chr16   89985657    89986630    chr16:89985657-89986630 MC1R-2270|gc=63.5
chr16   89989779    89989898    chr16:89989779-89989898 TUBB3-2271|gc=73.9
chr16   89998969    89999097    chr16:89998969-89999097 TUBB3-2272|gc=57
chr16   89999866    89999996    chr16:89999866-89999996 TUBB3-2273|gc=55.4
chr16   90001127    90002222    chr16:90001127-90002222 TUBB3-2274|gc=63.9
chr17   1173848 1174575 chr17:1173848-1174575   BHLHA9-3|gc=78.7

1 answer:

Answer 0 (score: 2)

awk to the rescue!

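$ awk -v OFS='\t' '
    # file1: remember the range and gene name, keyed by chromosome
    NR==FNR { split($1, a, /[:-]/)
              rstart[a[1]] = a[2]
              rend[a[1]]   = a[3]
              value[a[1]]  = $2
              next }
    # file2: rename "unknown" when $2..$3 lies inside the stored range;
    # the opening brace must stay on the same line as the pattern
    $5 ~ /unknown/ && $2 >= rstart[$1] && $3 <= rend[$1] {
        sub(/unknown/, value[$1], $5) }
    1' file1 file2 |
  column -t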

chr16  89985657  89986630  chr16:89985657-89986630  MC1R-2270|gc=63.5
chr16  89989779  89989898  chr16:89989779-89989898  TUBB3-2271|gc=73.9
chr16  89998969  89999097  chr16:89998969-89999097  TUBB3-2272|gc=57
chr16  89999866  89999996  chr16:89999866-89999996  TUBB3-2273|gc=55.4
chr16  90001127  90002222  chr16:90001127-90002222  TUBB3-2274|gc=63.9
chr17  1173848   1174575   chr17:1173848-1174575    BHLHA9-3|gc=78.7

Modifying $5 changes the original spacing of those lines, so the output is piped to column -t to get a tabular format.
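The answer keys its arrays by chromosome only, so if file1 held more than one range for the same chromosome, the later line would overwrite the earlier one. The following is a minimal sketch of one way around that, assuming file1 may list several ranges per chromosome. The temporary directory and the extra `chr16:100-200 EXAMPLE` line are hypothetical and added only to show the effect; the remaining rows are the sample data from the question.

```shell
#!/bin/sh
# Sketch only: keep every range per chromosome instead of just the last one.
tmp=$(mktemp -d)

# file1 from the question plus one made-up range (EXAMPLE) on chr16
cat > "$tmp/file1" <<'EOF'
chr6:3224495-3227968 TUBB2B
chr16:89988417-90002505 TUBB3
chr16:100-200 EXAMPLE
EOF

# file2 from the question
cat > "$tmp/file2" <<'EOF'
chr16   89985657    89986630    chr16:89985657-89986630 MC1R-2270|gc=63.5
chr16   89989779    89989898    chr16:89989779-89989898 unknown-2271|gc=73.9
chr16   89998969    89999097    chr16:89998969-89999097 unknown-2272|gc=57
chr16   89999866    89999996    chr16:89999866-89999996 unknown-2273|gc=55.4
chr16   90001127    90002222    chr16:90001127-90002222 unknown-2274|gc=63.9
chr17   1173848 1174575 chr17:1173848-1174575   BHLHA9-3|gc=78.7
EOF

awk -v OFS='\t' '
NR == FNR {                          # file1: store each range under (chromosome, index)
    split($1, a, /[:-]/)
    n[a[1]]++
    rstart[a[1], n[a[1]]] = a[2]
    rend[a[1], n[a[1]]]   = a[3]
    value[a[1], n[a[1]]]  = $2
    next
}
$5 ~ /^unknown/ {                    # file2: try every range stored for this chromosome
    for (i = 1; i <= n[$1]; i++)
        if ($2 + 0 >= rstart[$1, i] + 0 && $3 + 0 <= rend[$1, i] + 0) {
            sub(/^unknown/, value[$1, i], $5)
            break                    # first containing range wins
        }
}
{ $1 = $1; print }                   # rebuild every record so all lines use OFS
' "$tmp/file1" "$tmp/file2" > "$tmp/out"

cat "$tmp/out"
```

Because every record is rebuilt with `$1 = $1`, the result is uniformly tab-separated and `column -t` becomes optional; it can still be appended if aligned columns are preferred.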