awk: print rows with a specific number of characters in a column

Time: 2017-02-27 03:52:31

Tags: linux awk

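I have a file with many columns and rows, and I want to remove every row that has more than one character in the fourth or fifth column.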

Input:

--- 22:16050115:G:A 16050115 GGG A
--- 22:16050213:C:T 16050213 C T
--- 22:16050319:C:T 16050319 C T
--- 22:16050527:C:A 16050527 C AAA
--- 22:16050568:C:A 16050568 CC A
--- 22:16050607:G:A 16050607 G A
--- 22:16050627:G:T 16050627 G TGG
--- 22:16050646:G:T 16050646 G T
--- 22:16050655:G:A 16050655 GTAA A
...

Desired output:

--- 22:16050213:C:T 16050213 C T
--- 22:16050319:C:T 16050319 C T
--- 22:16050607:G:A 16050607 G A
--- 22:16050646:G:T 16050646 G T
...

Thanks a lot.

1 answer:

Answer 0 (score: 4)

awk 'length($4)==1 && length($5)==1' inputfile
--- 22:16050213:C:T 16050213 C T
--- 22:16050319:C:T 16050319 C T
--- 22:16050607:G:A 16050607 G A
--- 22:16050646:G:T 16050646 G T

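This checks the length of $4 and $5 using awk's length() function, compared with the == operator. You can change the operator to <, >, and so on. Because a pattern with no action prints the matching line, the command above prints only the lines in which the 4th and 5th columns each contain exactly one character.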
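As a rough sketch of swapping in other comparison operators, the snippet below runs two variations against a few rows copied from the question. The file name sample.txt and the two-character threshold are assumptions made for illustration only; they do not come from the original post.

```shell
# Build a small test file from rows in the question (sample.txt is a hypothetical name).
cat > sample.txt <<'EOF'
--- 22:16050115:G:A 16050115 GGG A
--- 22:16050213:C:T 16050213 C T
--- 22:16050527:C:A 16050527 C AAA
--- 22:16050568:C:A 16050568 CC A
EOF

# Variation 1: use <= to keep rows where both columns have at most two characters.
awk 'length($4) <= 2 && length($5) <= 2' sample.txt

# Variation 2: invert the original filter to show only the rows it would drop,
# i.e. rows where either column has more than one character.
awk 'length($4) > 1 || length($5) > 1' sample.txt
```

With this sample, the first command should keep the "C T" and "CC A" rows, and the second should list the "GGG A", "C AAA" and "CC A" rows. Note the switch from && to || in the inverted filter: a row is rejected as soon as either column is too long.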