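I have a file with many columns and rows, and I want to remove every row that has more than one character in the fourth or fifth column.
Input: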
--- 22:16050115:G:A 16050115 GGG A
--- 22:16050213:C:T 16050213 C T
--- 22:16050319:C:T 16050319 C T
--- 22:16050527:C:A 16050527 C AAA
--- 22:16050568:C:A 16050568 CC A
--- 22:16050607:G:A 16050607 G A
--- 22:16050627:G:T 16050627 G TGG
--- 22:16050646:G:T 16050646 G T
--- 22:16050655:G:A 16050655 GTAA A
...
Desired output:
--- 22:16050213:C:T 16050213 C T
--- 22:16050319:C:T 16050319 C T
--- 22:16050607:G:A 16050607 G A
--- 22:16050646:G:T 16050646 G T
...
Thank you very much.
Answer 0 (score: 4)
awk 'length($4)==1 && length($5)==1' inputfile
--- 22:16050213:C:T 16050213 C T
--- 22:16050319:C:T 16050319 C T
--- 22:16050607:G:A 16050607 G A
--- 22:16050646:G:T 16050646 G T
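This uses awk's length() function to check the length of $4 and $5, comparing each with the == operator. You can change the operator to <, >, and so on. The command above therefore prints only the lines that have exactly one character in both column 4 and column 5.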
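As a minimal sketch of swapping the comparison operator, the snippet below runs the same filter and its inverse on a few of the rows from the question. The temporary file and the variable names (sample, kept, dropped) are assumptions made for illustration, and the inverse filter assumes every row has at least five whitespace-separated fields.

```shell
# Hypothetical sample data: four rows copied from the question into a temp file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
--- 22:16050115:G:A 16050115 GGG A
--- 22:16050213:C:T 16050213 C T
--- 22:16050527:C:A 16050527 C AAA
--- 22:16050607:G:A 16050607 G A
EOF

# Keep rows where column 4 and column 5 are each exactly one character long.
kept=$(awk 'length($4) == 1 && length($5) == 1' "$sample")

# Inverse with the > operator: rows where either column is longer than one character.
dropped=$(awk 'length($4) > 1 || length($5) > 1' "$sample")

printf 'kept:\n%s\n' "$kept"
printf 'dropped:\n%s\n' "$dropped"

rm -f "$sample"
```

Printing the dropped rows alongside the kept ones can be a quick way to check that the filter removes only what you expect before overwriting anything.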