I have a tab-delimited text file that looks like this:
27 1 hom het:het het,het,het,het
18 1 hom het:het hom,het,het,het,het,het,het
29 1 hom het:het hom,hom,hom,hom,hom,hom,hom,hom,hom,hom,hom,hom,hom,hom
13 1 hom het:het het,het,het,het,het,het
21 1 hom het:het hom,het,het,het,het,het,hom,het,hom,het,het,het,hom
25 1 hom het:het het,hom,het,het,het
29 1 hom het:het hom,hom,het,hom,het,het,hom,het,het,hom,het,hom,het,hom
18 1 hom het:het het,het,het
19 1 hom het:het het,het,hom,het,het,het,het,het,het,hom,het,het,hom,het
I want to exclude the lines that have 'hom' in column 5. That is, the output should look like this:
27 1 hom het:het het,het,het,het
13 1 hom het:het het,het,het,het,het,het
18 1 hom het:het het,het,het
Can anyone help with a Unix command for this?
Answer 0 (score: 5)
Awk is well suited to this (note that the \< and \> word anchors are GNU awk extensions):
$ awk '$5!~/\<hom\>/' file
27 1 hom het:het het,het,het,het
13 1 hom het:het het,het,het,het,het,het
18 1 hom het:het het,het,het
Explanation:
$5 # is the fifth column
!~ # negated regex match
/ # start regex string
\< # matches the empty string at the beginning of a word.
hom # matches the literal string 'hom'
\> # matches the empty string at the end of a word.
/ # end regex string
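If GNU awk is not available, the same filter can be sketched without the word anchors by splitting the fifth column on commas and comparing each entry exactly. The function name filter_no_hom and the array name calls are made up for illustration, and the sketch assumes the columns contain no embedded whitespace, as in the sample data:

```shell
# Hypothetical helper: print only the rows whose 5th column has no "hom" entry.
# Reads the named files, or standard input when no file is given.
filter_no_hom() {
    awk '
    {
        # Break the comma-separated 5th column into individual calls.
        n = split($5, calls, ",")
        for (i = 1; i <= n; i++)
            if (calls[i] == "hom")
                next    # skip this row as soon as one "hom" is found
        print
    }' "$@"
}
```

Calling it as filter_no_hom file should give the same three lines as the one-liner above, since an exact string comparison plays the role of the word-boundary match.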
Answer 1 (score: 0)
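Here is one way using GNU sed (\S, \s and \b are GNU extensions; the pattern is anchored with ^ so that exactly four columns are skipped before the match):
sed -r '/^(\S+\s+){4}\S*\bhom\b/d' file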
Output:
27 1 hom het:het het,het,het,het
13 1 hom het:het het,het,het,het,het,het
18 1 hom het:het het,het,het
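For systems without the GNU extensions, roughly the same idea can be written with POSIX character classes and grep -v. This is only a sketch: the function name filter_no_hom_grep is hypothetical, and it assumes the fifth column is a plain comma-separated list.

```shell
# Hypothetical helper: drop rows whose 5th column contains a "hom" entry.
# Pattern pieces:
#   ^([^[:space:]]+[[:space:]]+){4}  skip the first four columns
#   ([^[:space:]]*,)?                optional earlier entries in column 5
#   hom                              the entry to reject
#   (,|[[:space:]]|$)                entry must end at a comma, whitespace, or end of line
filter_no_hom_grep() {
    grep -Ev '^([^[:space:]]+[[:space:]]+){4}([^[:space:]]*,)?hom(,|[[:space:]]|$)' "$@"
}
```

Usage would be filter_no_hom_grep file. Spelling out the comma and end-of-line delimiters stands in for \b, which POSIX extended regular expressions do not define.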