String matching with awk

Time: 2014-04-24 02:26:40

Tags: regex linux awk substring

I have a tab-delimited file containing the following lines:

field1 field2 field3 field4 field5 field6
1 abc 2 word:add,word:remove text string
2 xyz 2 word:replace,word:modify msg string
3 lmn 1 word:add msg numeric
4 cncn 2 phone:add,phone: remove msg numeric
5 lmn 2 word:add msg text

I want to write an awk program/one-liner that gives me the lines where:

field3 == 2 and field4 contains either "add" or "remove"

In other words, it should first select these lines:

1 abc 2 word:add,word:remove text string
2 xyz 2 word:replace,word:modify msg string
4 cncn 2 phone:add,phone:remove msg numeric
5 lmn 2 word:add msg text

and in a second step narrow them down to these:

1 abc 2 word:add,word:remove text string
4 cncn 2 phone:add,phone:remove msg numeric
5 lmn 2 word:add msg text

I can get the first step right with: cat test.tsv | awk -F '\t' '$3 == 2'

How do I match the substring for the second part? Thanks in advance.

1 Answer:

Answer 0: (score: 3)

You can use the ~ operator to match a field against a regular expression:

awk -F '\t' '$3==2 && $4 ~ /add|remove/' filename

which produces the expected result:

1 abc 2 word:add,word:remove text string
4 cncn 2 phone:add,phone: remove msg numeric
5 lmn 2 word:add msg text
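To try this without an existing file, the following is a minimal sketch that rebuilds the sample data from the question and runs the same filter. The temporary file and the variable names (tsv, result) are only illustrative, and the fourth row is written as phone:add,phone:remove, as in the question's expected output.

```shell
# Recreate the sample data as a tab-separated file.
# printf reuses its format string for every group of six arguments.
tsv=$(mktemp)
printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
  field1 field2 field3 field4 field5 field6 \
  1 abc 2 word:add,word:remove text string \
  2 xyz 2 word:replace,word:modify msg string \
  3 lmn 1 word:add msg numeric \
  4 cncn 2 phone:add,phone:remove msg numeric \
  5 lmn 2 word:add msg text > "$tsv"

# Keep rows whose third field equals 2 and whose fourth field
# contains "add" or "remove" anywhere in it.
result=$(awk -F '\t' '$3 == 2 && $4 ~ /add|remove/' "$tsv")
printf '%s\n' "$result"

rm -f "$tsv"
```

This should print rows 1, 4 and 5. The header row is skipped because "field3" is not equal to 2. Note that /add|remove/ is an unanchored match, so it would also accept a value such as word:address.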

Quoting from the manual:

   ~ !~        Regular  expression match, negated match.  NOTE: Do not use
               a constant regular expression (/foo/) on the left-hand side
               of  a  ~  or !~.  Only use one on the right-hand side.  The
               expression /foo/ ~ exp has  the  same  meaning  as  (($0  ~
               /foo/) ~ exp).  This is usually not what was intended.
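As a small illustration of the two operators described in that excerpt, here is a sketch that applies ~ and !~ to the same rows. The three-row data set and the variable names are hypothetical and are not taken from the question's file. The regular expression stays on the right-hand side in both cases, as the manual advises.

```shell
# Hypothetical sample: four tab-separated fields per row.
data=$(printf '%s\t%s\t%s\t%s\n' \
  1 abc 2 word:add \
  2 xyz 2 word:replace \
  3 lmn 1 word:add)

# ~ keeps rows where field 4 matches the pattern;
# !~ keeps rows where field 4 does not match it.
matched=$(printf '%s\n' "$data" | awk -F '\t' '$3 == 2 && $4 ~ /add|remove/ { print $1 }')
unmatched=$(printf '%s\n' "$data" | awk -F '\t' '$3 == 2 && $4 !~ /add|remove/ { print $1 }')

printf 'matched: %s\nunmatched: %s\n' "$matched" "$unmatched"
```

Row 3 appears in neither result because its third field is 1, so the $3 == 2 test rejects it before the regular expression is considered.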