Unix egrep: excluding lines where a specific group does not match

Time: 2018-11-07 17:31:39

Tags: regex unix grep grouping

I have this text file.

Each line holds 6 groups of 5 uppercase letters, separated by spaces:

EWTLE YQTCE FNTMA YVTMB GWTDH QGTAL
UVGEV SPGWP HDAVZ FLRVY HVBFT OFUSG
UKAYH BOAXR BLUSG YRMZT WAIMR BOCCX
BIUCZ KYUPP ECUZI PIURZ MXUMB RDUIG
ANAZW IVAYI QNHFN UPTHC YACTJ QPRLV 

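Using a Unix command-line tool such as egrep, I need to find the lines in which the middle character of every group is the same (here, lines 1 and 4).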

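I tried to use a capture group and invert the result, so that a line is rejected as soon as one of the middle letters differs. But I can't find a way to do it.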

egrep -v '[A-Z]{2}([A-Z])[A-Z]{2}.*[A-Z]{2}§NOT GROUP 1§[A-Z]{2}' filename

How do I fill in the part between the § marks?

My own solution is too long; it just repeats the same thing for every group:

egrep '[A-Z]{2}([A-Z])[A-Z]{2} [A-Z]{2}\1[A-Z]{2} [A-Z]{2}\1[A-Z]{2} [A-Z]{2}\1[A-Z]{2} [A-Z]{2}\1[A-Z]{2} [A-Z]{2}\1[A-Z]{2}' filename

2 answers:

Answer 0: (score: 1)

This is slightly shorter and does the same thing:

[A-Z]{2}([A-Z])[A-Z]{2} ([A-Z]{2}\1[A-Z]{2} ?){5}

You can test it here: https://www.regextester.com/
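
As a quick check, the pattern can also be run against the sample data from the question with grep -E. This is only a minimal sketch: the temporary file and the variable names (sample, same_middle) are illustrative, and it assumes GNU grep, because backreferences such as \1 in extended regular expressions are an extension that POSIX does not specify.

```shell
#!/bin/sh
# Write the sample data from the question to a temporary file
# (the file name is illustrative, not part of the original question).
sample=$(mktemp)
cat > "$sample" <<'EOF'
EWTLE YQTCE FNTMA YVTMB GWTDH QGTAL
UVGEV SPGWP HDAVZ FLRVY HVBFT OFUSG
UKAYH BOAXR BLUSG YRMZT WAIMR BOCCX
BIUCZ KYUPP ECUZI PIURZ MXUMB RDUIG
ANAZW IVAYI QNHFN UPTHC YACTJ QPRLV
EOF

# Group 1 captures the middle letter of the first block; the repeated
# group then requires that same letter (\1) in the middle of the next
# five blocks. The pattern is the one given in this answer, unchanged.
same_middle=$(grep -E '[A-Z]{2}([A-Z])[A-Z]{2} ([A-Z]{2}\1[A-Z]{2} ?){5}' "$sample")

printf '%s\n' "$same_middle"
rm -f "$sample"
```

The pattern is not anchored, so it matches anywhere in a line; adding ^ and $ would make it stricter if the file could contain longer lines.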

Answer 1: (score: 0)

Use the -P option (PCRE):

grep -P '^([A-Z]{2})([A-Z])(?1)(?: (?1)\2(?1)){5}$' file.txt

Output:

EWTLE YQTCE FNTMA YVTMB GWTDH QGTAL
BIUCZ KYUPP ECUZI PIURZ MXUMB RDUIG

Explanation:

^               # beginning of line
  ([A-Z]{2})    # group 1, 2 uppercases
  ([A-Z])       # group 2, 1 uppercase
  (?1)          # same pattern as group 1 (i.e. 2 uppercases)
  (?:           # start non capture group
                # 1 space
    (?1)        # same pattern as group 1 (i.e. 2 uppercases)
    \2          # same content as group 2, same letter
    (?1)        # same pattern as group 1 (i.e. 2 uppercases)
  ){5}          # end group, must appear 5 times
$               # end of line
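
The -P option also offers one way to fill in the § placeholder from the question: a negative lookahead, (?!\1), asserts that the next character is not the letter captured by group 1. The following is a minimal sketch under that assumption; it needs a grep built with PCRE support, and the temporary file and variable names (sample, all_same) are illustrative.

```shell
#!/bin/sh
# Write the sample data from the question to a temporary file
# (the file name is illustrative, not part of the original question).
sample=$(mktemp)
cat > "$sample" <<'EOF'
EWTLE YQTCE FNTMA YVTMB GWTDH QGTAL
UVGEV SPGWP HDAVZ FLRVY HVBFT OFUSG
UKAYH BOAXR BLUSG YRMZT WAIMR BOCCX
BIUCZ KYUPP ECUZI PIURZ MXUMB RDUIG
ANAZW IVAYI QNHFN UPTHC YACTJ QPRLV
EOF

# The pattern matches a line when some later block has a middle letter
# that differs from the one captured in the first block:
#   ' [A-Z]{2}'  a space, then the first two letters of a later block
#   '(?!\1)'     the next letter must NOT be the captured letter
#   '[A-Z]{3}'   the middle letter and the last two letters
# -v inverts the match, keeping only lines where no block differs.
all_same=$(grep -vP '^[A-Z]{2}([A-Z])[A-Z]{2}.* [A-Z]{2}(?!\1)[A-Z]{3}' "$sample")

printf '%s\n' "$all_same"
rm -f "$sample"
```

Note that -v also prints any line that does not have this layout at all (an empty line, for example), so the inverted form is only safe when every line is known to hold six groups.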