How can I find matches for multiple keys and extract their values when they occur on two consecutive lines?

Time: 2019-11-14 14:46:30

Tags: bash awk sed grep

I have a file containing lines like the following:

key1=value1 AND key2=value2 followed by some other text
key3 {value3} some text key4 - value4 and key5 - value5

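Is it possible to extract values 1, 3, 4 and 5 and print them? Note that a match should only be considered when the two consecutive lines together contain all of the matches. If it makes things easier, I know the keys I am looking for.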

Output (or something similar):

key1 = value1, key3 = value3, key4 = value4, key5 = value5

Example 1:

abc = 12ty3 AND jfk = 345 followed by some other text
klm {678er} some text plr - 567 and deg - 345

Output:

abc = 12ty3, klm = 678er, plr = 567, deg = 345

Example 2:

xyz-232  abc = 126y3 AND jfk = 567 followed by some other text dre {567x}
klm {rtyyr} some text plr - 444 and deg - 555 some text 345 = uut

Output:

abc = 126y3, klm = rtyyr, plr = 444, deg = 555 

2 answers:

Answer 0 (score: 1)

Just match it with a suitable regular expression.

For example, with GNU sed (for a POSIX-ish sed, just replace \+ with \{1,\}):

sed 'N;s/\([^ ]*[ ]\+\)\{0,1\}\([^ =]\+\)[ ]*=[ ]\{0,1\}\([^ ]\+\) [^ ]* \([^ =]\+\)[ ]*=[ ]*\([^ ]\+\)[^\n]*\n\([^ ]\+\) {\([^}]\+\)}.* [^ ]\+ - [^ ]\+ .* \([^ ]\+\) - \([^ ]\+\).*/\2 = \3, \4 = \5, \6 = \7, \8 = \9/' <<EOF
key1=value1 AND key2=value2 followed by some other text
key3 {value3} some text key4 - value4 and key5 - value5
abc = 12ty3 AND jfk = 345 followed by some other text
klm {678er} some text plr - 567 and deg - 345
xyz-232  abc = 126y3 AND jfk = 567 followed by some other text dre {567x}
klm {rtyyr} some text plr - 444 and deg - 555 some text 345 = uut
EOF

This seems to work and produces the following output:

key1 = value1, key2 = value2, key3 = value3, key5 = value5
abc = 12ty3, jfk = 345, klm = 678er, deg = 345
abc = 126y3, jfk = 567, klm = rtyyr, deg = 555
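
Note that the output above includes the second key of the first line (key2, jfk) and omits the first "key - value" pair of the second line (key4, plr), whereas the question asks for keys 1, 3, 4 and 5. A sed substitution only offers the back-references \1 to \9, and the expression above already uses all nine. As a rough alternative, here is a minimal awk sketch that prints the four requested pairs. It is not part of the original answer; the names extract_pairs, grab, norm and REST are hypothetical, and it assumes that keys and values consist only of letters, digits and underscores, and that a matched pair of lines is never reused for the next match.

```shell
# Hypothetical helper: reads lines on stdin and prints one result line for
# every pair of consecutive lines that together contain all four items.
extract_pairs() {
  awk '
    # Return the first substring of s matching re ("" if none) and leave
    # the text after the match in the global REST.
    function grab(s, re) {
      if (!match(s, re)) return ""
      REST = substr(s, RSTART + RLENGTH)
      return substr(s, RSTART, RLENGTH)
    }
    # Rewrite "k=v", "k {v}" or "k - v" as "k = v".
    function norm(t) {
      gsub(/[{}]/, "", t)          # drop the braces, if any
      sub(/ *[-=] */, " ", t)      # collapse the separator to one space
      sub(/ +/, " = ", t)          # and turn that gap into " = "
      return t
    }
    BEGIN { W = "[A-Za-z0-9_]+" }  # assumed shape of keys and values
    have {
      a = grab(prev, W " *= *" W)          # first key = value on line 1
      b = grab($0, W " +[{][^}]+[}]")      # key {value} on line 2
      c = (b != "") ? grab(REST, W " +- +" W) : ""   # first key - value
      d = (c != "") ? grab(REST, W " +- +" W) : ""   # second key - value
      if (a != "" && d != "") {
        print norm(a) ", " norm(b) ", " norm(c) ", " norm(d)
        have = 0                   # do not reuse this line as a first line
        next
      }
    }
    { prev = $0; have = 1 }
  '
}

extract_pairs <<'EOF'
key1=value1 AND key2=value2 followed by some other text
key3 {value3} some text key4 - value4 and key5 - value5
abc = 12ty3 AND jfk = 345 followed by some other text
klm {678er} some text plr - 567 and deg - 345
xyz-232  abc = 126y3 AND jfk = 567 followed by some other text dre {567x}
klm {rtyyr} some text plr - 444 and deg - 555 some text 345 = uut
EOF
# prints:
# key1 = value1, key3 = value3, key4 = value4, key5 = value5
# abc = 12ty3, klm = 678er, plr = 567, deg = 345
# abc = 126y3, klm = rtyyr, plr = 444, deg = 555
```

Because every check looks at the previous line and the current line together, two matching lines separated by any other line produce no output.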

Answer 1 (score: 0)

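I would use grep -o to extract each occurrence onto its own line, sed to reformat what needs reformatting, and paste to merge everything back into a single line: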

grep -Eo '\w+\s*=\s*\w+|\w+\s+\{[^}]+\}|\w+\s+-\s+\w+' | sed -E 's/-/=/;s/\{([^}]+)}/= \1/' | paste -sd ','

You can try it here.
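
Since the question mentions that the wanted keys are known in advance, the same grep/sed/paste idea can be narrowed to those keys. The following is only a sketch under stated assumptions: pick_keys is a hypothetical helper name, its first argument is assumed to be an ERE alternation of the wanted keys, POSIX character classes replace \w and \s for portability, and grep -o is assumed to be available (it is in GNU and BSD grep). Like the pipeline above, it does not check that the matches come from two consecutive lines.

```shell
# Hypothetical helper: reads text on stdin and prints the "key = value"
# pairs whose key matches the alternation given as $1, joined by commas.
pick_keys() {
  w='[[:alnum:]_]+'    # assumed shape of keys and values
  s='[[:space:]]'
  # 1. one occurrence per line: "k = v", "k {v}" or "k - v"
  grep -Eo "$w$s*=$s*$w|$w$s+\\{[^}]+\\}|$w$s+-$s+$w" |
    # 2. normalise every form to "k = v" (braces first, then - or =)
    sed -E 's/[[:space:]]*\{([^}]+)\}/ = \1/; s/[[:space:]]*[-=][[:space:]]*/ = /' |
    # 3. keep only the wanted keys
    grep -E "^($1) = " |
    # 4. join the remaining lines with commas
    paste -sd ',' -
}

printf '%s\n' \
  'abc = 12ty3 AND jfk = 345 followed by some other text' \
  'klm {678er} some text plr - 567 and deg - 345' |
  pick_keys 'abc|klm|plr|deg'
# prints: abc = 12ty3,klm = 678er,plr = 567,deg = 345
```

The explicit - operand tells paste to read standard input, which some non-GNU implementations require.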