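I want to check whether the number in column 1 equals column 2. Column 1 should start with "ABC" and end with "DEF", but it sometimes ends with "DEFZ#" instead, where # is a digit. The digits between "ABC" and "DEF" (ABC######DEF) should match the second column. Can someone please help me?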
My input:
ABC12345DEF|12345|23132331331|
ABC12345DEFZ1|12345|23132331331|
ABC12345DEFZ2|12345|23132331331|
ABC95678DEF|45678|23132331331|
ABC87887DEF|86187|23132331331|
ABC89043DEF|89043|23132331331|
ABC89043DEFZ1|89043|23132331331|
ABC89043DEFZ2|89043|23132331331|
ABC89043DEFZ3|89043|23132331331|
I tried to do this myself, but my attempt did not work. The output should be:
ABC12345DEF|12345|23132331331|
ABC12345DEFZ1|12345|23132331331|
ABC12345DEFZ2|12345|23132331331|
ABC89043DEFZ1|89043|23132331331|
ABC89043DEFZ2|89043|23132331331|
ABC89043DEFZ3|89043|23132331331|
Can someone help me? Thanks in advance.
Answer 0 (score: 0)
awk -v FS="|" '{tmpvar=$1;gsub(/^ABC|DEF(Z[0-9]+)?$/,"",tmpvar)}tmpvar == $2' infile
Input
akshay@db-3325:/tmp$ cat infile
ABC12345DEF|12345|23132331331|
ABC12345DEFZ1|12345|23132331331|
ABC12345DEFZ2|12345|23132331331|
ABC95678DEF|45678|23132331331|
ABC87887DEF|86187|23132331331|
ABC89043DEF|89043|23132331331|
ABC89043DEFZ1|89043|23132331331|
ABC89043DEFZ2|89043|23132331331|
ABC89043DEFZ3|89043|23132331331|
Output
akshay@db-3325:/tmp$ awk -v FS="|" '{tmpvar = $1; gsub(/^ABC|DEF(Z[0-9]+)?$/,"",tmpvar)} tmpvar == $2' infile
ABC12345DEF|12345|23132331331|
ABC12345DEFZ1|12345|23132331331|
ABC12345DEFZ2|12345|23132331331|
ABC89043DEF|89043|23132331331|
ABC89043DEFZ1|89043|23132331331|
ABC89043DEFZ2|89043|23132331331|
ABC89043DEFZ3|89043|23132331331|
Explanation
awk -v FS="|" '{    # run awk with | as the field separator
    tmpvar = $1;    # copy the first field into tmpvar
    # delete a leading ABC and a trailing DEF,
    # where DEF may be followed by Z and one or more digits,
    # so that tmpvar keeps only the digits between ABC and DEF
    gsub(/^ABC|DEF(Z[0-9]+)?$/,"",tmpvar)
}
# a pattern with no action prints the current record when it is true,
# so the line is printed only if tmpvar equals the second field
tmpvar == $2
' infile
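The command above strips the prefix and suffix but does not check that column 1 actually has the ABC...DEF shape, so a row such as `12345|12345|...` would also be printed. A stricter variant could look roughly like the sketch below. The function name `filter_matching` is hypothetical and only used for illustration, and the sketch assumes the part between ABC and DEF is always digits.

```shell
# Hypothetical helper (not part of the original answer): print only rows whose
# first column looks like ABC<digits>DEF or ABC<digits>DEFZ<digits> and whose
# digits equal the second column.
filter_matching() {
    awk -F'|' '
        # accept only ABC + digits + DEF, optionally followed by Z + digits
        $1 ~ /^ABC[0-9]+DEF(Z[0-9]+)?$/ {
            num = $1
            sub(/^ABC/, "", num)             # drop the prefix
            sub(/DEF(Z[0-9]+)?$/, "", num)   # drop the suffix
            if (num == $2) print             # keep the row when the digits match
        }
    ' "$@"
}
```

It would be called as `filter_matching infile`. Because `num` is a string after `sub`, the comparison is done as text, so `ABC00123DEF|123|` would not count as a match.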
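/^ABC|DEF(Z[0-9]+)?$/
First alternative: ^ABC
^ asserts the position at the start of the string
ABC matches the characters ABC literally (case sensitive)
Second alternative: DEF(Z[0-9]+)?$
DEF matches the characters DEF literally (case sensitive)
First capturing group: (Z[0-9]+)?
? quantifier: matches between zero and one time, as many times as possible, giving back as needed (greedy)
Z matches the character Z literally (case sensitive)
[0-9] matches a single character in the range 0 to 9
+ quantifier: matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
$ asserts the position at the end of the string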
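To see what the pattern removes, it can help to run `gsub` on a few sample values and print the result. This is only a small sketch: `strip_affixes` is a made-up helper name, and the sample values are taken from the question's data.

```shell
# Hypothetical helper for illustration: show each value before and after
# the substitution, plus how many substitutions gsub performed.
strip_affixes() {
    printf '%s\n' "$@" | awk '{
        t = $0
        n = gsub(/^ABC|DEF(Z[0-9]+)?$/, "", t)   # n = number of substitutions made
        print $0 " -> " t " (" n " substitutions)"
    }'
}

strip_affixes ABC12345DEF ABC12345DEFZ1 ABC89043DEFZ3
```

Each sample should come out with two substitutions, one for the leading ABC and one for the trailing DEF with its optional Z suffix, for example `ABC12345DEFZ1 -> 12345 (2 substitutions)`.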