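I want to compare the file names in Today.txt against Main.txt. Where a name matches, print all 6 columns of the matching Main.txt line to a new file, matched.txt.
For names in Today.txt that have no match in Main.txt, list the file name and time from Today.txt in another new file, e.g. unmatched.txt.
Note: a plus sign (+) means the file came from the inprogress directory, so a file name sometimes has a "+" attached (in Main.txt it appears as a prefix).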
Main.txt
date filename timestamp space count status
Nov 4 +CHCK01_20161104.txt 06:39 2.15M 17153 on_time
Nov 4 TRIPS11_20161104.txt 09:03 0.00M 24 On_Time
Nov 4 AR02_20161104.txt 09:31 0.00M 7 On_Time
Nov 4 AR01_20161104.txt 09:31 0.04M 433 On_Time
Today.txt
filename time
CHCK01_20161104.txt 06:03
CHCK05_20161104.txt 11:10
CHCK09_20161104.txt 21:46
AR01_20161104.txt 09:36
AR02_20161104.txt 09:36
ifs01_20161104.txt 21:16
TRIPS11_20161104.txt 09:16
Desired output: matched.txt
Nov 4 +CHCK01_20161104.txt 06:39 2.15M 17153 on_time
Nov 4 TRIPS11_20161104.txt 09:03 0.00M 24 On_Time
Nov 4 AR02_20161104.txt 09:31 0.00M 7 On_Time
Nov 4 AR01_20161104.txt 09:31 0.04M 433 On_Time
unmatched.txt
CHCK05_20161104.txt 11:10
CHCK09_20161104.txt 21:46
ifs01_20161104.txt 21:16
The command below gives me the correct output, except when a file name has the plus sign (+) attached.
awk 'FNR==1{next}
NR==FNR{a[$1]=$2; next}
$3 in a{print; delete a[$3]}
END{for(k in a) print k,a[k] > "unmatched"}' today main > matched
Many thanks in advance!
Answer 0 (score: 2)
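The problem is the test $3 in a when it runs on the lines of the file main: the keys stored from today have no plus sign, so +CHCK01_20161104.txt never matches. To strip the + before the lookup, use gensub from GNU awk on $3 inside the in test:
$ awk 'FNR==1{next}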
NR==FNR{a[$1]=$2; next}
gensub(/+/,"",1,$3) in a{print; delete a[gensub(/+/,"",1,$3)]}
END{for(k in a) print k,a[k] > "unmatched"}' today main
Nov 4 +CHCK01_20161104.txt 06:39 2.15M 17153 on_time
Nov 4 TRIPS11_20161104.txt 09:03 0.00M 24 On_Time
Nov 4 AR02_20161104.txt 09:31 0.00M 7 On_Time
Nov 4 AR01_20161104.txt 09:31 0.04M 433 On_Time
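gensub is preferable to gsub here because it returns the substituted string instead of modifying $3 in place, so the line is still printed with its + intact. Used this way, the command produces the 4 required lines of output.
From the gawk man page:
gensub(regexp, replacement, how [, target])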
gensub is a general substitution function. Like sub and gsub, it
searches the target string target for matches of the regular expression regexp. Unlike sub and gsub,
the modified string is returned as the result of the function, and the original target string
is not changed. If how is a string beginning with `g' or `G', then it replaces all matches
of regexp with replacement.
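The "returned, not modified" behaviour described in the excerpt can be checked in isolation. This is a minimal sketch, assuming gawk is installed (gensub is a GNU awk extension); the variable names key and gensub_demo are illustrative only, and the anchored pattern /^\+/ is used here instead of the answer's /+/.

```shell
# Sketch only: requires GNU awk; falls back to a message when gawk is missing.
if command -v gawk >/dev/null 2>&1; then
  gensub_demo=$(echo 'Nov 4 +CHCK01_20161104.txt 06:39' |
    gawk '{
      key = gensub(/^\+/, "", 1, $3)  # returns a copy of $3 without the leading "+"
      print key                       # the stripped copy
      print $3                        # the field itself still carries the "+"
    }')
  echo "$gensub_demo"
else
  echo "gawk not found; gensub is a GNU awk extension"
fi
```

With gawk available, the first printed line is the stripped name and the second is the untouched field.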
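So in our case, gensub(/+/,"",1,$3) replaces only the first occurrence of + in the field with an empty string (because the replacement count is set to 1); in this data that + sits at the start of the field. This avoids replacements anywhere else in the field. The unanchored /+/ works because gawk treats a + with nothing before it as a literal character; /^\+/ states the intent explicitly.
(or) a neater awk logic, thanks to Ed Morton's suggestion: use sub on a copy of $3 stored in a variable
$ awk 'FNR==1{next}
NR==FNR{a[$1]=$2; next}
{k=$3; sub(/^\+/,"",k)} k in a{print; delete a[k]}
END{for(k in a) print k,a[k] > "unmatched"}' today main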
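For anyone who wants to try the sub-on-a-copy approach end to end, the following is a minimal sketch rather than the answer's exact command. The function name compare_lists, the variables workdir, seen and key, and the three-line subset of the sample data are assumptions made for illustration. It avoids gensub, so it should not need GNU awk.

```shell
# Hypothetical helper: compare_lists TODAY MAIN MATCHED UNMATCHED
compare_lists() {
  awk -v unmatched="$4" '
    FNR == 1  { next }                   # skip the header line of each file
    NR == FNR { seen[$1] = $2; next }    # first file: file name -> time
    {
      key = $3
      sub(/^\+/, "", key)                # strip a leading "+" from the copy only
      if (key in seen) { print; delete seen[key] }   # $0 is printed unchanged
    }
    END { for (k in seen) print k, seen[k] > unmatched }
  ' "$1" "$2" > "$3"
}

# Small subset of the sample data from the question, written to a temp directory.
workdir=$(mktemp -d)
printf '%s\n' 'filename time' \
  'CHCK01_20161104.txt 06:03' \
  'CHCK05_20161104.txt 11:10' \
  'AR01_20161104.txt 09:36' > "$workdir/today"
printf '%s\n' 'date filename timestamp space count status' \
  'Nov 4 +CHCK01_20161104.txt 06:39 2.15M 17153 on_time' \
  'Nov 4 AR01_20161104.txt 09:31 0.04M 433 On_Time' > "$workdir/main"

compare_lists "$workdir/today" "$workdir/main" "$workdir/matched" "$workdir/unmatched"
cat "$workdir/matched"
sort "$workdir/unmatched"   # for (k in seen) has no guaranteed order, so sort for display
```

Passing the unmatched file name with -v keeps both output paths as arguments instead of hard-coding one of them inside the awk program.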