I have a file (maillog) like this:
Feb 22 23:53:39 info postfix[102]: connect from APVLDPDF01[...
Feb 22 23:53:39 info postfix[101]: BA1D7805A1: client=APVLDPDF01[...
Feb 22 23:53:39 info postfix[103]: BA1D7805A1: message-id
Feb 22 23:53:39 info opendkim[139]: BA1D7805A1: DKIM-Signature field added
Feb 22 23:53:39 info postfix[763]: ED6F3805B9: to=<CORREO1@GM.COM>, relay...
Feb 22 23:53:39 info postfix[348]: ED6F3805B9: removed
Feb 22 23:53:39 info postfix[348]: BA1D7805A1: from=<correo@prueba.com>,...
Feb 22 23:53:39 info postfix[102]: disconnect from APVLDPDF01...
Feb 22 23:53:39 info postfix[842]: 59AE0805B4: to=<CO2@GM.COM>,status=sent
Feb 22 23:53:39 info postfix[348]: 59AE0805B4: removed
Feb 22 23:53:41 info postfix[918]: BA1D7805A1: to=<CO3@GM.COM>, status=sent
Feb 22 23:53:41 info postfix[348]: BA1D7805A1: removed
And a second file (mailids) like this:
6DBDD8039F:
3B15BC803B:
BA1D7805A1:
2BD19803B4:
I want to get an output file containing the following:
Feb 22 23:53:41 info postfix[918]: BA1D7805A1: to=<CO3@GM.COM>, status=sent
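That is, only the lines whose ID exists in the second file; in this example, only ID = BA1D7805A1: is in that file. There is a second condition: the line must have the form "ID to=<", meaning that only lines which contain "to=<" and whose ID appears in the second file should be output.
I have found various solutions, but I have serious doubts about their performance. The maillog file is 2GB, about 10 million lines, and the mailids file has about 32000 lines.
The process takes far too long, and I have never seen it finish. I have tried awk and grep commands, but I cannot find the best approach.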
Answer 0 (score: 2)
grep -F -f mailids maillog | grep 'to=<'
From the grep man page:
-F, --fixed-strings
Interpret PATTERN as a list of fixed strings, separated by
newlines, any of which is to be matched. (-F is specified by
POSIX.)
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. The empty file
contains zero patterns, and therefore matches nothing. (-f is
specified by POSIX.)
Answer 1 (score: 1)
It is better to add the -w option:
-w, --word-regexp
Select only those lines containing matches that form whole
words. The test is that the matching substring must either be
at the beginning of the line, or preceded by a non-word
constituent character. Similarly, it must be either at the end
of the line or followed by a non-word constituent character.
Word-constituent characters are letters, digits, and the
underscore.
This is the command I usually use:
grep -Fwf mailids maillog | grep 'to=<'
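To illustrate what -w changes, here is a small sketch with a made-up log line whose queue ID is 0BA1D7805A1:, that is, the wanted ID with one extra leading character. This line and the variable names are hypothetical and only serve to show the difference between substring and whole-word matching.

```shell
#!/bin/sh
# Hypothetical log line: the ID is "0BA1D7805A1:", not "BA1D7805A1:".
log='Feb 22 23:53:41 info postfix[918]: 0BA1D7805A1: to=<CO3@GM.COM>, status=sent'

# Without -w the pattern matches as a substring of the longer ID.
without_w=$(printf '%s\n' "$log" | grep -c -F 'BA1D7805A1:' || true)

# With -w the match is rejected because it is preceded by the word character "0".
with_w=$(printf '%s\n' "$log" | grep -c -F -w 'BA1D7805A1:' || true)

printf 'matching lines without -w: %s, with -w: %s\n' "$without_w" "$with_w"
```

The `|| true` is only there because grep exits with a non-zero status when it finds no matching line.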
If the ID is always in column 6, try this one-line awk command:
awk 'NR==FNR{a[$1];next} /to=</ && ($6 in a)' mailids maillog
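The awk approach reads mailids into an in-memory array first and then makes a single pass over maillog with one lookup per line. Spelled out with comments, it might look like the sketch below, run on sample lines from the question. The temporary files and the names ids and result are my own; that the ID is always field 6 holds for the sample lines but is an assumption for the full log.

```shell
#!/bin/sh
# Build small sample files in a temporary directory (lines copied from the question).
tmp=$(mktemp -d)
cat > "$tmp/maillog" <<'EOF'
Feb 22 23:53:39 info postfix[101]: BA1D7805A1: client=APVLDPDF01[...
Feb 22 23:53:39 info postfix[763]: ED6F3805B9: to=<CORREO1@GM.COM>, relay...
Feb 22 23:53:39 info postfix[842]: 59AE0805B4: to=<CO2@GM.COM>,status=sent
Feb 22 23:53:41 info postfix[918]: BA1D7805A1: to=<CO3@GM.COM>, status=sent
Feb 22 23:53:41 info postfix[348]: BA1D7805A1: removed
EOF
printf '%s\n' '6DBDD8039F:' '3B15BC803B:' 'BA1D7805A1:' '2BD19803B4:' > "$tmp/mailids"

result=$(awk '
    # First file (mailids): NR equals FNR only while reading the first file.
    # Store each ID (with its trailing colon) as an array key, then skip to the next line.
    NR == FNR { ids[$1]; next }

    # Second file (maillog): print lines that contain "to=<"
    # and whose 6th whitespace-separated field is a stored ID.
    /to=</ && ($6 in ids)
' "$tmp/mailids" "$tmp/maillog")
printf '%s\n' "$result"

rm -rf "$tmp"
```

Because the comparison is an exact match on field 6, an ID that merely appears elsewhere in a line does not count as a match.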