Question

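I have a few hundred log files, most from successful jobs and some from unsuccessful ones. Every unsuccessful job's log contains the word "untranslatable", so I can easily find all of those files with this command: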

grep untranslatable *

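Now that I have found all these files, I am trying to work out what they have in common, while excluding every line that also appears in the successful log files.

I have tried these: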

diff --changed-group-format='%<' --unchanged-group-format='' 20160120142000_xxx_xxx_xxx_xxx_fexp.log 20151214153516_yy_yyy_yyy_yyy_yyy_yyy_hist.dat.log | fgrep -x -f 20160120142000_xxx_xxxx_xxx_xxxx_xxx.log 20150904115502_zzz_zzzz_zzzzz_zzzz_fexp.log | grep untranslatable

diff --changed-group-format='%<' --unchanged-group-format='' 20160120142000_xxx_xxxx_xxx_xxx_fexp.log 20151214153516_cc_ccc_ccc_cccc_cccc_cccc_cccc.dat.log |grep untranslatable


fgrep -x <(diff --changed-group-format='%<' --unchanged-group-format='' 20160120142000_EMD_APPN_FEE_DETL_fexp.log 20151214153516_TD_EXT_LPS_PROC_MGMT_FORM_hist.dat.log) <(diff --changed-group-format='%<' --unchanged-group-format='' 20150904115502_smr_sale_price_type_fexp.log 20151214153516_TD_EXT_LPS_PROC_MGMT_FORM_hist.dat.log)

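If what I am asking for is possible, what is the best route? I don't know regular expressions, but I could read up on them if that would help.

Example:

Every file, successful or unsuccessful, contains this block of text.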

563      ========================================================================
564      =                                                                      =
565      =          Logoff/Disconnect                                           =
566      =                                                                      =
567      ========================================================================
568 **** 14:20:55 UTY6215 The restart log table was not dropped by this task.
569 **** 14:20:57 UTY6212 A successful disconnect was made from the RDBMS.
570 **** 14:20:57 UTY2410 Total processor time used = '0.11 Seconds'
571      .       Start : 14:20:23 - WED JAN 20, 2016
572      .       End   : 14:20:57 - WED JAN 20, 2016
573      .       Highest return code encountered = '12'i.

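I don't want to see this, because together with other blocks of this kind it makes the problematic lines hard to find.

Every unsuccessful file, however, contains this: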

14:20:54 UTY8713 RDBMS failure, 6706: The string contains an untranslatable
560      character.

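However, that alone is not enough to find the problem. These logs are about 600 lines each. I need to find out where in the log I am trying to read this untranslatable character, so that I can change my query accordingly. With the noise filtered out, the logs would be much easier to read. (The file I am reading from has millions of lines, so I don't want to look there.)

I realize I may be asking for a magic trick here.

I'd rather not reveal too much about these logs, so a method is enough. I can figure out the rest.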

Thanks,

Maz

Answer 1

You can use this:

find . -name "*.log" -type f -exec grep -n -l "untranslatable" {} \;

This will list all files that contain the word "untranslatable".

Regards, Claudio
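As a rough sketch of how this could be taken one step further: grep can list the failing files, list the files that do not match, and show where in a log the error appears. The file names `job_a.log` and `job_b.log` and their contents are made-up stand-ins, and the `-L` and `-B` options assume GNU or BSD grep (they are not in POSIX).

```shell
#!/bin/sh
# Self-contained demo: two tiny stand-in logs in a scratch directory (hypothetical names).
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' 'step 1 ok' \
  'RDBMS failure, 6706: The string contains an untranslatable' \
  'character.' > job_a.log
printf '%s\n' 'step 1 ok' 'step 2 ok' > job_b.log

# -l lists files that contain the word; -L lists files that do not.
grep -l 'untranslatable' ./*.log > failed.txt
grep -L 'untranslatable' ./*.log > succeeded.txt

# -n prints line numbers; -B 1 also prints one line of context before each match.
grep -n -B 1 'untranslatable' ./job_a.log
```

Having both lists on disk makes it easier to pick one successful log as the noise reference for a failed one.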

Answer 2

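If you want to filter lines out of an unsuccessful log using the lines that appear in a successful log, you need to create a file that holds the filter patterns.
Start with a large successful log:

cp log.ok filter.txt

As copied, filter.txt still contains line numbers and message IDs, which will not match when filtering. So edit your filter.txt until only cleaned-up lines that can be matched remain. Translate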

563      ========================================================================
564      =                                                                      =
565      =          Logoff/Disconnect                                           =
566      =                                                                      =
567      ========================================================================
568 **** 14:20:55 UTY6215 The restart log table was not dropped by this task.
569 **** 14:20:57 UTY6212 A successful disconnect was made from the RDBMS.
570 **** 14:20:57 UTY2410 Total processor time used = '0.11 Seconds'
571      .       Start : 14:20:23 - WED JAN 20, 2016
572      .       End   : 14:20:57 - WED JAN 20, 2016
573      .       Highest return code encountered = '12'i.

to

========================================================================
=                                                                      =
=          Logoff/Disconnect                                           =
The restart log table was not dropped by this task.
A successful disconnect was made from the RDBMS.
Total processor time used =
.       Start : 
.       End   :
.       Highest return code encountered =
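The manual cleanup above could perhaps be partly automated with sed. This is only a sketch: it assumes every noisy line follows the layout shown in the sample (an optional line number, optional `****`, a `HH:MM:SS` timestamp, and a `UTY` message ID), and the three substitutions are guesses that would need adjusting for real logs. The sample `log.ok` here is a shortened copy of the excerpt from the question.

```shell
#!/bin/sh
# Self-contained demo in a scratch directory; log.ok stands in for a real successful log.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
cat > log.ok <<'EOF'
565      =          Logoff/Disconnect                                           =
568 **** 14:20:55 UTY6215 The restart log table was not dropped by this task.
570 **** 14:20:57 UTY2410 Total processor time used = '0.11 Seconds'
571      .       Start : 14:20:23 - WED JAN 20, 2016
573      .       Highest return code encountered = '12'i.
EOF

# 1st expression: strip the leading line number and the optional "****" marker.
# 2nd expression: strip the timestamp and the UTY message ID.
# 3rd expression: cut a numeric value (optionally quoted) that follows "=" or ":".
# Then drop blank lines and duplicates so each pattern appears once.
sed -E \
  -e 's/^[0-9]+ +(\*{4} +)?//' \
  -e 's/^[0-9]{2}:[0-9]{2}:[0-9]{2} +UTY[0-9]+ +//' \
  -e 's/(=|:) +[^[:alnum:] ]?[0-9].*$/\1/' \
  log.ok | grep -v '^[[:space:]]*$' | sort -u > filter.txt

cat filter.txt
```

The result should still be reviewed by hand, since a pattern that is too short would also hide lines you care about.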

Now start testing with:

grep -vf filter.txt log.nok
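Two details may matter when running this filter, shown here as a hedged sketch on sample data taken from the question. First, without `-F` grep treats each pattern as a regular expression, so characters such as `.` act as wildcards; `-F` makes them literal. Second, an empty line in the pattern file matches every line, so `grep -v` would then print nothing, and a whitespace-only pattern would hide any line containing that run of spaces. The file name `filter.clean` is an invented helper name.

```shell
#!/bin/sh
# Self-contained demo in a scratch directory, using lines from the question as sample data.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
cat > filter.txt <<'EOF'
The restart log table was not dropped by this task.

A successful disconnect was made from the RDBMS.
Total processor time used =
EOF
cat > log.nok <<'EOF'
14:20:54 UTY8713 RDBMS failure, 6706: The string contains an untranslatable
560      character.
568 **** 14:20:55 UTY6215 The restart log table was not dropped by this task.
569 **** 14:20:57 UTY6212 A successful disconnect was made from the RDBMS.
570 **** 14:20:57 UTY2410 Total processor time used = '0.11 Seconds'
EOF

# Remove empty and whitespace-only patterns, which would otherwise match too much.
grep -v '^[[:space:]]*$' filter.txt > filter.clean

# -v inverts the match, -F treats patterns as fixed strings, -f reads them from a file.
grep -v -F -f filter.clean log.nok > remaining.txt
cat remaining.txt
```

On this sample only the two lines of the error message are left, which is the kind of reduced view the question asks for.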

Finding similar errors in logs using diff and fgrep
