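I'm trying to search an Apache log file for specific entries related to a particular vulnerability scan. I need to match strings from a separate file against the URI contents in the web log. Some of the strings I'm looking for contain repeated special characters such as '?'.
For example, I need to be able to match an attack that contains only the string '????????', but I don't want to be alerted on the string '??????????????????', because each attack is tied to a specific attack ID number. So using: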
if attack_string in log_file_line:
    alert_me()
...doesn't work. So I decided to put the string into a regular expression:
if re.findall(r'\%s' % re.escape(attack_string), log_file_line):
    alert_me()
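...which doesn't work properly either, because any log file line containing the string '????????' matches, even when the line has more than 8 '?' characters.
Then I tried adding boundaries to the regular expression: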
if re.findall(r'\\B\%s\\B' % re.escape(attack_string), log_file_line):
    alert_me()
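...which stopped matching in both cases. I need to be able to assign the string I'm looking for dynamically, but I don't want to match every line that merely contains the string. How can I do this?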
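The over-matching described above can be reproduced with a minimal sketch. The two log lines here are hypothetical and invented only for illustration; they are not taken from a real Apache log.

```python
import re

attack_string = "?" * 8  # the signature tied to one attack ID

# Hypothetical log lines, invented for illustration only.
exact_line = "GET /index.php" + "?" * 8 + " HTTP/1.1"
longer_line = "GET /index.php" + "?" * 18 + " HTTP/1.1"
lines = (exact_line, longer_line)

# Substring test: true for both lines, because 8 '?' sit inside the run of 18.
substring_hits = [attack_string in line for line in lines]

# Escaped regex with no boundary condition: same result for the same reason.
pattern = re.compile(re.escape(attack_string))
regex_hits = [bool(pattern.search(line)) for line in lines]

print(substring_hits, regex_hits)
```

Both checks report a hit on the longer line, which is why some condition on the characters surrounding the run is needed.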
Answer 0 (score: 1)
How about:
(?:[^?]|^)\?{8}(?:[^?]|$)
Explanation:
(?-imsx:(?:[^?]|^)\?{8}(?:[^?]|$))
matches as follows:
NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  (?:                      group, but do not capture:
----------------------------------------------------------------------
    [^?]                     any character except: '?'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    ^                        the beginning of the string
----------------------------------------------------------------------
  )                        end of grouping
----------------------------------------------------------------------
  \?{8}                    '?' (8 times)
----------------------------------------------------------------------
  (?:                      group, but do not capture:
----------------------------------------------------------------------
    [^?]                     any character except: '?'
----------------------------------------------------------------------
   |                        OR
----------------------------------------------------------------------
    $                        before an optional \n, and the end of
                             the string
----------------------------------------------------------------------
  )                        end of grouping
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
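The pattern above hard-codes eight '?' characters, while the question asks for a string assigned at run time. One way to generalize the same idea in Python is sketched below. It swaps the non-capturing groups for negative lookbehind and lookahead, which is a substitution on my part and not part of the answer above. The helper name build_signature_pattern and the sample lines are hypothetical, and the sketch assumes "not part of a longer run" means the signature's first character must not appear immediately before it and its last character must not appear immediately after it.

```python
import re

def build_signature_pattern(attack_string):
    """Compile a pattern matching attack_string only when it is not
    extended by its own first/last character on either side.
    (Hypothetical helper, written for illustration.)"""
    if not attack_string:
        raise ValueError("attack_string must not be empty")
    body = re.escape(attack_string)
    first = re.escape(attack_string[0])
    last = re.escape(attack_string[-1])
    # (?<!X) : the preceding character is not X (also true at line start)
    # (?!X)  : the following character is not X (also true at line end)
    return re.compile("(?<!" + first + ")" + body + "(?!" + last + ")")

pattern = build_signature_pattern("?" * 8)

# Hypothetical log lines, invented for illustration only.
exact_line = "GET /index.php" + "?" * 8 + " HTTP/1.1"
longer_line = "GET /index.php" + "?" * 18 + " HTTP/1.1"

print(bool(pattern.search(exact_line)))
print(bool(pattern.search(longer_line)))
```

Because lookarounds do not consume the neighbouring characters, the match covers only the signature itself. For a plain yes/no alert that difference does not matter, but it helps if the matched text is later extracted or counted.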