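I have already gotten some help here, but my problem is slightly different. I am looking for places where a DocumentBuilderFactory is created but ExpandEntityReferences is not restricted. I have the following regular expression: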
(?x)
# finds DocumentBuilderFactory creation and pulls out the variable name
# of the form DocumentBuilderFactory VARNAME = DocumentBuilderFactory.newInstance
# then checks if that variable name has one of three acceptable ways to stop XXE attacks
# matches any instance where the variable is initialized, but not restricted
(?:
# This is for DocumentBuilderFactory VARNAME = DocumentBuilderFactory.newInstance with many possible alternates
DocumentBuilderFactory
[\s]+?
(\w+)
[\s]*?
=
[\s]*?
(?:.*?DocumentBuilderFactory)
[.\s]+
newInstance.*
# checks that the var name is NOT (using ?!) using one of the acceptable rejection methods
(?!\1[.\s]+
(?:setFeature\s*\(\s*"http://xml.org/sax/features/external-general-entities"\s*,\s*false\s*\)
|setFeature\s*\(\s*"http://apache.org/xml/features/disallow-doctype-decl"\s*,\s*false\s*\)
|setExpandEntityReferences\s*\(\s*false\s*\))
)
)
and a test file might look like this:
// Set the parser properties
javax.xml.parsers.DocumentBuilderFactory factory =
javax.xml.parsers.DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
factory.setValidating(false);
factory.setExpandEntityReferences(false);
factory.setIgnoringComments(true);
factory.setIgnoringElementContentWhitespace(true);
factory.setCoalescing(true);
javax.xml.parsers.DocumentBuilder builder = factory.newDocumentBuilder();
Is there a way to run this regular expression against this file and have the regex fail to match (because the file correctly sets factory.setExpandEntityReferences(false);)?
Update:
(?:
DocumentBuilderFactory
\s+
(\w+)
\s*
=
\s*
(?:.*?DocumentBuilderFactory)
\s*.\s*
newInstance.*
(?:[\s\S](?!
\1\s*.\s*
(?:setFeature\s*\(\s*"http://xml.org/sax/features/external-general-entities"\s*,\s*false\s*\)
|setFeature\s*\(\s*"http://apache.org/xml/features/disallow-doctype-decl"\s*,\s*false\s*\)
|setExpandEntityReferences\s*\(\s*false\s*\))
))*$
)
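As expected, this does not successfully find() a match. However, if I misspell factory.setExpandEntityReferences(false) as factory.setExpandEntity##References(false), I would expect the regex to find a match, but it does not. Is there a way to get this to work?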
Answer 0 (score: 3)
(?:.(?!xyz))*$
It basically means, "from this point on, every character must not be followed by xyz." Since . does not match newlines, you may want to generalize it to:
(?:[\s\S](?!xyz))*$
   ^^^^^^
(It is the union of two complementary sets, \s and \S, so it really does match every character.)
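To see the difference between the two forms, here is a minimal Java sketch. The class name NoXyzDemo, the literal prefix start, and the sample strings are assumptions invented for illustration; only the two loop expressions come from the text above.

```java
import java.util.regex.Pattern;

final class NoXyzDemo {
    // "." stops at line breaks, so this loop can only walk to the end of one line.
    static final Pattern DOT = Pattern.compile("start(?:.(?!xyz))*$");
    // [\s\S] matches every character, line breaks included.
    static final Pattern ANY = Pattern.compile("start(?:[\\s\\S](?!xyz))*$");

    /** Returns true when the pattern matches somewhere in the text. */
    static boolean found(Pattern p, String text) {
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(found(ANY, "start\nabc\n"));     // true: no xyz after "start"
        System.out.println(found(ANY, "start\nabc xyz\n")); // false: xyz appears later
        System.out.println(found(DOT, "start\nabc\n"));     // false: "." cannot cross the newline
    }
}
```

One caveat worth checking in your own tests: the lookahead runs after each consumed character, so an xyz that begins exactly where the loop starts is not rejected (found(ANY, "startxyz") is true in this sketch). Putting the lookahead first, as in (?:(?!xyz)[\s\S])*$, is a common way to close that gap.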
To apply this to your case, simply replace xyz with whatever you do not want to appear anywhere:
# checks that the var name is NOT (using ?!) using one of the acceptable rejection methods
(?:[\s\S](?!
\1[.\s]+
(?:setFeature\s*\(\s*"http://xml.org/sax/features/external-general-entities"\s*,\s*false\s*\)
|setFeature\s*\(\s*"http://apache.org/xml/features/disallow-doctype-decl"\s*,\s*false\s*\)
|setExpandEntityReferences\s*\(\s*false\s*\))
))*$
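As a rough check of how the combined expression behaves, the sketch below joins the declaration part of the question's regex with the loop above and runs it with java.util.regex. The class name XxeFactoryCheck and the helper sample are hypothetical, and the pattern is written on one line rather than in (?x) mode. The three accepted calls are copied verbatim from the question, including false for disallow-doctype-decl; the usual hardening for that feature is to set it to true, so that alternative may need adjusting.

```java
import java.util.regex.Pattern;

final class XxeFactoryCheck {
    // Accepted hardening calls, copied verbatim from the question's alternation.
    static final String SAFE_CALLS =
        "(?:setFeature\\s*\\(\\s*\"http://xml.org/sax/features/external-general-entities\"\\s*,\\s*false\\s*\\)"
        + "|setFeature\\s*\\(\\s*\"http://apache.org/xml/features/disallow-doctype-decl\"\\s*,\\s*false\\s*\\)"
        + "|setExpandEntityReferences\\s*\\(\\s*false\\s*\\))";

    // Declaration part from the question, then "no safe call on \1 anywhere after this".
    static final Pattern UNRESTRICTED = Pattern.compile(
        "DocumentBuilderFactory\\s+(\\w+)\\s*=\\s*(?:.*?DocumentBuilderFactory)[.\\s]+newInstance.*"
        + "(?:[\\s\\S](?!\\1[.\\s]+" + SAFE_CALLS + "))*$");

    /** Returns true when a factory is created and none of the safe calls follows. */
    static boolean isUnrestricted(String javaSource) {
        return UNRESTRICTED.matcher(javaSource).find();
    }

    // Hypothetical helper: builds a small test file around one configuration line.
    static String sample(String configLine) {
        return String.join("\n",
            "javax.xml.parsers.DocumentBuilderFactory factory =",
            "javax.xml.parsers.DocumentBuilderFactory.newInstance();",
            "factory.setNamespaceAware(true);",
            configLine,
            "javax.xml.parsers.DocumentBuilder builder = factory.newDocumentBuilder();");
    }

    public static void main(String[] args) {
        // Correctly restricted factory: the regex should not match.
        System.out.println(isUnrestricted(sample("factory.setExpandEntityReferences(false);")));   // false
        // Misspelled call: nothing restricts the factory, so the regex should match.
        System.out.println(isUnrestricted(sample("factory.setExpandEntity##References(false);"))); // true
    }
}
```

Because the pattern ends in $ without the MULTILINE flag, the loop has to reach the end of the whole input, which is what makes a single safe call anywhere after the declaration block the match.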
Of course, when checking for factory you do not want to match old_factory! Use a word boundary to make sure you match the whole word.
In your case, just add \b before \1:
\b\1
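The effect of the word boundary can be seen in a small Java sketch. The class name WordBoundaryDemo, the simplified declaration pattern, and the sample source are assumptions made for illustration; only the \b\1 idea is taken from the answer.

```java
import java.util.regex.Pattern;

final class WordBoundaryDemo {
    // Simplified declaration: captures the variable name in group 1.
    static final String DECL =
        "DocumentBuilderFactory\\s+(\\w+)\\s*=\\s*DocumentBuilderFactory\\.newInstance\\(\\);";
    static final String SAFE = "[.\\s]+setExpandEntityReferences\\s*\\(\\s*false\\s*\\)";

    static final Pattern WITHOUT_BOUNDARY =
        Pattern.compile(DECL + "(?:[\\s\\S](?!\\1" + SAFE + "))*$");
    static final Pattern WITH_BOUNDARY =
        Pattern.compile(DECL + "(?:[\\s\\S](?!\\b\\1" + SAFE + "))*$");

    /** Returns true when the pattern reports the factory as unrestricted. */
    static boolean flagged(Pattern p, String source) {
        return p.matcher(source).find();
    }

    public static void main(String[] args) {
        // Only old_factory is restricted here; factory itself is not.
        String source =
            "DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();\n"
            + "old_factory.setExpandEntityReferences(false);\n";
        System.out.println(flagged(WITHOUT_BOUNDARY, source)); // false: old_factory is mistaken for factory
        System.out.println(flagged(WITH_BOUNDARY, source));    // true: \b rejects the match inside old_factory
    }
}
```

The underscore counts as a word character, so there is no boundary between old_ and factory, and the lookahead no longer fires on the wrong variable.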
As noted in the comments, \s already includes \r and \n, so you can rewrite [\s\r\n] as \s (without the brackets).
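A quick way to convince yourself that the two spellings are equivalent is to run both over a few inputs. This is only a sketch; the class name WhitespaceClassDemo and the sample strings are made up for illustration.

```java
import java.util.regex.Pattern;

final class WhitespaceClassDemo {
    // The longer spelling, with \r and \n listed explicitly.
    static boolean verbose(String s) {
        return Pattern.matches("[\\s\\r\\n]+", s);
    }

    // The shorter spelling: \s alone already covers \r and \n.
    static boolean plain(String s) {
        return Pattern.matches("\\s+", s);
    }

    public static void main(String[] args) {
        String[] samples = { "\r\n", " \t\n", "\r", "x", "" };
        for (String s : samples) {
            // Both columns should agree for every sample.
            System.out.println(verbose(s) + " " + plain(s));
        }
    }
}
```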
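Also, you would want to change instances such as
newInstance[.]*
to
newInstance.*
The wildcard does not behave inside a character class the way \s or \w do: inside a character class, . simply means a literal dot.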
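The difference between a dot inside and outside a character class is easy to check in Java. The sketch below is illustrative only; the class name DotInClassDemo and the sample inputs are assumptions.

```java
import java.util.regex.Pattern;

final class DotInClassDemo {
    /** Returns true when the whole input matches the regex. */
    static boolean whole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Outside a class, "." is a wildcard, so ".*" covers the rest of the line.
        System.out.println(whole("newInstance.*", "newInstance();"));   // true
        // Inside a class, "." is a literal dot, so "[.]*" cannot match "();".
        System.out.println(whole("newInstance[.]*", "newInstance();")); // false
        // "[.]*" only accepts actual dots.
        System.out.println(whole("newInstance[.]*", "newInstance...")); // true
        // By contrast, \s keeps its meaning inside a class.
        System.out.println(whole("a[\\s]c", "a c"));                    // true
    }
}
```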