Velocity - searching in a String that contains XML

Asked: 2017-01-24 09:24:13

Tags: regex xml search velocity

I am currently working on a Velocity template that is supposed to be configurable through an XML file.

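I am able to read the file into a variable. To check the configuration, I now need to find a certain string inside that variable. In some cases the search string may be a regular expression.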

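From this thread I know that I can use .matches() to search with a regex. But whatever I try (see the test code below), I only get "false" back, even when I simply search for one of the tags.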

<html>
    <body>
## this example is intended to test searching regular expressions

## let's start with a simple example:

#set( $simpleText = "This is the string where I will try to find a substring." )
#set( $searchStr = "string" )

1 $simpleText.matches($searchStr)<BR>   ## this returns false as .matches() only returns true if the parameter $searchStr (could be regular expression) matches the ENTIRE string ($simpleText)

#set( $searchStr = ".*string.*" )       ## .* at the beginning and the end of the search string means any character can be before and after the 'real' search string

2 $simpleText.matches($searchStr)<BR>   ## this returns true, so adding .* at the beginning and the end of the search string seems to work.

## let's now move on to strings containing XML (as this is the real use case)

#set( $xmlText = '<?xml version="1.0"?>
<ItemTypes>
    <ItemType>
        <Display>L1 Items</Display>
        <Fields>
            <FieldLabel>Project ID</FieldLabel>
            <FieldLabel>Name</FieldLabel>
            <FieldLabel>Description</FieldLabel>
            <FieldLabel>Assigned</FieldLabel>
        </Fields>
    </ItemType>
</ItemTypes>' )

3 $xmlText<BR>                          ## when printing a string containing XML tags those tags will not be visible in the printout (probably because they are interpreted as kind of html tags...)

#set( $escapedXmlText = $escapeTool.xml($xmlText) )  ## escapeTool will ensure that the tags will also be printed (visible)

4 $escapedXmlText<BR>                   ## this printout will also display the tags

## let's now try to find the string 'Display' in xmlText the same way as we did in the simple example at the beginning:

#set( $searchStr = '.*Display.*')

5 $xmlText.matches($searchStr)<BR>          ## returns false but WHY?
6 $escapedXmlText.matches($searchStr)<BR>   ## returns false but WHY?

    </body>
</html>

Does anyone know why printouts 5 and 6 at the end both return false?

1 answer:

Answer 0 (score: 0)

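I think I found the answer myself, although I am not completely sure (any feedback is welcome). Below is my test example, extended by a few lines (including comments on my findings):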

<html>
    <body>
## this example is intended to test searching regular expressions

## let's start with a simple example:

#set( $simpleText = "This is the string where I will try to find a substring." )
#set( $searchStr = "string" )

1 $simpleText.matches($searchStr)<BR>   ## this returns false as .matches() only returns true if the parameter $searchStr (could be regular expression) matches the ENTIRE string ($simpleText)

#set( $searchStr = ".*string.*" )     ## .* at the beginning and the end of the search string means any character can be before and after the 'real' search string

2 $simpleText.matches($searchStr)<BR>   ## this returns true, so adding .* at the beginning and the end of the search string seems to work.

## let's now move on to strings containing XML (as this is the real use case)

#set( $xmlText = '<?xml version="1.0"?>
<ItemTypes>
    <ItemType>
        <Display>L1 Items</Display>
        <Fields>
            <FieldLabel>Project ID</FieldLabel>
            <FieldLabel>Name</FieldLabel>
            <FieldLabel>Description</FieldLabel>
            <FieldLabel>Assigned</FieldLabel>
        </Fields>
    </ItemType>
</ItemTypes>' )

3 $xmlText<BR>                          ## when printing a string containing XML tags those tags will not be visible in the printout (probably because they are interpreted as kind of html tags...)

#set( $escapedXmlText = $escapeTool.xml($xmlText) )  ## escapeTool will ensure that the tags will also be printed (visible)

4 $escapedXmlText<BR>                   ## this printout will also display the tags

## let's now try to find the string 'Display' in xmlText the same way as we did in the simple example at the beginning:

#set( $searchStr = '.*Display.*')

5 $xmlText.matches($searchStr)<BR>          ## returns false because . does not match line terminators and $xmlText spans several lines (cf. https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html)
6 $escapedXmlText.matches($searchStr)<BR>   ## also returns false, for the same reason: escaping keeps the line breaks

## on https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#lt you can find the following info:
## "The regular expression . matches any character except a line terminator unless the DOTALL flag is specified."
## https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#DOTALL says:
## "Dotall mode can also be enabled via the embedded flag expression (?s)."
## and https://kodejava.org/how-do-i-write-embedded-flag-expression/ finally says that an embedded flag expression has to be placed at the beginning of the regex.
## So, let's now try (this time also including some special characters like '<', '>', '/'):

#set( $searchStr = '(?s).*<Display>L1 Items</Display>.*')

7 $xmlText.matches($searchStr)<BR>          ## FINALLY RETURNS TRUE!!
8 $escapedXmlText.matches($searchStr)<BR>   ## still returns false: in the escaped XML, special characters such as '<' are replaced by entities (e.g. &lt;), so the unescaped search string no longer occurs

    </body>
</html>
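
Since Velocity simply calls java.lang.String.matches() on the variable, the results of printouts 5 and 7 can be reproduced outside the template. The following is a minimal Java sketch, not part of the original template; the class name DotallDemo, the helper names and the shortened XML sample are made up for illustration.

```java
import java.util.regex.Pattern;

public class DotallDemo {
    // Shortened version of $xmlText; the embedded '\n' characters are what matters here.
    static final String XML =
          "<?xml version=\"1.0\"?>\n"
        + "<ItemTypes>\n"
        + "    <ItemType>\n"
        + "        <Display>L1 Items</Display>\n"
        + "    </ItemType>\n"
        + "</ItemTypes>";

    // What the template does: the regex must match the ENTIRE string.
    static boolean matchesPlain(String text, String regex) {
        return text.matches(regex);
    }

    // Same check with the DOTALL flag set through the API instead of the (?s) prefix.
    static boolean matchesDotall(String text, String regex) {
        return Pattern.compile(regex, Pattern.DOTALL).matcher(text).matches();
    }

    public static void main(String[] args) {
        // '.' stops at line terminators, so ".*" cannot span the multi-line string.
        System.out.println(matchesPlain(XML, ".*Display.*"));      // false
        // (?s) switches on DOTALL inside the regex itself.
        System.out.println(matchesPlain(XML, "(?s).*Display.*"));  // true
        // Equivalent, using the flag constant.
        System.out.println(matchesDotall(XML, ".*Display.*"));     // true
    }
}
```

Inside a Velocity template only the (?s) form is directly usable, because the regex is passed to .matches() as a plain string.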

So starting the regex with (?s) seems to do the trick!
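
Printout 8 can be explained the same way in plain Java. The sketch below rests on some assumptions: escapeXml is a hypothetical stand-in for $escapeTool.xml (it replaces only the five XML special characters), and containsRegex is an illustrative helper built on Matcher.find(), which looks for the regex anywhere in the text and therefore needs neither the surrounding .* nor (?s). It suggests that the escaped text can be searched too, provided the search string is written in its escaped form.

```java
import java.util.regex.Pattern;

public class SubstringSearchDemo {
    // Hypothetical stand-in for $escapeTool.xml(...); '&' must be replaced first.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    // find() succeeds if the regex matches ANY part of the text,
    // unlike matches(), which requires the whole text to match.
    static boolean containsRegex(String text, String regex) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String xml = "<ItemType>\n    <Display>L1 Items</Display>\n</ItemType>";
        String escaped = escapeXml(xml);

        // Raw XML: the tag is found even though the text spans several lines.
        System.out.println(containsRegex(xml, "<Display>L\\d+ Items</Display>"));      // true
        // Escaped XML no longer contains '<' or '>', so the same regex fails.
        System.out.println(containsRegex(escaped, "<Display>L\\d+ Items</Display>"));  // false
        // Writing the entities into the regex makes it match again.
        System.out.println(containsRegex(escaped,
                "&lt;Display&gt;L\\d+ Items&lt;/Display&gt;"));                         // true
        // For a literal search string, String.contains() is enough.
        System.out.println(xml.contains("<Display>"));                                  // true
    }
}
```

As $xmlText is an ordinary Java String, $xmlText.contains('Display') should also work in the template when the search string is a literal rather than a regex.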