Question

I have a complete HTML file read into a Java string (I also have the file itself). The text looks like this:

Sample input
    Some text........... <s:message code="some code" arguments="${arg1,arg2}" />..
    some text  ........
    some text  ....... <s:message code="some code" 
     />...........

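These are the steps I need to perform:

1. I have the codes ready in a list. I want to iterate over that list and, when a code matches, capture the enclosing message tag around that code.
2. Then do some logic based on the code value.
3. Replace the message tag with a new value.

Basically, I need to replace every <s:message code....> tag with text that depends on its code. For example, if the code is code1, replace the tag with test1.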

Sample output
    Some text........... new text..
    some text  ........
    some text  ....... new text...........

I can't work out how to do step 1, i.e. how do I capture the enclosing message tag?

Answer 1

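This is the regex you need to match a tag by its code.
If you want the replacement to be smart, you need a callback that decides based on the code, which is in capture group 2.

For replacement purposes, the whole match is the tag.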

Raw regex:
<s:message(?=\s)(?=(?:[^>"']|"[^"]*"|'[^']*')*?\scode\s*=\s*(?:(['"])([\S\s]*?)\1))\s+(?:"[\S\s]*?"|'[\S\s]*?'|[^>]*?)+/>

The same regex as a Java string literal:
"<s:message(?=\\s)(?=(?:[^>\"']|\"[^\"]*\"|'[^']*')*?\\scode\\s*=\\s*(?:(['\"])([\\S\\s]*?)\\1))\\s+(?:\"[\\S\\s]*?\"|'[\\S\\s]*?'|[^>]*?)+/>"

Tested: https://regex101.com/r/LgweAW/1
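A minimal sketch of the callback idea in Java, assuming the replacements live in a `Map` keyed by code. The class name `MessageTagReplacer`, the method `replaceTags`, and the map contents are made up for illustration; only the regex comes from the answer. Tags whose code is not in the map are left untouched.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class MessageTagReplacer {
    // The regex from the answer; group 2 captures the value of the code attribute.
    static final Pattern MESSAGE_TAG = Pattern.compile(
        "<s:message(?=\\s)(?=(?:[^>\"']|\"[^\"]*\"|'[^']*')*?\\scode\\s*=\\s*(?:(['\"])([\\S\\s]*?)\\1))"
        + "\\s+(?:\"[\\S\\s]*?\"|'[\\S\\s]*?'|[^>]*?)+/>");

    // Replaces each matched tag with the text mapped to its code.
    static String replaceTags(String html, Map<String, String> replacements) {
        Matcher m = MESSAGE_TAG.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String code = m.group(2);   // the code attribute value
            String wholeTag = m.group(); // the whole match is the tag
            // The "callback": any per-code logic can go here.
            String replacement = replacements.containsKey(code)
                ? replacements.get(code)
                : wholeTag;             // unknown code: keep the tag as is
            // quoteReplacement stops '$' and '\' from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> replacements = new HashMap<>();
        replacements.put("code1", "test1"); // assumed sample mapping
        String html = "Some text <s:message code=\"code1\" arguments=\"${arg1,arg2}\" />..";
        System.out.println(replaceTags(html, replacements));
    }
}
```

`appendReplacement` and `appendTail` are used rather than `String.replaceAll` because the replacement text has to be computed per match.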

Note that if you are searching for a specific set of codes, such as 1, 4, 22, 9, you only need to replace this line in the regex

( [\S\s]*? ) # (2), The Code

with your own sub-pattern, like this

( (?:1|4|22|9) ) # (2), One of these Codes
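Since the question already has the codes in a list, the alternation can be built at run time. The following is a sketch under that assumption; `CodeSetPattern` and `forCodes` are hypothetical names. Each code goes through `Pattern.quote` so that regex metacharacters in a code are matched literally.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any library.
final class CodeSetPattern {
    // Builds the answer's regex with group 2 restricted to the given codes.
    static Pattern forCodes(List<String> codes) {
        if (codes.isEmpty()) {
            // An empty alternation would match an empty code, so reject it.
            throw new IllegalArgumentException("codes must not be empty");
        }
        String alternation = codes.stream()
            .map(Pattern::quote)            // treat every code as literal text
            .collect(Collectors.joining("|"));
        return Pattern.compile(
            "<s:message(?=\\s)(?=(?:[^>\"']|\"[^\"]*\"|'[^']*')*?\\scode\\s*=\\s*(?:(['\"])((?:"
            + alternation
            + "))\\1))\\s+(?:\"[\\S\\s]*?\"|'[\\S\\s]*?'|[^>]*?)+/>");
    }
}
```

With `forCodes(Arrays.asList("1", "4", "22", "9"))`, a tag with `code="22"` matches and a tag with `code="2"` does not, because the closing quote must follow the code immediately.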

Readable version:

                        # Begin Message tag
 < s:message
 (?= \s )
 (?=                    # Assertion (a pseudo atomic group)
      (?: [^>"'] | " [^"]* " | ' [^']* ' )*?
      \s code \s* = \s* 
      (?:
           ( ['"] )               # (1), Quote
           ( [\S\s]*? )           # (2), The Code
           \1 
      )
 )
                        # Have the code, just match the rest of tag
 \s+ 
 (?: " [\S\s]*? " | ' [\S\s]*? ' | [^>]*? )+

 />                     # End self contained tag
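If you want to keep this commented layout in Java source, one option is the `Pattern.COMMENTS` flag, which ignores whitespace in the pattern and treats `#` to end of line as a comment. The sketch below is an assumption about how one might do that, not something the answer itself shows; `ReadableMessageTagPattern` is a hypothetical name.

```java
import java.util.regex.Pattern;

// Hypothetical holder class for the commented pattern.
final class ReadableMessageTagPattern {
    static final Pattern READABLE = Pattern.compile(String.join("\n",
        "<s:message                  # Begin message tag",
        "(?= \\s )",
        "(?=                         # Assertion: locate the code attribute in this tag",
        "    (?: [^>\"'] | \" [^\"]* \" | ' [^']* ' )*?",
        "    \\s code \\s* = \\s*",
        "    (?:",
        "        ( ['\"] )            # (1), Quote",
        "        ( [\\S\\s]*? )        # (2), The Code",
        "        \\1",
        "    )",
        ")",
        "\\s+                        # Have the code, just match the rest of the tag",
        "(?: \" [\\S\\s]*? \" | ' [\\S\\s]*? ' | [^>]*? )+",
        "/>                          # End self-contained tag"),
        Pattern.COMMENTS);
}
```

In this mode Java also ignores whitespace inside character classes, so a literal space would have to be written as `\\s` or an escaped space. This pattern has none, so that does not affect it.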
