Question

I am trying to find the most elegant way to strip and replace a text string that looks like this:

element {"item"} {text {
          } {$i/child::itemno}

so that it looks like:

<item> {$i/child::itemno}

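So: remove the word element and replace its braces with angle brackets, and remove the word text along with its accompanying braces. These patterns may occur multiple times. Am I better off using Java's java.util.regex.Pattern, the simple replaceAll, or org.apache.commons.lang.StringUtils?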

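Thanks for the responses:

I now have the following, but I am unsure about the number of backslashes, and about how to complete the final substitution that takes my group(1) and wraps it with < at the start and > at the end: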

Pattern p = Pattern.compile("/element\\s*\\{\"([^\"]+)\"\\}\\s*{text\\s*{\\s*}\\s*({[^}]*})/ ");
// Split input with the pattern
Matcher m = p.matcher("element {\"item\"} {text {\n" +
        "          } {$i/child::itemno} text { \n" +
        "            } {$i/child::description} text {\n" +
        "            } element {\"high_bid\"} {{max($b/child::bid)}}  text {\n" +
        "        }}  ");

// For each instance of group 1, replace it with < > at the start and end

Answer 1

Find:

/element\s*\{"([^"]+)"\}\s*{text\s*{\s*}\s*({[^}]*})/

Replace with:

"<$1> $2"
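
A minimal Java sketch of this find/replace, offered as one possible reading rather than a definitive implementation. It assumes the pattern is passed to java.util.regex.Pattern, where there are no surrounding / delimiters and each literal { or } outside a character class is escaped (written as \\{ and \\} inside a Java string literal). The class name ElementRewrite and the method rewrite are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ElementRewrite {
    // Same pattern as above, minus the / delimiters, with every literal brace escaped.
    // Group 1: the quoted element name. Group 2: the braced expression that follows.
    private static final Pattern ELEMENT = Pattern.compile(
        "element\\s*\\{\"([^\"]+)\"\\}\\s*\\{text\\s*\\{\\s*\\}\\s*(\\{[^}]*\\})");

    static String rewrite(String input) {
        Matcher m = ELEMENT.matcher(input);
        // $1 and $2 in the replacement string refer to the captured groups.
        return m.replaceAll("<$1> $2");
    }

    public static void main(String[] args) {
        String input = "element {\"item\"} {text {\n          } {$i/child::itemno}";
        System.out.println(rewrite(input)); // prints: <item> {$i/child::itemno}
    }
}
```

replaceAll rewrites every match in the input, so repeated occurrences are handled in one call. Standalone text { } fragments that do not directly follow an element {"..."} { opener are not matched by this pattern and would need a separate rule.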

Answer 2

I think a simple string replacement will do. Here is a Python version (it could be turned into a one-liner):

>>> a = """element {"item"} {text {
          } {$i/child::itemno}"""
>>> 
>>> a
'element {"item"} {text {\n          } {$i/child::itemno}'
>>> a=a.replace(' ', '').replace('\n', '')
>>> a
'element{"item"}{text{}{$i/child::itemno}'
>>> a = a.replace('element {"', '<')
>>> a
'element{"item"}{text{}{$i/child::itemno}'
>>> a = a.replace('element{"', '<')
>>> a
'<item"}{text{}{$i/child::itemno}'
>>> a = a.replace('"}{text{}', '> ')
>>> a
'<item> {$i/child::itemno}'
>>>
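
Since the question is about Java, the same three steps could be sketched there with String.replace, which replaces every literal occurrence and needs no regex escaping. This is only a sketch under the same assumptions as the Python session: all spaces and newlines are dropped first, which also removes whitespace anywhere else in the input. The class name PlainReplace and the method rewrite are hypothetical.

```java
class PlainReplace {
    static String rewrite(String input) {
        // Step 1: drop every space and newline, as the Python session does.
        String s = input.replace(" ", "").replace("\n", "");
        // Step 2: turn the opener element{" into <.
        s = s.replace("element{\"", "<");
        // Step 3: turn the closer "}{text{} into "> " (with a trailing space).
        return s.replace("\"}{text{}", "> ");
    }

    public static void main(String[] args) {
        String input = "element {\"item\"} {text {\n          } {$i/child::itemno}";
        System.out.println(rewrite(input)); // prints: <item> {$i/child::itemno}
    }
}
```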
