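Replace underscores between words (regex)

Asked: 2010-03-08 17:53:41

Tags: java jquery regex xslt replace

I need a regular expression that solves the following problem (links to similar questions, related tutorials, etc. are also welcome):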

"__some_words_a_b___" => "__some words a b___"
"____" => "____"
"some___words" => "some   words"

So I want to replace the underscores between words with spaces while keeping the leading and trailing underscores. I found this:

^[ \t]+|[ \t]+$

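I think the solution will look something like that. I will be using it in jQuery, Java (standard libraries), and XSLT.

Added: A sentence does not necessarily start or end with an underscore. A sentence may also contain no underscores at all. Multiple underscores should be rendered as multiple spaces.

Best regards, Lasse Espeholt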

4 Answers:

Answer 0 (score: 3)

This should work in JavaScript:

var newString = oldString.replace(/([^_].*?)_(?=[^_\s])/g,"$1 ");

Edit: if the string already contains spaces, you may need to use the following instead:

var newString = oldString.replace(/([^_\s].*?)_(?=[^_\s])/g,"$1 ");

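Did I forget any other edge cases? :) Oh yes, one more edge case: a trailing underscore is kept if it is followed by whitespace (such as a line break, end of line, etc.).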

Edit: an alternative solution for when the number of underscores between words is > 1:

// Split into leading underscores, middle, and trailing underscores; _* lets either end be empty.
var newString = oldString.replace(/^(_*)(.*?)(_*)$/, function(match, lead, middle, trail) {
  return lead + middle.replace(/_/g, " ") + trail;
});

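Answer 1 (score: 1)

I think it would be simpler to combine a regular expression with a plain string replace. Here is an answer in Python, because I'm not familiar enough with jQuery, Java, or XSLT: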

import re

def mangle_string(string):
    """
    Replace underscores between letters with spaces, leave leading and
    trailing underscores alone.
    """
    # Match a string that starts with zero or more underscores, followed by a
    # non-underscore, followed by zero or more of any characters, followed by
    # another non-underscore, followed by zero or more underscores, then the
    # end of the string.  If the string doesn't match that pattern, then return
    # it unmodified.
    m = re.search(r'^(_*)([^_]+.*[^_]+)(_*)$', string)
    if not m:
        return string
    # Return the concatenation of first group (the leading underscores), then
    # the middle group (everything else) with any internal underscores
    # replaced with spaces, then the last group (the trailing underscores).
    return m.group(1) + m.group(2).replace('_', ' ') + m.group(3)

Answer 2 (score: 0)

Maybe this is what you want (JavaScript):

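// \w also matches "_", so use [^\W_]; the lookahead keeps the next word free for the following match ("a_b_c").
var newString = oldString.replace(/([^\W_])_(?=[^\W_])/g, "$1 ");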

If there can be several underscores between words, then:

var newString = oldString.replace(/([^\W_])_+(?=[^\W_])/g, "$1 ");

If you want to keep the same number of spaces as there were underscores:

var newString = oldString.replace(/([^\W_])(_+)(?=[^\W_])/g, function(match, before, run) {
  // new Array(n + 1).join(' ') yields exactly n spaces
  return before + new Array(run.length + 1).join(' ');
});
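
For comparison, here is a rough sketch of the same count-preserving idea in Python (chosen only because another answer in this thread already uses it). It uses lookarounds so the neighboring characters are not consumed. The function name `underscores_to_spaces` is made up for this illustration, and a "word" is assumed to be any run of non-underscore characters, which is looser than `\w`.

```python
import re

# A run of underscores with a non-underscore character on both sides.
# The lookarounds do not consume the neighbors, so adjacent runs such as
# the two in "a_b_c" are both found.
_INNER_RUN = re.compile(r'(?<=[^_])_+(?=[^_])')


def underscores_to_spaces(text):
    """Replace each inner run of underscores with the same number of spaces."""
    return _INNER_RUN.sub(lambda m: ' ' * len(m.group(0)), text)
```

Leading and trailing runs are left alone because they have no non-underscore character on one side, so one of the two lookarounds fails.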

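Answer 3 (score: 0)

I wouldn't use a regex for this. I would count the leading and trailing underscores, then join the leading substring (if any), middle.replace('_',' '), and the trailing substring (if any). If the run of leading underscores extends to the end of the string, return the original string immediately.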
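
The steps described above could look roughly like this in Python. This is only a sketch of the idea, not code from the answer; the function name `replace_inner_underscores` is an assumption made for the example.

```python
def replace_inner_underscores(text):
    """Turn underscores into spaces, except leading and trailing ones."""
    # Count the leading underscores by measuring what lstrip removes.
    lead = len(text) - len(text.lstrip('_'))
    if lead == len(text):
        # Empty string or underscores only: nothing to replace.
        return text
    # Count the trailing underscores the same way.
    trail = len(text) - len(text.rstrip('_'))
    end = len(text) - trail
    # Keep both ends untouched and replace every underscore in between.
    return text[:lead] + text[lead:end].replace('_', ' ') + text[end:]
```

Note that `str.replace` in Python replaces every occurrence, as does `String.replace(char, char)` in Java. In JavaScript, `replace('_', ' ')` with a string pattern changes only the first underscore, so a global regex such as `/_/g` would be needed there.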