Regex: match all hyphens and replace them with a space, only in words that consist of letters and are not inside quotes

Time: 2018-11-23 20:23:40

Tags: java regex clojure

This regex: \b([A-z*]+)-(?=[A-z*]+\b)

replaced with: $1 (followed by a space)

applied to:

Jean-Pierre bought "blue-green-red" product-2345 and other blue-red stuff.

gives me:

Jean Pierre bought "blue green red" product-2345 and other blue red stuff.

when what I want is:

Jean Pierre bought "blue-green-red" product-2345 and other blue red stuff.

https://regex101.com/r/SJzAaP/1

Edit:

I'm using Clojure (Java).

Edit 2:

yellow-black-white -> yellow black white

product_a-b -> product_a-b
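
For reference, here is a small Java sketch of why product_a-b is affected by the character class in the original regex. The class and method names (CharClassCheck, withAToLowerZ, lettersOnly) are made up for illustration. Neither pattern deals with quoted text; the sketch only isolates the difference between [A-z] and [a-zA-Z].

```java
public class CharClassCheck {
    // [A-z] is the ASCII range 65..122, which also contains [ \ ] ^ _ and `,
    // so "product_a" counts as one run of "letters" here.
    static String withAToLowerZ(String input) {
        return input.replaceAll("\\b([A-z*]+)-(?=[A-z*]+\\b)", "$1 ");
    }

    // [a-zA-Z] contains letters only, so the underscore breaks the run.
    static String lettersOnly(String input) {
        return input.replaceAll("\\b([a-zA-Z]+)-(?=[a-zA-Z]+\\b)", "$1 ");
    }

    public static void main(String[] args) {
        System.out.println(withAToLowerZ("product_a-b"));      // product_a b
        System.out.println(lettersOnly("product_a-b"));        // product_a-b
        System.out.println(lettersOnly("yellow-black-white")); // yellow black white
    }
}
```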

Edit 3: the accepted answer translated to Clojure

(clojure.string/replace
 "Jean-Pierre bought \"blue-green-red\" product-2345 and other blue-red-green stuff yellow-black-white product_a-b"
 #"(\"[^\"]*\")|\b([a-zA-Z]+)-(?=[a-zA-Z]+\b)"
 (fn [[s1 s2 s3]] (if s2 s1 (str s3 " "))))

;;=> "Jean Pierre bought \"blue-green-red\" product-2345 and other blue red green stuff yellow black white product_a-b"

3 Answers:

Answer 0 (score: 1)

In Java, you can use something like

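import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "Jean-Pierre bought \"blue-green-red\" product-2345 and other blue-red stuff. yellow-black-white. product_a-b";
StringBuffer result = new StringBuffer();
// Group 1: a quoted substring; group 2: the letters right before a hyphen that is followed by letters
Matcher m = Pattern.compile("(\"[^\"]*\")|\\b([a-zA-Z]+)-(?=[a-zA-Z]+\\b)").matcher(s);
while (m.find()) {
    if (m.group(1) != null) {
        // Quoted text: put it back unchanged (quoteReplacement keeps any $ or \ literal)
        m.appendReplacement(result, Matcher.quoteReplacement(m.group(0)));
    } else {
        // Unquoted word: keep the letters and put a space where the hyphen was
        m.appendReplacement(result, m.group(2) + " ");
    }
}
m.appendTail(result);
System.out.println(result.toString());
// => Jean Pierre bought "blue-green-red" product-2345 and other blue red stuff. yellow black white. product_a-b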

See the Java demo

The regex is

("[^"]*")|\b([a-zA-Z]+)-(?=[a-zA-Z]+\b)

Details

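  • ("[^"]*") - Group 1: a ", then 0 or more characters other than ", then a "
  • | - or
  • \b - a word boundary
  • ([a-zA-Z]+) - Group 2: 1 or more letters (can be replaced with (\p{L}+) to match any Unicode letter)
  • - - a hyphen
  • (?=[a-zA-Z]+\b) - a positive lookahead that requires 1 or more letters followed by a word boundary immediately to the right of the current position.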
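
A minimal sketch of the \p{L} variant mentioned in the list, in case the words may contain non-ASCII letters. The class name UnicodeHyphenToSpace and the sample strings are made up for illustration. Pattern.UNICODE_CHARACTER_CLASS is set as a precaution so that \b treats non-ASCII letters as word characters, in line with \p{L}.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeHyphenToSpace {
    // Group 1: quoted text; group 2: any Unicode letters right before a hyphen.
    private static final Pattern PATTERN = Pattern.compile(
        "(\"[^\"]*\")|\\b(\\p{L}+)-(?=\\p{L}+\\b)",
        Pattern.UNICODE_CHARACTER_CLASS);

    static String replaceHyphens(String input) {
        Matcher m = PATTERN.matcher(input);
        StringBuffer result = new StringBuffer();
        while (m.find()) {
            // Quoted text goes back unchanged; otherwise letters plus a space.
            String replacement = m.group(1) != null ? m.group() : m.group(2) + " ";
            m.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args) {
        // The accented letters are written as escapes to avoid source-encoding issues.
        System.out.println(replaceHyphens("Zo\u00eb-Chlo\u00e9 bought \"blue-green\" product-2345"));
    }
}
```

With this pattern the hyphen in the accented name should become a space, while the quoted text and product-2345 stay as they are.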

If Group 1 matches (if (m.group(1) != null)), just paste the match back into the result. If not, paste the Group 2 value followed by a space.
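
The same branching can also be sketched with the function-based Matcher.replaceAll, which is available from Java 9 on and reads much like the Clojure version. The class name HyphenToSpace and the method name replaceHyphens are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HyphenToSpace {
    // Group 1: quoted text; group 2: the letters right before a hyphen.
    private static final Pattern PATTERN =
        Pattern.compile("(\"[^\"]*\")|\\b([a-zA-Z]+)-(?=[a-zA-Z]+\\b)");

    static String replaceHyphens(String input) {
        // Matcher.replaceAll(Function) requires Java 9 or later.
        return PATTERN.matcher(input).replaceAll(mr ->
            mr.group(1) != null
                // Quoted text: put it back unchanged; quoteReplacement keeps
                // any '$' or '\' inside the quotes literal.
                ? Matcher.quoteReplacement(mr.group())
                // Otherwise: the letters, with a space in place of the hyphen.
                : mr.group(2) + " ");
    }

    public static void main(String[] args) {
        System.out.println(replaceHyphens(
            "Jean-Pierre bought \"blue-green-red\" product-2345 and other blue-red stuff."));
        // => Jean Pierre bought "blue-green-red" product-2345 and other blue red stuff.
    }
}
```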

Also adding the code from the question, for better visibility:

(def s "Jean-Pierre bought \"blue-green-red\" product-2345 and other blue-red stuff. yellow-black-white. product_a-b")

(defn append [[g1 g2 g3]] (if g2 g1 (str g3 " ")))

(clojure.string/replace s #"(\"[^\"]*\")|\b([a-zA-Z]+)-(?=[a-zA-Z]+\b)" append)

;;=> "Jean Pierre bought \"blue-green-red\" product-2345 and other blue red stuff. yellow black white. product_a-b"

Answer 1 (score: 0)

If you don't need to handle overly complex cases, this should work:

(?: |^)\w+(-)(?![0-9])\w+

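This matches any instance of word(hyphen)word that is preceded by a space or sits at the start of the line (so text inside quotes will not match, because it is preceded by a quote rather than a space or the start of the line).

Let me know if this doesn't work for you. Live demo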
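
A quick way to check this pattern against the cases from the question is the Java sketch below. The class name SpaceAnchoredCheck and the helper replaceHyphens are made up for illustration. Since the pattern only captures the hyphen in group 1, the sketch overwrites the character at that position with a space.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SpaceAnchoredCheck {
    private static final Pattern PATTERN =
        Pattern.compile("(?: |^)\\w+(-)(?![0-9])\\w+");

    static String replaceHyphens(String input) {
        // Replacing one character with one character keeps all indices valid.
        StringBuilder out = new StringBuilder(input);
        Matcher m = PATTERN.matcher(input);
        while (m.find()) {
            out.setCharAt(m.start(1), ' '); // group 1 is the hyphen
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceHyphens(
            "Jean-Pierre bought \"blue-green-red\" product-2345 and other blue-red stuff."));
        System.out.println(replaceHyphens("yellow-black-white"));
        System.out.println(replaceHyphens("product_a-b"));
    }
}
```

As far as I can tell, the main sentence comes out as wanted, but the Edit 2 cases do not: yellow-black-white becomes yellow black-white (the second hyphen is not preceded by a space), and product_a-b becomes product_a b (because \w also matches the underscore).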

Answer 2 (score: 0)

Try this

${group} $1

with the replacement

db.col.save({ data: [1,1,1,2,2,2] })
db.col.save({ data: [1,1,1,0,0,0] })

You can test it here