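Suppose I have the string "cats cats cats and dogs dogs dogs".
What regular expression would I use to replace that string with "cats and dogs", i.e. to remove the duplicates? The expression must only remove duplicates that directly follow one another. For example:
"cats cats cats and dogs dogs dogs and cats cats and dogs dogs" should return:
"cats and dogs and cats and dogs"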
Answer 0 (score: 9)
resultString = Regex.Replace(subjectString, @"\b(\w+)(?:\s+\1\b)+", "$1");
This performs all the replacements in a single call.
Explanation:
\b # assert that we are at a word boundary
# (we only want to match whole words)
(\w+) # match one word, capture into backreference #1
(?: # start of non-capturing, repeating group
\s+ # match at least one space
\1 # match the same word as previously captured
\b # as long as we match it completely
)+ # do this at least once
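To see the commented pattern run end to end, here is a minimal, self-contained sketch. It assumes the sample sentence from the question; the class name DedupeDemo and the use of RegexOptions.IgnorePatternWhitespace (so the comments can live inside the pattern) are illustrative choices, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

class DedupeDemo // hypothetical name, for illustration only
{
    static void Main()
    {
        // IgnorePatternWhitespace ignores unescaped whitespace in the pattern
        // and treats everything after '#' on a line as a comment.
        var regex = new Regex(@"
            \b            # assert that we are at a word boundary
            (\w+)         # match one word, capture into group 1
            (?:           # start of non-capturing, repeating group
              \s+         # match at least one whitespace character
              \1          # match the same word as previously captured
              \b          # as long as we match it completely
            )+            # do this at least once",
            RegexOptions.IgnorePatternWhitespace);

        string subjectString = "cats cats cats and dogs dogs dogs and cats cats and dogs dogs";
        string resultString = regex.Replace(subjectString, "$1");
        Console.WriteLine(resultString); // cats and dogs and cats and dogs
    }
}
```

Because the repeated group is quantified with +, an entire run of duplicates is consumed by one match, which is why no loop is needed here.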
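Answer 1 (score: 2)
Replace (\w+)\s+\1 with $1.
Do this in a loop until no more matches are found. Setting the global flag is not enough, because a single pass will not remove the third "cats" in "cats cats cats": the first match consumes the first two words, and the scan resumes after them.
\1 refers to the contents of the first capture group.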
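The loop described above could be sketched as follows. This is only one way to write it: the helper name Collapse and the class name LoopDedupe are hypothetical, and the \b anchors are an addition of this sketch (the pattern in the answer has none) so that a word is not matched as a prefix of a longer one.

```csharp
using System;
using System.Text.RegularExpressions;

class LoopDedupe // hypothetical name, for illustration only
{
    // Repeatedly collapses "word word" into "word" until the text stops changing.
    static string Collapse(string input)
    {
        // Assumption: \b anchors added here; the answer's pattern is (\w+)\s+\1.
        var pair = new Regex(@"\b(\w+)\s+\1\b");
        string previous;
        do
        {
            previous = input;
            input = pair.Replace(input, "$1"); // one pass removes one duplicate per pair
        } while (input != previous);
        return input;
    }

    static void Main()
    {
        string str = "cats cats cats and dogs dogs dogs and cats cats and dogs dogs";
        Console.WriteLine(Collapse(str)); // cats and dogs and cats and dogs
    }
}
```

For this input the loop needs two replacing passes plus a final pass that detects no change; longer runs of duplicates need more passes.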
Try:
str = "cats cats cats and dogs dogs dogs and cats cats and dogs dogs";
str = Regex.Replace(str, @"(\b\w+\b)\s+(\1(\s+|$))+", "$1 ");
// The "$1 " replacement leaves a trailing space when a duplicate run ends the string.
Console.WriteLine(str.TrimEnd());
Answer 2 (score: 1)
No doubt a shorter regex is possible, but this one seems to do the trick:
string somestring = "cats cats cats and dogs dogs dogs and cats cats and dogs dogs";
Regex regex = new Regex(@"(\w+)\s(?:\1\s)*(?:\1(\s|$))");
string result = regex.Replace(somestring, "$1$2");
It also accounts for the final "dogs", which is not followed by a space.
Answer 3 (score: 0)
Try the following code.
using System;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    /// <summary>
    ///
    /// A description of the regular expression:
    ///
    /// Match expression but don't capture it. [^|\s+]
    ///     Select from 2 alternatives
    ///         Beginning of line or string
    ///         Whitespace, one or more repetitions
    /// [1]: A numbered capture group. [(\w+)(?:\s+|$)]
    ///     (\w+)(?:\s+|$)
    ///         [2]: A numbered capture group. [\w+]
    ///             Alphanumeric, one or more repetitions
    ///         Match expression but don't capture it. [\s+|$]
    ///             Select from 2 alternatives
    ///                 Whitespace, one or more repetitions
    ///                 End of line or string
    /// [3]: A numbered capture group. [\1|\2], one or more repetitions
    ///     Select from 2 alternatives
    ///         Backreference to capture number: 1
    ///         Backreference to capture number: 2
    ///
    ///
    /// </summary>
    class Class1
    {
        ///
        /// The main entry point for the application.
        ///
        static void Main(string[] args)
        {
            // A verbatim string (@"...") is required: in a regular string literal,
            // \s and \w are unrecognized escape sequences and do not compile.
            Regex regex = new Regex(
                @"(?:^|\s+)((\w+)(?:\s+|$))(\1|\2)+",
                RegexOptions.IgnoreCase
                | RegexOptions.Compiled
                );
            string str = "cats cats cats and dogs dogs dogs and cats cats and dogs dogs";
            string regexReplace = " $1";
            Console.WriteLine("Before :" + str);
            // Trim the padding that the " $1" replacement can leave at either end.
            str = regex.Replace(str, regexReplace).Trim();
            Console.WriteLine("After :" + str);
        }
    }
}