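Is there a regular expression that would take the following sentence:
"I want this split up into pairs"
and generate the following list:
"I want", "want this", "this split", "split up", "up into", "into pairs"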
Answer 0 (score: 5)
Since each word has to be reused in the next pair, you need a lookahead assertion:
Regex regexObj = new Regex(
    @"(          # Match and capture in backreference no. 1:
     \w+         # one or more alphanumeric characters
     \s+         # one or more whitespace characters.
    )            # End of capturing group 1.
    (?=          # Assert that there follows...
     (\w+)       # another word; capture that into backref 2.
    )            # End of lookahead.",
    RegexOptions.IgnorePatternWhitespace);
Match matchResult = regexObj.Match(subjectString);
while (matchResult.Success) {
    resultList.Add(matchResult.Groups[1].Value + matchResult.Groups[2].Value);
    matchResult = matchResult.NextMatch();
}
For groups of three:
Regex regexObj = new Regex(
    @"(          # Match and capture in backreference no. 1:
     \w+         # one or more alphanumeric characters
     \s+         # one or more whitespace characters.
    )            # End of capturing group 1.
    (?=          # Assert that there follows...
     (           # and capture...
      \w+        # another word,
      \s+        # whitespace,
      \w+        # word.
     )           # End of capturing group 2.
    )            # End of lookahead.",
    RegexOptions.IgnorePatternWhitespace);
and so on.
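The pair and triple patterns above differ only in how many words sit inside the lookahead, so the same idea can be sketched for any group size. This is a minimal sketch, not part of the original answer: the class name `WordGroups`, the method `Extract`, and the parameter `n` are hypothetical names chosen for illustration, and the pattern is built by repeating `\s+\w+` inside the lookahead `n - 2` times.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordGroups
{
    // Hypothetical helper: returns every run of n consecutive words in text.
    public static List<string> Extract(string text, int n)
    {
        if (n < 2) throw new ArgumentOutOfRangeException(nameof(n), "n must be at least 2");

        // Group 1 consumes one word plus its trailing whitespace. Group 2 sits
        // inside a lookahead, so the following n - 1 words are captured without
        // being consumed and stay available for the next match.
        string pattern = @"(\w+\s+)(?=(\w+(?:\s+\w+){" + (n - 2) + "}))";

        var results = new List<string>();
        foreach (Match m in Regex.Matches(text, pattern))
        {
            results.Add(m.Groups[1].Value + m.Groups[2].Value);
        }
        return results;
    }
}
```

With the sentence from the question, `WordGroups.Extract("I want this split up into pairs", 2)` yields the six pairs listed there, and passing `3` yields five triples starting with "I want this". The original whitespace between words is kept as-is because it is part of group 1.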
Answer 1 (score: 4)
You can do this:
var myWords = myString.Split(' ');
var myPairs = myWords.Take(myWords.Length - 1)
.Select((w, i) => w + " " + myWords[i + 1]);
Answer 2 (score: 3)
You can use string.Split()
and combine the results:
var words = myString.Split(new char[] { ' ' });
var pairs = new List<string>();
for (int i = 0; i < words.Length - 1; i++)
{
    // Join neighbouring words with a space so the pair reads "I want", not "Iwant".
    pairs.Add(words[i] + " " + words[i + 1]);
}
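Splitting on a single space produces empty entries when the input contains consecutive spaces, and it does not treat tabs or newlines as separators. The following is a minimal sketch of the same split-and-combine approach that tolerates arbitrary whitespace and works for any group size. The names `WordWindows`, `Extract`, and `n` are hypothetical and not taken from the answers.

```csharp
using System;
using System.Collections.Generic;

public static class WordWindows
{
    // Hypothetical helper: joins each run of n consecutive words with a single space.
    public static List<string> Extract(string text, int n)
    {
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n), "n must be at least 1");

        // A null separator array makes Split treat any whitespace as a delimiter;
        // RemoveEmptyEntries drops the empty strings that repeated spaces would produce.
        string[] words = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        var results = new List<string>();
        for (int i = 0; i + n <= words.Length; i++)
        {
            // Join n words starting at index i.
            results.Add(string.Join(" ", words, i, n));
        }
        return results;
    }
}
```

Unlike the regex version, this normalizes whatever whitespace separated the words to a single space, which may or may not be what you want.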
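Answer 3 (score: 0)
To use only RegEx with no post-processing, we can reuse Tim Pietzcker's answer but run two RegExes in succession.
We pass the original text through the regex from Tim Pietzcker's answer, and also through the same regex with a lookbehind added, which makes it start capturing from the second word.
If you merge the results of the two RegExes, you get all the pairs in the text.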
Regex regexObj1 = new Regex(
    @"(          # Match and capture in backreference no. 1:
     \w+         # one or more alphanumeric characters
     \s+         # one or more whitespace characters.
    )            # End of capturing group 1.
    (?=          # Assert that there follows...
     (\w+)       # another word; capture that into backref 2.
    )            # End of lookahead.",
    RegexOptions.IgnorePatternWhitespace);
Match matchResult = regexObj1.Match(subjectString);
while (matchResult.Success) {
    resultList.Add(matchResult.Groups[1].Value + matchResult.Groups[2].Value);
    matchResult = matchResult.NextMatch();
}
Regex regexObj2 = new Regex(
    @"(?<=       # Assert that the following precedes the match (it is not captured):
     \w+\s+      # the first word followed by whitespace
    )
    (            # Match and capture in backreference no. 1:
     \w+         # one or more alphanumeric characters
     \s+         # one or more whitespace characters.
    )            # End of capturing group 1.
    (?=          # Assert that there follows...
     (\w+)       # another word; capture that into backref 2.
    )            # End of lookahead.",
    RegexOptions.IgnorePatternWhitespace);
Match matchResult1 = regexObj1.Match(subjectString);
Match matchResult2 = regexObj2.Match(subjectString);
and so on.
For groups of three:
You need to add a third RegEx to the program:
Regex regexObj3 = new Regex(
    @"(?<=           # Assert that the following precedes the match (it is not captured):
     \w+\s+\w+\s+    # the first and second words, each followed by whitespace
    )
    (                # Match and capture in backreference no. 1:
     \w+             # one or more alphanumeric characters
     \s+             # one or more whitespace characters.
    )                # End of capturing group 1.
    (?=              # Assert that there follows...
     (\w+)           # another word; capture that into backref 2.
    )                # End of lookahead.",
    RegexOptions.IgnorePatternWhitespace);
Match matchResult1 = regexObj1.Match(subjectString);
Match matchResult2 = regexObj2.Match(subjectString);
Match matchResult3 = regexObj3.Match(subjectString);