Regex to capture the sentence containing a target phrase

Asked: 2011-10-26 01:12:30

Tags: c# regex string

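I'm using C# to look for a phrase that may or may not occur in a blog post. I need to capture the entire sentence that contains the target phrase.

I've thought about using the string.Contains method, but that leaves me with the entire blog post when all I want is the target phrase and the sentence containing it.

Example: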

I dont want this sentence. I also don't want this setence. But I do want this sentence.

So the target phrase here is "I do", and the regex should return the whole containing sentence: "But I do want this sentence."

Thanks. Aaron

3 Answers:

Answer 0 (score: 2)

This regex:

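// The lazy .*? stops at the first period after the phrase; Trim() removes the space left over from the previous sentence.
resultString = Regex.Match(subjectString, @"(?<=^|\.)[^.]*?(?=\bI do\b).*?(\.|$)").Value.Trim();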

when applied to your input:

I dont want this sentence. I also don't want this setence. But I do want this sentence.

returns:

But I do want this sentence.

If a sentence may span multiple lines, enable RegexOptions.Singleline so that . also matches newline characters.
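To reuse this idea with an arbitrary phrase, the pattern can be built at run time. The following is a minimal sketch rather than part of the original answer: SentenceFinder and FindSentence are hypothetical names, sentences are assumed to end only with a period, and the phrase is assumed to start and end with a word character so that \b behaves as expected. It uses [^.]* instead of .* so a match cannot run past the end of the sentence.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SentenceFinder
{
    // Hypothetical helper: returns the first sentence that contains `phrase`
    // as a whole-word match, or an empty string when nothing matches.
    // Assumption: sentences are terminated by '.' only.
    public static string FindSentence(string text, string phrase)
    {
        // Regex.Escape keeps characters such as '?' or '+' in the phrase literal.
        string pattern = @"(?<=^|\.)[^.]*?\b" + Regex.Escape(phrase) + @"\b[^.]*(\.|$)";

        Match match = Regex.Match(text, pattern);

        // Trim() drops the whitespace that follows the previous sentence's period.
        return match.Success ? match.Value.Trim() : string.Empty;
    }
}
```

Called as SentenceFinder.FindSentence(subjectString, "I do") on the example input, this should return "But I do want this sentence."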

Answer 1 (score: 1)

I don't know regular expressions, but you could combine the Split and Contains functions and write something like this:

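// Returns the first sentence of the blog post that contains the target,
// or an empty string if no sentence does.
string DoesBlogContainSentence(string blog, string target)
{
   // Splitting on '.' removes the periods themselves.
   string[] blogSentences = blog.Split(new char[] {'.'});

   foreach(string sentence in blogSentences)
   {
      // Contains is a case-sensitive substring test.
      if(sentence.Contains(target))
      {
          // Trim the space that followed the previous period.
          return sentence.Trim();
      }
   }

   return string.Empty;
}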
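One caveat with this approach: Contains tests for a plain substring, so on the example input the phrase "I do" is also found inside "I dont" and the first sentence is returned. A possible regex-free workaround is to check the characters on either side of each hit. The sketch below is only an illustration: WholePhraseSearch, ContainsWholePhrase and FindSentence are hypothetical names, and "word character" is assumed to mean a letter or digit.

```csharp
using System;

public static class WholePhraseSearch
{
    // True when `phrase` occurs in `sentence` without a letter or digit
    // directly before or after it.
    public static bool ContainsWholePhrase(string sentence, string phrase)
    {
        if (string.IsNullOrEmpty(phrase)) return false;

        int index = sentence.IndexOf(phrase, StringComparison.Ordinal);
        while (index >= 0)
        {
            int end = index + phrase.Length;
            bool startOk = index == 0 || !char.IsLetterOrDigit(sentence[index - 1]);
            bool endOk = end == sentence.Length || !char.IsLetterOrDigit(sentence[end]);
            if (startOk && endOk) return true;

            // Look for the next occurrence further to the right.
            index = sentence.IndexOf(phrase, index + 1, StringComparison.Ordinal);
        }
        return false;
    }

    // Returns the first sentence (split on '.', so without its period)
    // that contains the phrase as whole words, or an empty string.
    public static string FindSentence(string blog, string target)
    {
        foreach (string sentence in blog.Split(new[] { '.' }))
        {
            if (ContainsWholePhrase(sentence, target))
            {
                return sentence.Trim();
            }
        }
        return string.Empty;
    }
}
```

With the example input and the target "I do", this skips "I dont want this sentence" and returns "But I do want this sentence".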

Answer 2 (score: 1)

You can split the blog post into sentences and then search each sentence for the target phrase.

E.g.:

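  // Requires: using System.Text.RegularExpressions;
  string data = "I dont want this sentence. I also don't want this setence. But I do want this sentence.";
  string targetPhrase = "I do";

  // Split at each period that is followed by a whitespace character.
  string[] sentences = Regex.Split(data, "\\.\\s");

  foreach (string sentence in sentences)
  {
    // \b matches at word boundaries, so the phrase is also found at the very start
    // or end of a sentence; Regex.Escape treats the phrase as literal text.
    if (Regex.IsMatch(sentence, "\\b" + Regex.Escape(targetPhrase) + "\\b"))
    {
      //.....
    }
  }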
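If more than one sentence may contain the phrase, the same split-then-search idea can collect all of them. This is a sketch under assumptions, not the answerer's code: SentenceSplitter and FindSentences are hypothetical names, and a sentence is assumed to end at a period followed by whitespace, so abbreviations such as "e.g. this" would be split incorrectly. The lookbehind in the split pattern keeps each period attached to its sentence.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SentenceSplitter
{
    // Returns every sentence in `text` that contains `phrase` as whole words.
    public static List<string> FindSentences(string text, string phrase)
    {
        var matches = new List<string>();

        // Escape the phrase so it is matched literally, bounded by word boundaries.
        Regex phraseRegex = new Regex(@"\b" + Regex.Escape(phrase) + @"\b");

        // Split on whitespace that follows a period; the period itself is not consumed.
        foreach (string sentence in Regex.Split(text, @"(?<=\.)\s+"))
        {
            if (phraseRegex.IsMatch(sentence))
            {
                matches.Add(sentence.Trim());
            }
        }
        return matches;
    }
}
```

For the data string above and the phrase "I do", the returned list should hold the single entry "But I do want this sentence."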