Get a sentence delimited by "." that contains a specific word

Time: 2019-01-17 18:11:48

Tags: c# .net regex .net-core

Suppose I have the following text:

  

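When I was growing up, we lived in a little house with a full basement. Mom made the basement cozy with a rug covering the concrete floor and a couch and chair that we could play on. , and that was where we kept most of our toys and the things we treasured.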

     

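We went up and down those wooden stairs many times, and after a while they began to look pretty scuffed and scruffy. Mom decided she was going to paint them. That was in the days before quick-drying paints came into use, and it would take a full day for the paint to dry.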

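I need a regex that splits the text on "." and keeps the sentence containing two specific words (for example -> the basement). The result would be: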

  

"Mom made the basement cozy with a rug covering the concrete floor and a couch and chair that we could play on."

2 answers:

Answer 0 (score: 1)

You can use this regex:

[A-Z][^.]*the basement[^.]*\.

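Explanation:

[A-Z] - the match starts with a capital letter, because a sentence starts with a capital letter.

[^.]* - followed by zero or more characters other than a literal dot.

the basement - followed by the literal text "the basement".

[^.]* - followed again by zero or more characters other than a literal dot.

\. - and finally ends with a literal dot.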
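As a minimal sketch of how this pattern could be applied from C#, the example below runs it with Regex.Matches. The class name BasementRegexDemo, the method name FindSentences, and the shortened sample text are assumptions made for illustration; they are not part of the answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BasementRegexDemo
{
    // Hypothetical helper: returns every match of the pattern from the answer.
    // A match starts at a capital letter, contains "the basement", and ends at
    // the next period. [^.] also matches line breaks, so a match may span lines.
    public static string[] FindSentences(string text)
    {
        return Regex.Matches(text, @"[A-Z][^.]*the basement[^.]*\.")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        // Shortened sample text, assumed for this sketch.
        var text = "We had a full basement. Mom made the basement cozy. We liked it.";

        foreach (var sentence in FindSentences(text))
            Console.WriteLine(sentence);
    }
}
```

Note that the pattern is case-sensitive and requires the two words to be adjacent, so a sentence that starts with "The basement" or has another word between "the" and "basement" would not be found.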


Answer 1 (score: 0)

This is a more robust solution. It handles periods (i.e. full stops), but not other "dots" (for example in "8:00 a.m." or "e.g.").

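using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class Program
{
    static void Main()
    {
        var s =

    @"When I was growing up, we lived in a little house with a full basement. Mom made the basement cozy with a rug covering the concrete floor and a couch and chair that we could play on. , and that was where we kept most of our toys and the things we treasured.

    We went up and down those wooden stairs many times, and after a while they began to look pretty scuffed and scruffy. Mom decided she was going to paint them. That was in the days before quick-drying paints came into use, and it would take a full day for the paint to dry.";

        // Print each sentence that contains every requested word.
        foreach (var sentence in Foo(s, "the", "basement"))
            Console.WriteLine(sentence.Trim());
    }

    // Splits s into sentences at each '.' and yields those matching every word (case-insensitive).
    static IEnumerable<string> Foo(string s, params string[] words)
    {
        // Each word is used as a regex pattern as-is; ToList builds the regexes only once.
        var regexes = words.Select(w => new Regex(w, RegexOptions.IgnoreCase)).ToList();

        var xs = new Stack<List<char>>();
        xs.Push(new List<char>());

        foreach (var c in s)
        {
            xs.Peek().Add(c);

            // A period closes the current sentence; start collecting the next one.
            if (c == '.')
                xs.Push(new List<char>());
        }

        // Enumerating a Stack is last-in-first-out, so reverse it to restore document order.
        var candidates = xs.Reverse().Select(x => new string(x.ToArray()));

        foreach (var candidate in candidates)
            if (regexes.All(x => x.IsMatch(candidate)))
                yield return candidate;
    }
}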
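Foo passes each word to Regex as a pattern without word boundaries, so "the" would also match inside words such as "they" or "another". The sketch below is one possible variant of the same split-and-filter idea, not the answer's own method: it escapes each word with Regex.Escape and wraps it in \b so that only whole words count. The names WholeWordFilter and SentencesWithAllWords and the sample text are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WholeWordFilter
{
    // Hypothetical helper: returns the sentences of text that contain every
    // entry of words as a whole word, ignoring case.
    public static List<string> SentencesWithAllWords(string text, params string[] words)
    {
        // Regex.Escape treats each word literally; \b anchors it at word boundaries.
        var patterns = words
            .Select(w => new Regex(@"\b" + Regex.Escape(w) + @"\b", RegexOptions.IgnoreCase))
            .ToList();

        // Split right after every period (zero-width lookbehind), so each
        // piece keeps its closing period.
        return Regex.Split(text, @"(?<=\.)")
                    .Select(part => part.Trim())
                    .Where(part => part.Length > 0)
                    .Where(part => patterns.All(p => p.IsMatch(part)))
                    .ToList();
    }

    public static void Main()
    {
        // Sample text assumed for this sketch.
        var text = "They fixed another basement. Mom made the basement cozy. The end";

        foreach (var sentence in SentencesWithAllWords(text, "the", "basement"))
            Console.WriteLine(sentence);
    }
}
```

Like the answer's code, this still splits on every period, so abbreviations such as "a.m." would be cut into separate pieces.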