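C# Regex to extract all words that start with a digit or are contained in a specific list?

Time: 2011-06-21 19:08:50

Tags: regex c#-4.0

Given a set of word combinations, is it possible to use a regex to extract the matching ones from a given string and return them as a set?

For example, given a list of cars: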

mazda 3
mazda 4
volvo s40

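With the following text:
"I wanted to buy a mazda 3 however I found the volvo s40 to be a much better deal with the 90gv tires."

I would like to get two different lists back from it: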

{mazda 3, volvo s40, 90gv} 
{I, wanted, to, buy, a, however, I, found, the, to, be, a, much, better, deal, with, the, tires}

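1 answer:

Answer 0: (score: 1)

This code uses a MatchEvaluator that runs on each match (a car model) and returns "", so the model is replaced with an empty string. cars is the list of car models that were found. words is the list of remaining words. Note that the pattern does not consult your list of cars: it matches any word followed by a token that contains a digit. I'll leave it to you to handle punctuation properly for your needs.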

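// Requires: using System; using System.Collections.Generic;
//           using System.Text.RegularExpressions;
List<string> cars = new List<string>();
string input =
   "I wanted to buy a mazda 3 however I found the volvo s40 to be a much better deal.";

// Match a word, whitespace, then a token that contains a digit
// (e.g. "mazda 3", "volvo s40"). The evaluator records each match
// and replaces it with an empty string.
string line = Regex.Replace(
   input, @"\b\w+\s+(?=\S*?\d)(?:\w+)",
   m =>
      {
         cars.Add(m.Value);
         return "";
      });
// Splitting on a single space leaves an empty entry where each model was removed.
string[] words = line.Split(' ');

// Output the lists:
Console.Write("Cars:");
foreach (string car in cars)
   Console.Write(car + "    ");
Console.WriteLine();
Console.Write("words: ");
foreach (string word in words)
   Console.Write(word + " ");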

This produces the following output:

Cars:mazda 3    volvo s40
words: I wanted to buy a  however I found the  to be a much better deal.
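
The answer's pattern does not use the list of cars and does not pick up a standalone token such as 90gv, which the question also wants. As a rough sketch of one way to cover both, the pattern could be built from the list itself, with a second alternative for any word that starts with a digit. The names CarExtractor and Extract are made up for illustration, and the sketch assumes the list is non-empty and that stripping all punctuation from the remaining words is acceptable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class CarExtractor
{
    // Fills 'matches' with known models and digit-leading tokens found in 'input',
    // and 'words' with every other word (punctuation is dropped).
    // Assumes 'knownModels' contains at least one entry.
    public static void Extract(
        IEnumerable<string> knownModels, string input,
        out List<string> matches, out List<string> words)
    {
        // Escape each part of a model name and allow any whitespace run between
        // parts. Longer names go first so they win over a shorter prefix.
        string models = string.Join("|", knownModels
            .OrderByDescending(m => m.Length)
            .Select(m => string.Join(@"\s+",
                m.Split(' ').Select(part => Regex.Escape(part)))));

        // Either a known model, or any whole word that starts with a digit.
        string pattern = @"\b(?:" + models + @")\b|\b\d\w*\b";

        var found = new List<string>();
        string rest = Regex.Replace(
            input, pattern,
            m =>
            {
                found.Add(m.Value);
                return " "; // keep neighbouring words separated
            },
            RegexOptions.IgnoreCase);

        matches = found;
        // Collect the remaining runs of word characters, which skips punctuation.
        words = Regex.Matches(rest, @"\w+")
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }
}
```

Called with the three models from the question and the full sentence (including "with the 90gv tires."), this should give mazda 3, volvo s40 and 90gv in the first list and the eighteen remaining words, from I through tires, in the second. Replacing each match with a space rather than an empty string avoids gluing neighbouring words together, and collecting \w+ runs avoids the empty entries that Split(' ') leaves behind.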