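Regular expression or LINQ to match a pattern in C#

Time: 2016-05-27 03:45:19

Tags: c# regex linq

I am writing a small program to extract TV series names together with their seasons. I have the following list of text strings.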

Entourage Season 3 Part 2 | 5 NIGHT HIRE |
Entourage Season 4 | 5 NIGHT HIRE |
Entourage Season 5 | 5 NIGHT HIRE |
Entourage Season 8 | 5 NIGHT HIRE |
The Walking Dead: Season Four | 5 NIGHT HIRE |
The Walking Dead: Season Three| 5 NIGHT HIRE |
The Walking Dead: Season Two | 5 NIGHT HIRE |
The Walking Dead: Season One | 5 NIGHT HIRE |
Game Of Thrones: Season One| 5 NIGHT HIRE |
Game Of Thrones: Season Two | 5 NIGHT HIRE |

I need to get each TV series name with its seasons listed under it.

Entourage

Season 3
Season 4
Season 5
Season 8

The Walking Dead

Season One
Season Two
Season Three
Season Four

Game Of Thrones

Season One
Season Two

I have a regular expression to match the season, but it does not work.

  

Match match = Regex.Match(this.Content, @"(?:^|(?:[.!?]\s))(?Season \w+)");

I need help doing this in C#, either with a regular expression or with a LINQ query.

1 Answer:

Answer 0: (score: 2)

This LINQPad script:

var text = @"
   Entourage Season 3 Part 2 | 5 NIGHT HIRE |
   Entourage Season 4 | 5 NIGHT HIRE |
   Entourage Season 5 | 5 NIGHT HIRE |
   Entourage Season 8 | 5 NIGHT HIRE |
   The Walking Dead: Season Four | 5 NIGHT HIRE |
   The Walking Dead: Season Three| 5 NIGHT HIRE |
   The Walking Dead: Season Two | 5 NIGHT HIRE |
   The Walking Dead: Season One | 5 NIGHT HIRE |
   Game Of Thrones: Season One| 5 NIGHT HIRE |
   Game Of Thrones: Season Two | 5 NIGHT HIRE |";
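// Split into lines, then match each line: a lazy title followed by "Season <word or number>"
var matches = text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None)
                  .Select(l => Regex.Match(l, @"^\s*(?<title>.+?)(?<season>Season \w+)"));
// Keep only successful matches; trim the trailing colon and spaces from the title
var data = matches.Where(m => m.Success)
                  .Select(m => new { Title = m.Groups["title"].Value.Trim(':', ' '), Season = m.Groups["season"].Value });

// Group the seasons under each title and display them (Dump() is LINQPad-specific)
data.GroupBy(d => d.Title)
    .Select(g => new { Title = g.Key, Seasons = g.Select(x => x.Season) }).Dump();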

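Returns the seasons grouped under each title (the original post showed the LINQPad output as a screenshot, which is not preserved here).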
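
Since `Dump()` only exists inside LINQPad, here is a minimal sketch of the same approach as a plain console program. The class name `SeasonGrouper` and the method `GroupSeasons` are hypothetical names chosen for illustration; the regular expression is the one from the script above. It lazily captures the title up to the first occurrence of "Season", then captures "Season" plus the word or number that follows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class SeasonGrouper
{
    // Lazy title, followed by "Season" and one word or number.
    private static readonly Regex LinePattern =
        new Regex(@"^\s*(?<title>.+?)(?<season>Season \w+)");

    // Maps each title to its seasons, keeping the seasons in input order.
    // Lines that do not contain "Season <word>" are skipped.
    public static Dictionary<string, List<string>> GroupSeasons(IEnumerable<string> lines)
    {
        return lines
            .Select(line => LinePattern.Match(line))
            .Where(m => m.Success)
            .GroupBy(
                m => m.Groups["title"].Value.Trim(':', ' '), // drop trailing ": "
                m => m.Groups["season"].Value)
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    public static void Main()
    {
        var lines = new[]
        {
            "Entourage Season 3 Part 2 | 5 NIGHT HIRE |",
            "Entourage Season 4 | 5 NIGHT HIRE |",
            "The Walking Dead: Season Four | 5 NIGHT HIRE |",
            "The Walking Dead: Season Three| 5 NIGHT HIRE |",
            "Game Of Thrones: Season One| 5 NIGHT HIRE |",
        };

        // Print each title followed by its indented seasons.
        foreach (var entry in GroupSeasons(lines))
        {
            Console.WriteLine(entry.Key);
            foreach (var season in entry.Value)
            {
                Console.WriteLine("  " + season);
            }
        }
    }
}
```

Seasons stay in the order they appear in the input, so "Season Four" is listed before "Season Three" here. If the seasons should be sorted as in the question, the word forms ("One", "Two", ...) would first need to be mapped to numbers.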