I am writing a small program to extract TV series names together with their seasons. I have the following list of text strings.
Entourage Season 3 Part 2 | 5 NIGHT HIRE |
Entourage Season 4 | 5 NIGHT HIRE |
Entourage Season 5 | 5 NIGHT HIRE |
Entourage Season 8 | 5 NIGHT HIRE |
The Walking Dead: Season Four | 5 NIGHT HIRE |
The Walking Dead: Season Three| 5 NIGHT HIRE |
The Walking Dead: Season Two | 5 NIGHT HIRE |
The Walking Dead: Season One | 5 NIGHT HIRE |
Game Of Thrones: Season One| 5 NIGHT HIRE |
Game Of Thrones: Season Two | 5 NIGHT HIRE |
I need to get each TV series name along with its seasons, like this:
Entourage
Season 3 Season 4 Season 5 Season 8
The Walking Dead
Season One Season Two Season Three Season Four
Game Of Thrones
Season One Season Two
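I have a regular expression to match the season, but it does not work.
Match match = Regex.Match(this.Content, @"(?:^|(?:[.!?]\s))(?<season>Season \w+)");
I need help doing this in C#, either with a regular expression or with a LINQ query.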
Answer 0 (score: 2)
This LINQPad script:
var text = @"
Entourage Season 3 Part 2 | 5 NIGHT HIRE |
Entourage Season 4 | 5 NIGHT HIRE |
Entourage Season 5 | 5 NIGHT HIRE |
Entourage Season 8 | 5 NIGHT HIRE |
The Walking Dead: Season Four | 5 NIGHT HIRE |
The Walking Dead: Season Three| 5 NIGHT HIRE |
The Walking Dead: Season Two | 5 NIGHT HIRE |
The Walking Dead: Season One | 5 NIGHT HIRE |
Game Of Thrones: Season One| 5 NIGHT HIRE |
Game Of Thrones: Season Two | 5 NIGHT HIRE |";
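// Split the text into lines, then match the title and the season on each line.
var matches = text.Split(new[] { "\r\n", "\n" }, StringSplitOptions.None)
    .Select(l => Regex.Match(l, @"^\s*(?<title>.+?)(?<season>Season \w+)"));
// Keep only successful matches; trim the trailing colon and spaces from the title.
var data = matches.Where(m => m.Success)
    .Select(m => new { Title = m.Groups["title"].Value.Trim(':', ' '), Season = m.Groups["season"].Value });
// Group by title and list the seasons for each one (Dump() is LINQPad's output helper).
data.GroupBy(d => d.Title)
    .Select(g => new { Title = g.Key, Seasons = g.Select(x => x.Season) }).Dump();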
returns: