Question

I have an SRT file:

1
00:00:07,000 --> 00:00:09,000
Time to amaze the world..
create by Hazy

2
00:00:11,000 --> 00:00:12,200
show them

3
00:00:15,000 --> 00:00:16,500
an impossible feat

I want to get the text content:

Time to amaze the world..
create by Hazy,
show them,
an impossible feat

My regular expression:

string[] souceSrt = Regex.Split(inputText.Text, @"\n*\d+\n\d\d:\d\d:\d\d,\d\d\d --> \d\d:\d\d:\d\d,\d\d\d\n");

But it doesn't work. What should I do?

Answer 1

Your approach isn't bad. I think your pattern fails because of the line endings (probably CRLF rather than a bare LF):

(?:\r?\n)*\d+\r?\n\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\r?\n

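Note that your split-based approach is safer than searching for every line that contains a letter. Imagine a character asking "How old are you?" and the reply being a line with no letters at all, such as a number: a letter-based filter would silently drop that line.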
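As a minimal sketch of how the pattern above could be used with Regex.Split: the class name SrtText and the method name ExtractCues are hypothetical, chosen only for illustration, and the code assumes a well-formed file in which every cue has an index line followed by a timing line.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original question or answer.
public static class SrtText
{
    // Cue header: optional blank lines, the index line, then the timing line.
    // Each line may end in LF or CRLF.
    private static readonly Regex CueHeader = new Regex(
        @"(?:\r?\n)*\d+\r?\n\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\r?\n");

    // Returns one string per cue; multi-line cues keep their internal line breaks.
    public static string[] ExtractCues(string srt)
    {
        return CueHeader.Split(srt)
            .Select(part => part.Trim())       // drop leftover line breaks around each cue
            .Where(part => part.Length > 0)    // Split yields an empty first element
            .ToArray();
    }
}
```

Something like string.Join(",\n", SrtText.ExtractCues(inputText.Text)) would then produce roughly the comma-separated output the question asks for.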

Answer 2

Using RegexHero:

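// Match every line that contains at least one ASCII letter.
string strRegex = @"^.*[a-zA-Z].*$";
Regex myRegex = new Regex(strRegex, RegexOptions.Multiline);

foreach (Match myMatch in myRegex.Matches(strTargetString))
{
    // Every Match returned by Matches() is successful, so no Success check is needed.
    // With CRLF input, $ matches before \n, so strip the trailing \r.
    string line = myMatch.Value.TrimEnd('\r'); // grab line
}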

Unless I'm missing something, the lines you don't want will never contain alphabetic characters.
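A self-contained sketch of this line filter is shown here for illustration only. The names SrtLineFilter and LinesWithLetters are hypothetical. It assumes that index and timing lines never contain ASCII letters, and it will also drop any subtitle line that has none, such as a purely numeric reply.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the letter-based filter.
public static class SrtLineFilter
{
    // A line qualifies if it contains at least one ASCII letter.
    private static readonly Regex LetterLine =
        new Regex(@"^.*[a-zA-Z].*$", RegexOptions.Multiline);

    public static List<string> LinesWithLetters(string srt)
    {
        var lines = new List<string>();
        foreach (Match m in LetterLine.Matches(srt))
        {
            // With CRLF input the match keeps a trailing \r; remove it.
            lines.Add(m.Value.TrimEnd('\r'));
        }
        return lines;
    }
}
```

Unlike the split-based approach, this returns individual lines rather than whole cues, so a two-line cue comes back as two entries.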

C# - Regex for a subtitle file (.srt) to get the text content?
