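I need to parse some SRT files, and I'm looking for a regular expression (for Java) that matches the time ranges. What I want is to read the file line by line and skip any line that is a number or a time range.
For example, given: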
1
00:00:01,357 --> 00:00:03,323
You took this case
without running it by me.
2
00:00:03,359 --> 00:00:04,825
- Jessica--
- That's enough. Dump it.
I want to match
the line 00:00:03,359 --> 00:00:04,825
and
2
Thanks in advance!
Answer 0 (score: 2)
To match the number:
^\d+$
To match the time:
^\d{2}:\d{2}:\d{2},\d{3}.*\d{2}:\d{2}:\d{2},\d{3}$
For both cases:
(^\d+$)|(^\d{2}:\d{2}:\d{2},\d{3}.*\d{2}:\d{2}:\d{2},\d{3}$)
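As I see in your format, the number always comes right before the time line, so the time pattern alone is enough: get the index of each line that matches the time and remove the lines at index - 1 and index.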
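As a minimal sketch of the line-by-line skipping the question asks for, assuming the file has already been read into a list of strings: the class name `SrtTextFilter` and the method `textLines` are made up for illustration, and the pattern is the combined one from this answer. Note that a subtitle whose text is only digits would also be dropped by this approach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class SrtTextFilter {
    // Matches a line that is only digits (the cue number)
    // or a line that starts and ends with an hh:mm:ss,SSS timestamp.
    private static final Pattern SKIP = Pattern.compile(
            "^\\d+$|^\\d{2}:\\d{2}:\\d{2},\\d{3}.*\\d{2}:\\d{2}:\\d{2},\\d{3}$");

    // Returns the lines that are neither cue numbers nor time ranges.
    static List<String> textLines(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            // trim() tolerates stray whitespace around the number or time range
            if (!SKIP.matcher(line.trim()).matches()) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "1",
                "00:00:01,357 --> 00:00:03,323",
                "You took this case",
                "without running it by me.",
                "2",
                "00:00:03,359 --> 00:00:04,825",
                "- Jessica--",
                "- That's enough. Dump it.");
        // Prints only the four subtitle text lines.
        for (String text : textLines(sample)) {
            System.out.println(text);
        }
    }
}
```

Filtering while reading avoids the index bookkeeping described above; removing by index would work too, but it has to be done back to front so earlier removals do not shift later indexes.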
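To make the time regex clearer:
^\d{2}:\d{2}:\d{2},\d{3}.*\d{2}:\d{2}:\d{2},\d{3}$
Start
^
from the beginning of the text
\d{2} or [0-9]{2}
exactly two digits
: or :{1} or [:]{1}
exactly one colon
, or ,{1} or [,]{1}
exactly one comma
\d{3} or [0-9]{3}
exactly three digits
.*
anything at all, including nothing
After that, the time format is checked again
$
end of the text
This means the text must end exactly where the pattern ends.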
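The breakdown above can be tried out with a short sketch. It assumes the middle of the pattern is `.*` (anything between the two timestamps) and compares that with a stricter variant that requires the literal ` --> ` separator. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper used only to illustrate the pattern pieces.
class SrtTimeLine {
    // One hh:mm:ss,SSS timestamp: two digits, colon, two digits, colon,
    // two digits, comma, three digits.
    private static final String TS = "\\d{2}:\\d{2}:\\d{2},\\d{3}";

    // Loose form: anything (or nothing) may sit between the two timestamps.
    private static final Pattern LOOSE = Pattern.compile("^" + TS + ".*" + TS + "$");

    // Strict form: the separator must be exactly " --> ".
    private static final Pattern STRICT = Pattern.compile("^" + TS + " --> " + TS + "$");

    static boolean isTimeRangeLoose(String line) {
        return LOOSE.matcher(line).matches();
    }

    static boolean isTimeRangeStrict(String line) {
        return STRICT.matcher(line).matches();
    }

    public static void main(String[] args) {
        String ok = "00:00:01,357 --> 00:00:03,323";
        String odd = "00:00:01,357 foo 00:00:03,323";
        System.out.println(isTimeRangeLoose(ok) + " " + isTimeRangeStrict(ok));
        System.out.println(isTimeRangeLoose(odd) + " " + isTimeRangeStrict(odd));
    }
}
```

The loose form accepts the second sample line as well, so the strict form is the safer choice when the files are known to use the standard ` --> ` separator.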
Answer 1 (score: 0)
For lines such as 00:00:03,359 --> 00:00:04,825 or 00:00:01,357 --> 00:00:03,323, the code below may be useful.
String strLine = "00:00:01,357 --> 00:00:03,323";
System.out.println(strLine.matches("\\d\\d:\\d\\d:\\d\\d,\\d\\d\\d --> \\d\\d:\\d\\d:\\d\\d,\\d\\d\\d"));
Answer 2 (score: 0)
You can do this to get the end time of each subtitle:
\d{2}:\d{2}:\d{2},\d{3}$
Explanation
\d{2}: # a two-digit number followed by a ":" character
\d{2}: # same as above
\d{2}, # a two-digit number followed by a "," character
\d{3} # a three-digit number
$ # matches only at the end of the line
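A short sketch of how this pattern could be used in Java to pull out the end time. Because the pattern is not anchored at the start, it needs `Matcher.find()` rather than `String.matches()`, which would require the whole line to match. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, shown only to illustrate find() with a trailing anchor.
class SrtEndTime {
    // A timestamp that sits at the very end of the line.
    private static final Pattern END_TIME =
            Pattern.compile("\\d{2}:\\d{2}:\\d{2},\\d{3}$");

    // Returns the trailing timestamp of the line, or null if there is none.
    static String endTime(String line) {
        Matcher m = END_TIME.matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(endTime("00:00:03,359 --> 00:00:04,825"));
        System.out.println(endTime("- Jessica--"));
    }
}
```

A non-null result also tells you the line is a time line, so the same check can double as the skip test while reading the file.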