I'm getting different results on Regex101 and on my server. Can anyone tell me why?
Regular expression:
[0-9]+(?:\s){0,10}(?:\r?\n?)([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3}) --> ([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3})(?:\s){0,10}(?:\r\n|\n|\r){1}(.*\r?\n?.*\r?\n?.*)(?:\n|\r)(?:\n|\r)
On Regex101 I use the 'gm' modifiers.
In PHP I use:
preg_match_all($this->Pattern, $txt, $matches, PREG_SET_ORDER);
Regex101 result (look at MATCH 4, which is correct: group 9 captures only the empty line, without any text from the following time line):
MATCH 1
1. [2-4] `00`
2. [5-7] `00`
3. [8-10] `01`
4. [11-14] `163`
5. [19-21] `00`
6. [22-24] `00`
7. [25-27] `05`
8. [28-31] `150`
9. [32-39] `aaaaaaa`
MATCH 2
1. [43-45] `00`
2. [46-48] `00`
3. [49-51] `05`
4. [52-55] `556`
5. [60-62] `00`
6. [63-65] `00`
7. [66-68] `05`
8. [69-72] `921`
9. [73-82] `bbbb
bbbb`
MATCH 3
1. [86-88] `00`
2. [89-91] `00`
3. [92-94] `07`
4. [95-98] `753`
5. [103-105] `00`
6. [106-108] `00`
7. [109-111] `08`
8. [112-115] `168`
9. [116-130] `cccccccccccccc`
MATCH 4
1. [134-136] `00`
2. [137-139] `00`
3. [140-142] `22`
4. [143-146] `854`
5. [151-153] `00`
6. [154-156] `00`
7. [157-159] `28`
8. [160-163] `721`
9. [164-164] ``
MATCH 5
1. [168-170] `00`
2. [171-173] `00`
3. [174-176] `23`
4. [177-180] `336`
5. [185-187] `00`
6. [188-190] `00`
7. [191-193] `31`
8. [194-197] `558`
9. [198-228] `dddddddddddddd
dddddddddddddd
`
MATCH 6
1. [232-234] `00`
2. [235-237] `00`
3. [238-240] `34`
4. [241-244] `228`
5. [249-251] `00`
6. [252-254] `00`
7. [255-257] `36`
8. [258-261] `296`
9. [262-276] `eeeeeeeeeeeeee`
MATCH 7
1. [280-282] `00`
2. [283-285] `00`
3. [286-288] `35`
4. [289-292] `165`
5. [297-299] `00`
6. [300-302] `00`
7. [303-305] `39`
8. [306-309] `785`
9. [310-320] `fffff
ffff`
Result on my server (look at "[3] => Array": group 9 swallows the next block's number and its whole time line):
(
[0] => Array
(
[0] => 1
00:00:01,163 --> 00:00:05,150
aaaaaaa
2
[1] => 00
[2] => 00
[3] => 01
[4] => 163
[5] => 00
[6] => 00
[7] => 05
[8] => 150
[9] => aaaaaaa
2
)
[1] => Array
(
[0] => 00:00:05,556 --> 00:00:05,921
bbbb
bbbb
[1] => 0
[2] => 00
[3] => 05
[4] => 556
[5] => 00
[6] => 00
[7] => 05
[8] => 921
[9] => bbbb
bbbb
)
[2] => Array
(
[0] => 3
00:00:07,753 --> 00:00:08,168
cccccccccccccc
4
[1] => 00
[2] => 00
[3] => 07
[4] => 753
[5] => 00
[6] => 00
[7] => 08
[8] => 168
[9] => cccccccccccccc
4
)
[3] => Array
(
[0] => 00:00:22,854 --> 00:00:28,721
5
00:00:23,336 --> 00:00:31,558
dddddddddddddd
[1] => 0
[2] => 00
[3] => 22
[4] => 854
[5] => 00
[6] => 00
[7] => 28
[8] => 721
[9] => 5
00:00:23,336 --> 00:00:31,558
dddddddddddddd
)
[4] => Array
(
[0] => 6
00:00:34,228 --> 00:00:36,296
eeeeeeeeeeeeee
7
[1] => 00
[2] => 00
[3] => 34
[4] => 228
[5] => 00
[6] => 00
[7] => 36
[8] => 296
[9] => eeeeeeeeeeeeee
7
)
[5] => Array
(
[0] => 00:00:35,165 --> 00:00:39,785
fffff
ffff
[1] => 0
[2] => 00
[3] => 35
[4] => 165
[5] => 00
[6] => 00
[7] => 39
[8] => 785
[9] => fffff
ffff
)
)
Test string:
1
00:00:01,163 --> 00:00:05,150
aaaaaaa
2
00:00:05,556 --> 00:00:05,921
bbbb
bbbb
3
00:00:07,753 --> 00:00:08,168
cccccccccccccc
4
00:00:22,854 --> 00:00:28,721
5
00:00:23,336 --> 00:00:31,558
dddddddddddddd
dddddddddddddd
6
00:00:34,228 --> 00:00:36,296
eeeeeeeeeeeeee
7
00:00:35,165 --> 00:00:39,785
fffff
ffff
Answer 0 (score: 1)
This happens because the line break styles differ: regex101 uses \n, while your input uses \r\n.
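One way to confirm which line break style a given input really uses is to count each kind. This is only a minimal sketch: the helper name countLineBreaks and the sample string are made up for illustration and are not part of the original code.

```php
<?php
// Hypothetical helper (not from the question): counts each line-break style.
function countLineBreaks(string $text): array
{
    $crlf = substr_count($text, "\r\n");
    return [
        'CRLF' => $crlf,
        'LF'   => substr_count($text, "\n") - $crlf, // bare \n only
        'CR'   => substr_count($text, "\r") - $crlf, // bare \r only
    ];
}

// Made-up sample resembling one subtitle block saved with Windows line endings.
$sample = "1\r\n00:00:01,163 --> 00:00:05,150\r\naaaaaaa\r\n\r\n";
print_r(countLineBreaks($sample));
```

If the CRLF count is non-zero on the server but zero for the text pasted into regex101, the two tests are not running against the same input.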
You can easily fix this by using \R, a single pattern that matches any kind of line break.
Note that I have not optimized your pattern; I am only showing how to fix the problem stated in the question:
'~[0-9]+\s{0,10}\R?([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3}) --> ([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3})\s{0,10}\R(.*\R?.*\R?.*)\R{2}~'
See the PHP demo.
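As a rough local check, the pattern above can be run against the same text with \n and with \r\n line breaks. This is a sketch under assumptions: the helper cueTexts and the two-block sample are invented for illustration. It also assumes PCRE's default newline setting, under which `.` can still match a lone \r, so group 9 is trimmed and its inner line breaks are unified before comparing.

```php
<?php
// The pattern from the answer, unchanged.
$pattern = '~[0-9]+\s{0,10}\R?([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3}) --> ([0-9]{1,2}):([0-9]{1,2}):([0-9]{1,2}),([0-9]{1,3})\s{0,10}\R(.*\R?.*\R?.*)\R{2}~';

// Hypothetical helper: returns the subtitle text (group 9) of every match.
function cueTexts(string $pattern, string $txt): array
{
    preg_match_all($pattern, $txt, $matches, PREG_SET_ORDER);
    $texts = [];
    foreach ($matches as $m) {
        // Drop a stray trailing "\r" and unify line breaks inside the text.
        $texts[] = preg_replace('~\R~', "\n", trim($m[9]));
    }
    return $texts;
}

// Made-up sample: two blocks, each followed by a blank line.
$lf = "1\n00:00:01,163 --> 00:00:05,150\naaaaaaa\n\n"
    . "2\n00:00:05,556 --> 00:00:05,921\nbbbb\nbbbb\n\n";
$crlf = str_replace("\n", "\r\n", $lf);

// Compare the captured texts for both line-break styles.
var_dump(cueTexts($pattern, $lf) === cueTexts($pattern, $crlf));
```

If the comparison reports a difference, dumping both arrays shows which block diverges. Normalizing the input once up front, for example with preg_replace('~\R~', "\n", $txt), is an alternative that keeps the original pattern usable as written.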