I have a list of tests from which I'm trying to capture data using a regular expression.
Here is a sample of the text format:
(1) this is a sample string /(2) something strange /(3) another bit of text /(4) the last one/ something!/
I have a regex that currently captures this correctly, but I'm having trouble getting it to work in the exceptional cases.
Here is my regex:
/\(?\d\d?\)([^\)]+)(\/|\z)/
Unfortunately, some of the data contains parentheses, like this:
(1) this is a sample string (1998-1999) /(2) something strange (blah) /(3) another bit of text /(4) the last one/ something!/
The substrings '(1998-1999)' and '(blah)' make it fail!
Anyone care to have a crack at this one? Thanks :D
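To make the failure concrete, here is a minimal sketch that runs the pattern above against both sample strings. Python is an assumption here (the question names no language), and because Python's re module spells the end-of-string anchor \Z, the \z in the pattern is swapped for \Z. The helper name captured_texts is made up for illustration.

```python
import re

# The pattern from the question. Assumption: \z is replaced by \Z,
# the end-of-string anchor in Python's re module.
ORIGINAL = re.compile(r"\(?\d\d?\)([^\)]+)(\/|\Z)")

plain = ("(1) this is a sample string /(2) something strange "
         "/(3) another bit of text /(4) the last one/ something!/")
with_parens = ("(1) this is a sample string (1998-1999) /(2) something strange (blah) "
               "/(3) another bit of text /(4) the last one/ something!/")


def captured_texts(s):
    """Return the first capture group of every match, stripped of whitespace."""
    return [m.group(1).strip() for m in ORIGINAL.finditer(s)]


print(captured_texts(plain))        # one entry per numbered item
print(captured_texts(with_parens))  # the (1) and (2) items are not captured
```

The class [^\)]+ stops at the first closing parenthesis it meets, so an item whose text contains ')' has no slash inside the part it is allowed to consume, and the match for that item fails.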
Answer 0 (score: 1)
I would try this:
\((\d+)\)\s+(.*?)(?=/(?:\(\d+\)|\z))
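This rather scary-looking regex does the following:
- looks for one or more digits in parentheses and captures them;
- requires at least one whitespace character after the closing parenthesis, which is discarded (not captured);
- uses a non-greedy wildcard (.*?) to grab the rest of the text, rather than the more usual [^/]+, because the text itself can contain a slash;
- uses a positive lookahead ((?=...)) to say that the match must be followed by a forward slash and then one of:
  - one or more digits in parentheses; or
  - the end of the string (\z).
To give an example in PHP (you didn't specify a language):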
$s = '(1) this is a sample string (1998-1999) /(2) something strange (blah) /(3) another bit of text /(4) the last one/ something!/';
preg_match_all('!\((\d+)\)\s+(.*?)(?=/(?:\(\d+\)|\z))!', $s, $matches);
print_r($matches);
Output:
Array
(
    [0] => Array
        (
            [0] => (1) this is a sample string (1998-1999)
            [1] => (2) something strange (blah)
            [2] => (3) another bit of text
            [3] => (4) the last one/ something!
        )
    [1] => Array
        (
            [0] => 1
            [1] => 2
            [2] => 3
            [3] => 4
        )
    [2] => Array
        (
            [0] => this is a sample string (1998-1999)
            [1] => something strange (blah)
            [2] => another bit of text
            [3] => the last one/ something!
        )
)
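The same pattern should carry over to other regex flavours with little change. As a rough sketch in Python (an assumption, since no language was specified), the only edit is writing \z as \Z, which is how Python's re module spells the end-of-string anchor. The names ITEM and items are illustrative only.

```python
import re

# Same pattern as in the PHP example, with \z written as \Z for Python's re module.
ITEM = re.compile(r"\((\d+)\)\s+(.*?)(?=/(?:\(\d+\)|\Z))")

s = ("(1) this is a sample string (1998-1999) /(2) something strange (blah) "
     "/(3) another bit of text /(4) the last one/ something!/")

# findall returns one (number, text) tuple per match because the pattern
# has two capture groups; the text keeps a trailing space, so strip it.
items = [(int(num), text.strip()) for num, text in ITEM.findall(s)]

for num, text in items:
    print(num, text)
```

Because the lookahead only accepts a slash that is followed by "(digits)" or by the end of the string, the slash inside "the last one/ something!" does not end the fourth item.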
Some notes:
If the number in parentheses must be one or two digits, replace \d+ with \d\d?.
Answer 1 (score: 1)
Prepend / to the start of the string and append (0) to the end, then split the whole string on the pattern \/\(\d+\) and discard the empty first and last elements.
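A minimal sketch of this split-based approach, assuming Python and its re.split function (the answer itself names no language). The variable names are illustrative.

```python
import re

s = ("(1) this is a sample string (1998-1999) /(2) something strange (blah) "
     "/(3) another bit of text /(4) the last one/ something!/")

# Pad the string so that every item, including the first and the last,
# is surrounded by a "/(digits)" delimiter.
padded = "/" + s + "(0)"

# Split on a slash followed by a parenthesised number.
parts = re.split(r"/\(\d+\)", padded)

# The first and last parts are empty strings produced by the padding.
items = [p.strip() for p in parts[1:-1]]

print(items)
```

Note that the "(0)" suffix only forms a delimiter because the sample string already ends with a slash; input without that trailing slash would need "/(0)" appended instead.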
Answer 2 (score: 1)
As long as / cannot appear in the text...
\(?\d?\d[^/]+
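To show what the stated condition means in practice, here is a small sketch that applies this pattern to the sample from the question, assuming Python (no language is named in the thread). The sample does contain a slash inside the fourth item, so that item is expected to be cut short.

```python
import re

# Pattern from this answer; it has no capture groups, so each match
# includes the "(n)" prefix as well as the text.
SIMPLE = re.compile(r"\(?\d?\d[^/]+")

s = ("(1) this is a sample string (1998-1999) /(2) something strange (blah) "
     "/(3) another bit of text /(4) the last one/ something!/")

matches = [m.strip() for m in SIMPLE.findall(s)]

for m in matches:
    print(m)
```

Parentheses inside the text are handled, but the fourth match stops at "the last one" because [^/]+ cannot cross the slash that follows it.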