Split a sentence (HTML code) into words by spaces and angle brackets

Time: 2015-06-10 10:02:03

Tags: php regex string preg-split

For example, I want to split the following HTML code:

<a href="http://www.google.com">Google</a>

The output should be split on spaces and angle brackets, with the brackets kept as separate elements:

Array(
[0] => <
[1] => a
[2] => href="http://www.google.com"
[3] => >
[4] => Google
[5] => <
[6] => /a
[7] => >
)

2 answers:

Answer 0: (score: 2)

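I'm not sure what you are trying to achieve, but for your example you can use the PREG_SPLIT_DELIM_CAPTURE option, which includes the captured parts of the delimiter in the result. Here the angle brackets are captured and kept, the space is not captured and is dropped, and PREG_SPLIT_NO_EMPTY removes the empty pieces: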

$result = preg_split('/([<>])| /', $txt, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

Answer 1: (score: 0)

>>> '<a href="http://www.google.com">Google</a>'
    .match(/(<)(\w+)\s+(href=\"[\w\/:.]+\")\s*(>)(.*)?(<)(\/\w+)(>)/)

 ["<a href=\"http://www.google.com\">Google</a>"
  ,"<", "a", "href=\"http://www.google.com\"", ">", "Google", "<", "/a", ">"]
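
The match() pattern above is tied to a single <a href=...> element. As a rough sketch only, the split-and-keep-delimiters idea from Answer 0 can also be written in JavaScript, where split() includes captured groups in its result much like PREG_SPLIT_DELIM_CAPTURE. The helper name splitHtmlTokens is hypothetical and not part of either answer.

```javascript
// Hypothetical helper: split on spaces and angle brackets, keeping the brackets.
function splitHtmlTokens(text) {
  return text
    // The capture group makes split() emit "<" and ">" as elements;
    // a plain space is matched outside the group, so it is dropped.
    .split(/([<>])| /)
    // Remove the undefined entries left where a space matched (the group
    // did not participate) and the empty strings next to delimiters.
    .filter(Boolean);
}

const tokens = splitHtmlTokens('<a href="http://www.google.com">Google</a>');
console.log(tokens);
```

For the example input this gives the eight elements from the question. Note that this sketch, like the preg_split() call, also splits on spaces inside attribute values, so it is not a substitute for an HTML parser.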