Avoid splitting text in certain cases (JavaScript)

Asked: 2013-08-20 14:54:11

Tags: javascript split

I am trying to split an HTML string like this one:

<p class='class1'> Hello, this is my html </p>

I need to split the HTML on spaces while ignoring the spaces inside HTML tags. Currently I get this result:

["<p","class='class1'>","Hello,","this","is","my","html","</p>"]

But I need each tag to be treated as a single word, so that I get this result:

["<p class='class1'>","Hello,","this","is","my","html","</p>"]

How can I get this result?

Edit

On the JavaScript side, I am using a simple split:

var text = "<p class='class1'> Hello, this is my html </p>";
var splitText = text.split(' ');

In this case, splitText will be:

["<p","class='class1'>","Hello,","this","is","my","html","</p>"]

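I also tried a regular expression, for example /[<.*?>,\s]+/, but the tags lose their angle brackets and are still split on the space inside them: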

var text = "<p class='class1'> Hello, this is my html </p>";
var splitText = text.split(/[<.*?>,\s]+/);

// splitText -> ["", "p", "class='class1'", "Hello", "this", "is", "my", "html", "/p", ""]

Thanks in advance.

2 answers:

Answer 0: (score: 1)

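var a = $("<p class='class1'>Hello, this is my html</p>"); // requires jQuery
var b = a.html().split(' ');         // split the inner text on spaces
a.html('');                          // empty the element so only its tags remain
var c = a[0].outerHTML.split('><');  // ['<p class="class1"', '/p>']
b.unshift(c[0] + '>');               // prepend the opening tag
b.push('<' + c[1]);                  // append the closing tag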

b will be: ["<p class=\"class1\">", "Hello,", "this", "is", "my", "html", "</p>"]

Note: this code only works for a single, non-nested tag.
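
For nested markup, one possible alternative is to scan the string once and treat everything between "<" and ">" as a single token. The following is only a sketch under stated assumptions: the function name splitOutsideTags is hypothetical, it needs neither jQuery nor a DOM, and it assumes that attribute values never contain a ">" character.

```javascript
// Hypothetical helper, not part of the original answer.
// Splits on whitespace, except inside <...>, which is kept as one token.
function splitOutsideTags(html) {
  var tokens = [];
  var current = '';
  var inTag = false;
  for (var i = 0; i < html.length; i++) {
    var ch = html.charAt(i);
    if (ch === '<') {
      // A tag starts: flush any pending word first.
      if (current) tokens.push(current);
      current = ch;
      inTag = true;
    } else if (ch === '>' && inTag) {
      // The tag ends: emit it as one token.
      tokens.push(current + ch);
      current = '';
      inTag = false;
    } else if (/\s/.test(ch) && !inTag) {
      // Whitespace outside a tag separates words.
      if (current) tokens.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  if (current) tokens.push(current);
  return tokens;
}

var nestedTokens = splitOutsideTags("<div><p class='class1'> Hello, this is my html </p></div>");
// nestedTokens -> ["<div>", "<p class='class1'>", "Hello,", "this", "is", "my", "html", "</p>", "</div>"]
```

Because the scanner only tracks whether it is inside a tag, nesting depth does not matter; it does no HTML validation.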

Answer 1: (score: 1)

I got this working with a simple regular expression and the match method.

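var text = "<p class='class1'><p class='class2'>Hello world!</p></p>";
// Match either a whole tag (<...>) or a run of characters that are neither whitespace nor "<".
// Excluding "<" from the second alternative keeps "world!" from absorbing the "</p></p>" that follows it.
var splitText = text.match(/<[^>]+>|[^\s<]+/g);

// splitText ->
// ["<p class='class1'>","<p class='class2'>","Hello","world!","</p>","</p>"]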

Thanks to @MaveRick for the answer :)
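
The match-based approach can be wrapped in a small reusable function. This is a sketch rather than a definitive implementation: the name splitKeepingTags is hypothetical, and the pattern assumes that attribute values never contain ">". The text alternative excludes "<" so that a word immediately followed by a tag, as in world!</p>, is not merged with that tag.

```javascript
// Hypothetical helper name; not part of the original answer.
function splitKeepingTags(html) {
  // <[^>]+>  : a whole tag, from "<" up to the next ">"
  // [^\s<]+  : a run of non-whitespace characters that stops before any "<"
  // match() returns null when nothing matches, so fall back to an empty array.
  return html.match(/<[^>]+>|[^\s<]+/g) || [];
}

var nested = splitKeepingTags("<p class='class1'><p class='class2'>Hello world!</p></p>");
// nested -> ["<p class='class1'>", "<p class='class2'>", "Hello", "world!", "</p>", "</p>"]

var spaced = splitKeepingTags("<p class='class1'> Hello, this is my html </p>");
// spaced -> ["<p class='class1'>", "Hello,", "this", "is", "my", "html", "</p>"]
```

For input that may contain ">" inside attribute values, comments, or scripts, a real HTML parser is a safer choice than a regular expression.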