Splitting in JavaScript, but not on text inside tags

Time: 2014-05-22 09:37:18

Tags: javascript regex

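I am trying to split the string

    <div id = 'tostart'><button>todo </button>hometown todo </div>

using "to" as the keyword.

The problem is that I must not split inside the tags, only outside them, so that when I split I get a result like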

    arr = ["<div id = 'tostart'><button>","do","</button>home","wn ","do </div>"]

Is there a regular expression that can achieve this?

Thanks in advance.

3 answers:

Answer 0: (score: 1)

Use this:

    var str = "<div id = 'tostart'><button>todo </button>hometown todo </div>";
    var res = str.replace(/to/g, '|').replace(/(.*?)(<.*?)\|(.*?>)/g, '$1$2to$3');
    console.log(res.split('|'));

Output:

    ["<div id = 'tostart'><button>", "do </button>home", "wn ", "do </div>"]

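@musefan:

This was actually done as an improvisation.

First I replaced every "to" with |, then I selected every pipe inside <> and replaced it back with "to". Finally, I can split on the | characters left over from the first replacement.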

Regex: (.*?)(<.*?)\|(.*?>)

This selects every | character that sits between a < and a >.
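
The three steps can be followed one at a time by logging the intermediate strings. This is only a sketch of the same technique: the variable names piped, restored and parts are made up for illustration, and it assumes the input contains no literal | of its own.

```javascript
var str = "<div id = 'tostart'><button>todo </button>hometown todo </div>";

// Step 1: replace every "to" with a placeholder pipe, inside and outside tags.
var piped = str.replace(/to/g, '|');
console.log(piped);
// <div id = '|start'><but|n>|do </but|n>home|wn |do </div>

// Step 2: put "to" back wherever a pipe has a "<" before it and a ">" after it.
var restored = piped.replace(/(.*?)(<.*?)\|(.*?>)/g, '$1$2to$3');
console.log(restored);
// <div id = 'tostart'><button>|do </button>home|wn |do </div>

// Step 3: split on the pipes that are left.
var parts = restored.split('|');
console.log(parts);

// Caveat: a keyword sitting directly between two tags is also restored,
// so this input is not split at all.
var edgeCase = "<b>to</b>"
    .replace(/to/g, '|')
    .replace(/(.*?)(<.*?)\|(.*?>)/g, '$1$2to$3')
    .split('|');
console.log(edgeCase); // a single element: "<b>to</b>"
```

Note that the second pattern only checks for a < somewhere before the pipe and a > somewhere after it, so it handles the sample input but not every arrangement of tags and text.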

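Answer 1: (score: 0)

Here is a very quick and dirty function that does what you want. Note that there is probably a more efficient way, and it does not handle any > characters that are part of attribute values. However, it works for your example input: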

    function splitNonTag(input, splitText) {
        if (!splitText) return [input]; // an empty delimiter would never advance the loop

        var inTag = false; // flag to track whether we are inside a tag
        var result = [];   // array for storing results
        var temp = "";     // buffer for the piece currently being built

        for (var i = 0; i < input.length; i++) {
            var c = input[i]; // the current character to process

            // check if we are outside a tag and the delimiter starts here
            if (!inTag && input.startsWith(splitText, i)) {
                result.push(temp);         // add the finished piece to the results
                temp = "";                 // clear the buffer for the next piece
                i += splitText.length - 1; // skip the delimiter, which is not kept
                continue;                  // go straight to the next iteration
            }

            temp += c; // this character is part of the current piece

            // check if we are entering or leaving a tag and set the flag as needed
            if (c === '<') inTag = true;
            else if (c === '>') inTag = false;
        }

        // any leftover buffer content becomes the last piece
        if (temp)
            result.push(temp);

        return result;
    }

    var input = "<div id = 'tostart'><button>todo </button>hometown todo </div>";
    var result = splitNonTag(input, 'to');
    console.log(result);

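
The caveat about > characters inside attribute values could be addressed by also tracking whether the scan is currently inside a quoted attribute. The following is a minimal sketch under that assumption, not a full HTML parser: splitOutsideTags and its quote variable are hypothetical names, and String.prototype.startsWith needs an ES2015 environment.

```javascript
function splitOutsideTags(input, delimiter) {
    if (!delimiter) return [input]; // an empty delimiter would never advance

    var result = [];
    var buffer = "";
    var inTag = false;
    var quote = null; // the quote character currently open inside a tag, if any

    for (var i = 0; i < input.length; i++) {
        var c = input[i];

        // only split when outside a tag and the delimiter starts at this index
        if (!inTag && input.startsWith(delimiter, i)) {
            result.push(buffer);
            buffer = "";
            i += delimiter.length - 1;
            continue;
        }

        buffer += c;

        if (inTag) {
            if (quote) {
                // inside a quoted attribute value: only the matching quote ends it
                if (c === quote) quote = null;
            } else if (c === '"' || c === "'") {
                quote = c;
            } else if (c === '>') {
                inTag = false;
            }
        } else if (c === '<') {
            inTag = true;
        }
    }

    if (buffer) result.push(buffer);
    return result;
}

// The > inside the title attribute no longer ends the tag early.
var tricky = splitOutsideTags("<a title='a > to b'>go to town</a>", "to");
console.log(tricky);
// ["<a title='a > to b'>go ", " ", "wn</a>"]
```

Quotes are only tracked while inside a tag, so an apostrophe in ordinary text does not affect the split.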

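Answer 2: (score: 0)

I am relying on your HTML using &lt; and &gt; to escape stray < and > characters, which browsers tolerate but validators do not!

    str.split(/to(?=[^>]*(?=<|$))/g);

As others have said, a regex will not cope with very messy HTML (for example, inline script elements).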
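
To see what the lookahead does, the pattern can be run against the question's input and against text with a stray >. This is a small sketch; outsideTags is a name chosen here for illustration, and the g flag is left off because split does not need it.

```javascript
// "to" only matches when the text after it reaches a "<" or the end of the
// string before it reaches a ">", in other words when it is not inside a tag.
var outsideTags = /to(?=[^>]*(?=<|$))/;

var str = "<div id = 'tostart'><button>todo </button>hometown todo </div>";
var parts = str.split(outsideTags);
console.log(parts);
// ["<div id = 'tostart'><button>", "do </button>home", "wn ", "do </div>"]

// With the > escaped as &gt; the keyword is found and the text is split.
var escaped = "go to a &gt; b".split(outsideTags);
console.log(escaped); // ["go ", " a &gt; b"]

// With a stray, unescaped > the keyword looks like it is inside a tag,
// so nothing is split.
var stray = "go to a > b".split(outsideTags);
console.log(stray); // ["go to a > b"]
```

The last case is why this answer depends on stray angle brackets being escaped in the HTML.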