Question

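There is a div block that contains an unknown number of links of the form "a href onclick". If there is more than one link, the links are separated by a comma and a space.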

var reg = /<div class="labeled fl_l"><a href="[^"]*" onclick="[^"]*">(.+?)<\/a>(, <a href="[^"]*" onclick="[^"]*">(.+?)<\/a>{1,})?<\/div>/mg;
var arr;
while ((arr = reg.exec(data)) != null) {
    console.log(arr[0]); // the entire matched text (as always in JavaScript)
    console.log(arr[1]); // the name of the first link
    console.log(arr[2]); // the whole following "a href" (if I make the group non-capturing with (?:, <a ...>...</a>), the nested parentheses stop working)
    console.log(arr[3]); // the name of the second link, **and then all of the remaining code**
}

I think I should use ([^<]*) instead of (.+?), but that does not work at all.

Answer 1

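If regular expressions were the ideal tool here (they aren't), I would use two separate expressions: one to find everything between <div class="labeled fl_l"> and </div>, and then another to find each link inside that content.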
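A minimal sketch of the two-expression idea, assuming the markup looks like the pattern in the question. The sample string and the helper name extractLinkNames are made up for illustration. Splitting the work this way sidesteps the fact that a repeated capture group keeps only its last match, so a single expression cannot return every link name.

```javascript
// Assumed sample input; in the question the real markup lives in `data`.
const data =
  '<div class="labeled fl_l">' +
  '<a href="/a" onclick="return go(1)">First</a>, ' +
  '<a href="/b" onclick="return go(2)">Second</a>, ' +
  '<a href="/c" onclick="return go(3)">Third</a>' +
  '</div>';

// extractLinkNames is a hypothetical helper, not part of any library.
function extractLinkNames(html) {
  // Expression 1: everything between the opening div and the next </div>.
  const blockRe = /<div class="labeled fl_l">([\s\S]*?)<\/div>/g;
  // Expression 2: the text of each link inside one block.
  const linkRe = /<a href="[^"]*" onclick="[^"]*">([^<]*)<\/a>/g;

  const results = [];
  let block;
  while ((block = blockRe.exec(html)) !== null) {
    const names = [];
    let link;
    linkRe.lastIndex = 0; // start scanning each block from its beginning
    while ((link = linkRe.exec(block[1])) !== null) {
      names.push(link[1]);
    }
    results.push(names); // one array of link names per div
  }
  return results;
}

console.log(extractLinkNames(data));
```

Each matched div yields one array of names, so the number of links per block does not have to be known in advance.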

However, regular expressions aren't the right tool for the job. You may want to consider using XPath to iterate over the links instead.
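A rough sketch of the XPath route, assuming the code runs in a browser where DOMParser, document.evaluate and XPathResult are available. The helper name linkNamesByXPath and the sample markup are assumptions for illustration, and the class test matches the attribute value exactly as written in the question.

```javascript
// linkNamesByXPath is a hypothetical helper; `doc` must be a DOM Document,
// for example `document` or the result of DOMParser.parseFromString.
function linkNamesByXPath(doc) {
  const snapshot = doc.evaluate(
    '//div[@class="labeled fl_l"]/a', // every <a> directly inside the div
    doc,
    null,
    XPathResult.ORDERED_NODE_SNAPSHOT_TYPE,
    null
  );
  const names = [];
  for (let i = 0; i < snapshot.snapshotLength; i++) {
    names.push(snapshot.snapshotItem(i).textContent);
  }
  return names;
}

// Browser-only usage; the guard keeps the snippet harmless elsewhere.
if (typeof DOMParser !== "undefined") {
  // Assumed sample markup, modeled on the question.
  const html =
    '<div class="labeled fl_l">' +
    '<a href="/a" onclick="return go(1)">First</a>, ' +
    '<a href="/b" onclick="return go(2)">Second</a>' +
    '</div>';
  const doc = new DOMParser().parseFromString(html, "text/html");
  console.log(linkNamesByXPath(doc));
}
```

Because the parser handles the markup, attribute order and extra whitespace inside the tags no longer matter, which is where the regular expression is most fragile.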
