Extract text from an HTML string using jQuery

Time: 2012-04-11 02:04:54

Tags: jquery html string text extraction

html_string = "<span class=\"verse\"><strong>1<\/strong>\u00a0hello world how are you?,<span class=\"trans\" title=\"\u00a0Greek brothers.\">t<\/span> I am fine thank you.<\/span><span class=\"verse\"><strong>2<\/strong>\u00a0this world is very bad.<\/span><span class=\"verse\"><strong>3<\/strong>\u00aall me are good,<\/span>"

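I only want to extract the text from the spans with class verse, and it should not include the text from the span with class trans.

The result must be an array.

From the string above,

I need to get a result like this:

["\u00a0hello world how are you? I am fine thank you","\u00a0this world is very bad.","\u00aall me are good,"]

Thanks.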

1 answer:

Answer 0 (score: 1)

How about this:

var html_string = "<span class=\"verse\">"
                    + "<strong>1</strong>\u00a0hello world how are you?,"
                    + "<span class=\"trans\" title=\"\u00a0Greek brothers.\">t<\/span> "
                    + "I am fine thank you."
                + "</span>"
                + "<span class=\"verse\">"
                    + "<strong>2</strong>\u00a0this world is very bad."
                + "</span>"
                + "<span class=\"verse\">"
                    + "<strong>3<\/strong>\u00aall me are good,"
                + "</span>";

// Build up the DOM in a hidden element we can parse through
var $html = $('<div>',{html:html_string}).hide().appendTo('body');

// Collect the text of each verse into an array, as the question asks
var verses = [];
$html.find('.verse').each(function(i,e){
    // Grab the verse text, but first delete any span.trans.
    // Note: the <strong> verse number is still part of this text.
    var verse = $(e).find('span.trans').remove().end().text();
    verses.push(verse);
});
// Remove the temporary element from the DOM
$html.remove();

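You should be able to loop through it without adding it to the DOM, but for some reason I couldn't get that to work. Maybe it's because it's late, or my fingers have gone on strike for the night. Either way, if anyone finds a way to do it without appending, please post it and I'll upvote.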
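
For what it's worth, here is a minimal sketch of the no-append variant, not a tested drop-in. A likely reason a direct $(html_string).find('.verse') comes back empty is that .find() only searches descendants, and here the span.verse elements are the top-level nodes of the parsed set. Wrapping the string in a detached div makes them descendants, and .text() works on elements that were never attached to the page. The function name extractVerses is made up for illustration, and dropping the strong verse number is an assumption based on the expected output in the question. It assumes jQuery is already loaded as $.

```javascript
// Hypothetical helper (name is illustrative, not from the original answer).
// Assumes jQuery is available as the global $.
function extractVerses(htmlString) {
    // Wrap the markup in a detached <div>; it is never added to the page.
    // Only do this with markup you trust, since parsing HTML can still
    // trigger things like image loads and inline error handlers.
    var $container = $('<div>').html(htmlString);

    // .map() builds a jQuery-wrapped list; .get() turns it into a plain array.
    return $container.find('span.verse').map(function (i, el) {
        var $verse = $(el);
        // Assumption: the verse number in <strong> is unwanted, like span.trans.
        $verse.find('strong, span.trans').remove();
        return $verse.text();
    }).get();
}

// Usage, with html_string defined as in the answer above:
// var verses = extractVerses(html_string);
```

Called with the string above, this should give an array of three strings. Punctuation is kept exactly as it appears in the markup, so the first entry still has the comma after "you?" and the trailing period, which differs slightly from the hand-written expected output in the question.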