Java jsoup: strip all tags except links

Time: 2018-08-01 08:36:18

Tags: java android jsoup strip-tags

Input string:

<b>Test link</b> <a href="https://www.w3schools.com">Visit W3Schools</a>

Expected result:

Test link <a href="https://www.w3schools.com">Visit W3Schools</a>

I tried using jsoup:

public String cleanHtml(String html) {
    Whitelist whitelist = Whitelist.none();
    whitelist.addTags("a");

    return Jsoup.clean(html, whitelist);
}

The result is:

Test link <a>Visit W3Schools</a>

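How can I remove all other tags but keep the whole a href?

1 answer:

Answer 0: (score: 4)

You need to use addAttributes. It takes the tag name followed by the list of attributes to allow on that tag, for example whitelist.addAttributes("a", "href", "id", "more");

Try this: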

String html = "<b>Test link</b> <a href=\"https://www.w3schools.com\">Visit W3Schools</a>";
Whitelist whitelist = Whitelist.none();
whitelist.addTags("a");
whitelist.addAttributes("a", "href");

System.out.println(Jsoup.clean(html, whitelist));
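
For newer jsoup releases, here is a minimal sketch of the same idea as a complete class. It assumes a jsoup version that ships org.jsoup.safety.Safelist (to my knowledge it replaced the deprecated Whitelist starting with 1.14.2). The class name LinkOnlyCleaner and the method name keepLinksOnly are made up for illustration, and the addProtocols call is an optional extra that was not part of the original answer.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

// Hypothetical helper class; the name is illustrative only.
public class LinkOnlyCleaner {

    // Removes every tag except <a> and keeps only the href attribute on it.
    static String keepLinksOnly(String html) {
        Safelist safelist = Safelist.none()                   // start with nothing allowed
                .addTags("a")                                 // allow the <a> element
                .addAttributes("a", "href")                   // allow href on <a>
                .addProtocols("a", "href", "http", "https");  // optional: only http(s) URLs
        return Jsoup.clean(html, safelist);
    }

    public static void main(String[] args) {
        String html = "<b>Test link</b> <a href=\"https://www.w3schools.com\">Visit W3Schools</a>";
        System.out.println(keepLinksOnly(html));
    }
}
```

One caveat on the optional addProtocols line: once protocols are registered for href, links whose URL does not resolve to one of the listed schemes (relative links, for instance, when no base URI is given) should lose their href. Drop that line if you need to keep such links as they are.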