org.jsoup.Jsoup not handling javascript links?

Asked: 2016-11-23 09:33:32

Tags: java jsoup xss

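I am trying to use the Java library Jsoup to sanitize text strings that may contain malicious content (XSS). I have to allow links such as <a href="http://www.link.com">link</a>, but I do not want to allow javascript links, for XSS reasons.

The following test case fails because the javascript protocol is still allowed. Any ideas on how to solve this with Jsoup's built-in functions?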

@Test
public void test() {

    Whitelist tWhitelist = Whitelist.none();

    tWhitelist.addAttributes("a", "href");
    tWhitelist.removeProtocols("a", "href", "javascript");      

    String tUnsafe = "<a href=\"javascript:alert(1)\">Link</a> is a link.";
    assertEquals("Link is a link.", Jsoup.clean(tUnsafe, tWhitelist));
}

    org.junit.ComparisonFailure: expected:<[Link] is a link.> but was:<[<a href="javascript:alert(1)">Link</a>] is a link.>

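2 Answers:

Answer 0: (score: 1)

This happens because you added the a tag to the whitelist: in your test, addAttributes("a", "href") allows the tag along with its attribute, and no protocol restriction is set for href. You can use the none whitelist directly, for example: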

Whitelist tWhitelist = Whitelist.none();

String tUnsafe = "<a href=\"javascript:alert(1)\">Link</a> is a link.";
assertEquals("Link is a link.", Jsoup.clean(tUnsafe, tWhitelist));

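Or you can use the basic whitelist to keep other hrefs. It already restricts a href to the ftp, http, https and mailto protocols, so the javascript href is dropped while the http link survives, for example: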

    Whitelist tWhitelist = Whitelist.basic();

    tWhitelist.removeProtocols("a", "href", "javascript");
    String tUnsafe = "<a href=\"javascript:alert(1)\">Link</a> is a link.<a href=\"http://www.google.com\" rel=\"nofollow\">google</a>";
    assertEquals("<a rel=\"nofollow\">Link</a> is a link.<a href=\"http://www.google.com\" rel=\"nofollow\">google</a>", Jsoup.clean(tUnsafe, tWhitelist));

Answer 1: (score: 0)

Found it myself... This makes only the specified protocols valid, so an href that uses the javascript protocol is removed:

    Whitelist whitelist = Whitelist.none();

    whitelist
        .addTags("a")
        .addAttributes("a", "href")
        .addProtocols("a", "href", "http", "https", "mailto");

    String safeText = Jsoup.clean(untrustedText, whitelist);
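To illustrate why adding an allowed-protocol list works where removing "javascript" did not, here is a minimal sketch in plain Java. It is a simplified model and not jsoup's actual code: the class ProtocolAllowlistSketch and its methods are hypothetical names made up for this example. The assumption it encodes, consistent with the failing test in the question, is that an attribute with no protocols configured accepts any value, while an attribute with a configured list accepts only values that start with one of the listed protocols.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of per-attribute protocol allowlisting (not jsoup code). */
final class ProtocolAllowlistSketch {

    // Attribute name -> allowed protocols. An attribute with no entry is unrestricted.
    private final Map<String, Set<String>> protocols = new HashMap<>();

    /** Registers allowed protocols for an attribute, e.g. ("href", "http", "https"). */
    ProtocolAllowlistSketch addProtocols(String attribute, String... allowed) {
        Set<String> set = protocols.computeIfAbsent(attribute, k -> new HashSet<>());
        for (String protocol : allowed) {
            set.add(protocol.toLowerCase(Locale.ROOT));
        }
        return this;
    }

    /** Returns true if the value may be kept for the given attribute. */
    boolean isSafe(String attribute, String value) {
        Set<String> allowed = protocols.get(attribute);
        if (allowed == null) {
            // No protocols configured for this attribute: every value passes.
            return true;
        }
        String normalized = value.trim().toLowerCase(Locale.ROOT);
        for (String protocol : allowed) {
            if (normalized.startsWith(protocol + ":")) {
                return true;
            }
        }
        // A list exists and nothing matched, so the value is rejected.
        return false;
    }
}
```

In this model, new ProtocolAllowlistSketch().isSafe("href", "javascript:alert(1)") returns true because nothing restricts href, which mirrors the question's failing test. After addProtocols("href", "http", "https", "mailto"), the same call returns false while "http://www.link.com" still passes. Listing what is permitted, rather than trying to remove what is forbidden, is what closes the gap.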