Question

I want to create a Java application that scrapes the song name from the website chillstep.info and saves it to a .txt file. However, JSoup prints this:
<div id="titel"> ♫ </div>

Here is the code:

import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class Crawltitle {

    public static void getTitle() throws IOException{
        Document doc = Jsoup.connect("http://chillstep.info/").get();
        String title = doc.getElementById("titel").outerHtml();
        System.out.println(title);
    }

    public static void main(String[] args) throws IOException{
        getTitle();
    }
}

Is this problem caused by the website (and if so, why, and how can I fix it) or by JSoup?

Answer 1

The title is not part of the static HTML; it is loaded dynamically from

http://chillstep.info/jsonInfo.php

You can still use Jsoup to fetch this content if you tell it to ignore the content type it would normally insist on:

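// By default Jsoup throws an IOException for content types it does not handle;
// ignoreContentType(true) disables that check.
Connection con = Jsoup
   .connect("http://chillstep.info/jsonInfo.php")
   .ignoreContentType(true);
Connection.Response res = con.execute();
String rawJSON = res.body();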

Note that I did not use the JSoup parser here, so you could just as well fetch the HTTP content with any other library, such as Apache HttpClient.
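To illustrate that point, here is a minimal sketch that fetches the raw body with nothing but the JDK's HttpURLConnection. The class name RawJsonFetcher, the helper names, the five-second timeouts, and the UTF-8 decoding are all assumptions made for this example; nothing in the question confirms the encoding the site actually uses.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original answer.
final class RawJsonFetcher {

    // Reads the whole stream and decodes it as UTF-8 (assumed encoding).
    static String readBody(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    // Performs a plain GET and returns the response body as text.
    static String fetch(String address) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(address).openConnection();
        con.setRequestMethod("GET");
        con.setConnectTimeout(5000); // assumed timeout, in milliseconds
        con.setReadTimeout(5000);    // assumed timeout, in milliseconds
        try (InputStream in = con.getInputStream()) {
            return readBody(in);
        } finally {
            con.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java RawJsonFetcher <url>");
            return;
        }
        System.out.println(fetch(args[0]));
    }
}
```

Running it with http://chillstep.info/jsonInfo.php as the argument should print the same raw text that res.body() returns in the Jsoup version, provided the endpoint is still reachable.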

At this point you can parse the response with the JSON library of your choice, or do it "by hand", since the format is very simple:

String title = rawJSON.replaceAll(".*:\"([^\"]*).*","$1");
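One caveat with the replaceAll call is that it returns the input unchanged when the pattern does not match, so a malformed response would silently become the "title". The sketch below applies the same pattern through a Matcher so that a non-match is detectable, and then writes the result to a text file as the question asks. The class name TitleExtractor, the file name title.txt, and the sample JSON string are hypothetical; the real shape of the response from jsonInfo.php is not shown in this thread.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
final class TitleExtractor {

    // Same pattern as in the answer: captures the quoted value after the last ':"'.
    private static final Pattern TITLE = Pattern.compile(".*:\"([^\"]*).*");

    // Returns the captured value, or null when the text does not match at all.
    static String extractTitle(String rawJson) {
        Matcher m = TITLE.matcher(rawJson);
        return m.matches() ? m.group(1) : null;
    }

    // Writes the title to the given file as UTF-8, replacing any existing content.
    static void saveTitle(String title, Path target) throws IOException {
        Files.write(target, title.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample; the actual response format may differ.
        String sample = "{\"title\":\"Some Artist - Some Song\"}";
        String title = extractTitle(sample);
        if (title == null) {
            System.err.println("Could not find a title in the response");
            return;
        }
        saveTitle(title, Paths.get("title.txt")); // hypothetical output file
        System.out.println(title);
    }
}
```

In a real run, the sample string would be replaced by rawJSON from the request above. If the response turns out to contain several fields or escaped quotes, a proper JSON library is the safer choice.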
