Question

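I need to extract the category of a Google Play app. For example, Facebook belongs to the "Social" category.

So I need to extract the "Social" information from this link. I was able to get the HTML content into a string named "result" in the code below, but I cannot find the tag that contains the category name. I can see the category name when I inspect the element in the browser, but not in the HTML my code receives. How can I get the complete HTML content of the URL above? The response my code receives does not contain the full HTML. The category name is located under html, head, script, body, div, "Category Name".

When I read the full HTML response, I only get the following tag elements: <html>, <head>, <script>. I do not get the <body> element or its content. Why is the body content of the page not returned?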

The following code outputs the HTML response of the requested page.

String url = "https://play.google.com/store/apps/details?id=com.kongregate.mobile.fly.google&hl=en";
String result = "";

try {

    // create HttpClient
    HttpClient httpclient = new DefaultHttpClient();

    // make GET request to the given URL
    HttpResponse httpResponse = httpclient.execute(new HttpGet(url));

    // read the response body exactly once; calling EntityUtils.toString()
    // and then getContent() on the same entity would consume the stream twice
    if (httpResponse.getEntity() != null) {
        result = EntityUtils.toString(httpResponse.getEntity(), "UTF-8");
    }
    // ...
} catch(...) {...}

Answer 1

Maybe this helps. The code returns the whole page as a Document:

org.jsoup.nodes.Document html = null;
try {
    html = Jsoup.connect(source).get();
} catch (final IOException e) {
    LOG.error(e.getMessage(), e);
}
LOG.info(html);

Using Jsoup

I did not find your "Category Name" node, but maybe you will ;) You can search the document like this:

html.select("#Category Name");
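If the response is only available as a plain string (like the "result" variable in the question), the category can also be looked up without an HTML parser. The following is a minimal sketch, not a definitive implementation. It assumes the page links to its category with an href of the form /store/apps/category/<ID> (for example /store/apps/category/SOCIAL); that markup is an assumption and may differ or change. The class name CategoryExtractor and the method extractCategory are hypothetical names chosen for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CategoryExtractor {

    // Assumption: the category is exposed through a link such as
    // href="/store/apps/category/SOCIAL". This pattern captures the ID part.
    private static final Pattern CATEGORY_LINK =
            Pattern.compile("/store/apps/category/([A-Z0-9_]+)");

    // Returns the first category ID found in the HTML, or an empty Optional
    // if the HTML is null or contains no such link.
    static Optional<String> extractCategory(String html) {
        if (html == null) {
            return Optional.empty();
        }
        Matcher matcher = CATEGORY_LINK.matcher(html);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hand-written sample markup, not a real Google Play response.
        String sample =
                "<body><div><a href=\"/store/apps/category/SOCIAL\">Social</a></div></body>";
        System.out.println(extractCategory(sample).orElse("not found"));
    }
}
```

If the link is missing from the string you downloaded, the method returns an empty Optional, which is a quick way to check whether the category markup was part of the response at all.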

more examples

Parse a URL and retrieve information
