YellowPages scraping

Time: 2013-12-29 04:56:28

Tags: java jsoup

I want to parse the Yellow Pages website, but http://www.yellowpages.com.au/ rejects HTTP requests sent through Jsoup.

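import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class ReadURL {

    public static void main(String[] args) throws IOException {
        parseURL("http://www.yellowpages.com.au/search/listings?clue=butchers&locationClue=&lat=&lon=");
    }

    // Fetches the page with jsoup's default settings and prints the parsed HTML.
    public static void parseURL(String url) throws IOException {
        Document doc = Jsoup.connect(url).get();
        System.out.println(doc.toString());
    }
}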

This is the response I get:

<html>
 <head>
  <title>Request Rejected</title>
 </head>
 <body>
  The requested URL was rejected. Please consult with your administrator.
  <br />
  <br />Your support ID is: 5406139567541308211
 </body>
</html>

1 answer:

Answer 0 (score: 3)

I just tried it. After adding a user agent, it works fine:

public static void parseURL(String url) throws IOException {
    Document doc = Jsoup.connect(url)
            .userAgent("Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:26.0) Gecko/20100101 Firefox/26.0")
            .get();
    System.out.println(doc.toString());
}
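
Since the only change is a request header, the same idea should carry over to other HTTP clients. Below is a minimal sketch using the JDK's own java.net.http client (Java 11+) instead of jsoup. The class name UserAgentRequest and the helpers buildRequest and looksRejected are hypothetical names made up for this illustration, and the rejection check is only a heuristic based on the "Request Rejected" title shown in the question. Whether the site still accepts this user agent today is an assumption, not something verified here.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class UserAgentRequest {

    // Same user-agent string as in the jsoup answer above.
    public static final String USER_AGENT =
            "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:26.0) Gecko/20100101 Firefox/26.0";

    // Builds a GET request that carries a browser-like User-Agent header.
    public static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", USER_AGENT)
                .GET()
                .build();
    }

    // Heuristic (an assumption): the rejection page in the question has this exact title.
    public static boolean looksRejected(String html) {
        return html.contains("<title>Request Rejected</title>");
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // jsoup follows redirects by default; the JDK client has to be told to.
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();

        HttpResponse<String> response = client.send(
                buildRequest("http://www.yellowpages.com.au/search/listings?clue=butchers&locationClue=&lat=&lon="),
                HttpResponse.BodyHandlers.ofString());

        String body = response.body();
        System.out.println(looksRejected(body) ? "Request was rejected" : body);
    }
}
```

If looksRejected still returns true with the header set, the site is probably filtering on something other than the user agent, and this sketch will not help.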