Question

Example page: http://www.amazon.com/gp/offer-listing/1589942140

public void connect( String url ) {        
    this.conn = Jsoup.connect( url );  
}

/**
 * Executes the request and parses the result.
 * @return true if the page was fetched and parsed; false if an IOException occurred
 */
public boolean parse() 
{
    try {
        this.page = this.conn.get();
        return true;
    } catch (IOException ex) {
        // log it here
        System.out.format("Error: %s%n", ex);
        return false;
    }
}

Parsing the page throws the following IOException:

org.jsoup.HttpStatusException: HTTP error fetching URL. Status=204, URL=http://www.amazon.com/gp/offer-listing/1589942140

I also tried the native Java URL class, as shown below, but it does not throw an IOException:

    try {
        URL myURL = new URL("http://www.amazon.com/gp/offer-listing/1589942140");
        URLConnection myURLConnection = myURL.openConnection();
        myURLConnection.connect();
        System.out.format("%s", myURLConnection.getContentType());
    } 
    catch (MalformedURLException e) { 
        // new URL() failed
        System.out.format("Error: %s%n", e);
    } 
    catch (IOException e) {   
        // openConnection() failed
        System.out.format("Error: %s%n", e);
    }

Any idea why this happens?

Answer 1

The following works for me:

    System.out.println(Jsoup.connect("http://www.amazon.com/gp/offer-listing/1589942140").userAgent("Mozilla").get().text());

The URL used above is the one you specified in the question (example page: http://www.amazon.com/gp/offer-listing/1589942140).
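
To see why the two approaches in the question behave differently, it may help to look at the status code directly. jsoup reports the 204 as an HttpStatusException, while the plain URLConnection code never inspects the status at all, so no exception appears there. The sketch below uses only the JDK and is not the asker's code. The class name StatusCheck, the helpers fetchStatus and startStubServer, and the local stub server are all hypothetical. The stub returning 204 to non-browser agents is an assumption made for illustration, not a description of what Amazon actually does.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class StatusCheck {

    /** Sends a GET with the given User-Agent header and returns the HTTP status code. */
    static int fetchStatus(String url, String userAgent) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestProperty("User-Agent", userAgent);
        try {
            // Unlike connect(), getResponseCode() actually reads the status line.
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    /**
     * Starts a local stand-in server on a free port.
     * Assumption for illustration only: it answers 200 to agents starting
     * with "Mozilla" and 204 (No Content) to everything else.
     */
    static HttpServer startStubServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            String agent = exchange.getRequestHeaders().getFirst("User-Agent");
            if (agent != null && agent.startsWith("Mozilla")) {
                byte[] body = "<html><body>offers</body></html>"
                        .getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
            } else {
                // -1 tells the server that no response body will be sent.
                exchange.sendResponseHeaders(204, -1);
            }
            exchange.close();
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = startStubServer();
        try {
            String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/";
            System.out.println("Java-like agent: " + fetchStatus(url, "Java/1.8"));
            System.out.println("Mozilla agent:   " + fetchStatus(url, "Mozilla"));
        } finally {
            server.stop(0);
        }
    }
}
```

Pointing fetchStatus at the real URL instead of the stub would show what the site returns for each User-Agent value. If the status changes once a browser-like agent is sent, that would be consistent with the userAgent("Mozilla") call making the jsoup request succeed.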

Parsing an Amazon page with jsoup returns a 204 status
