Downloading an image from a URL

Asked: 2013-01-29 18:13:39

Tags: java url network-programming download

I am trying to download the high-resolution product image from this page:

http://www.hookerfurniture.com/index.cfm/furniture/furniture-catalog.show-product/American-furniture/3005-75310/spindle-back-side-chair---ebony.cfm

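If I click "Download high-resolution photo" on the page, the download works fine. But when I copy the image URL and try to download it from another tab, I get "3005_75310.jpg does not exist".

So I looked at the request headers of the first (working) request and set them on my Java URL object, but the file that gets created is empty. Does anyone have an idea?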

public static void saveImage(String imageUrl, String destinationFile) {
    URL url;
    try {
        url = new URL(imageUrl);
        URLConnection uc = url.openConnection();

        uc.setRequestProperty("Accept",
                "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        uc.setRequestProperty("Accept-Charset",
                "ISO-8859-1,utf-8;q=0.7,*;q=0.3");
        uc.setRequestProperty("Accept-Encoding", "gzip,deflate,sdch");
        uc.setRequestProperty("Accept-Language", "en-US,en;q=0.8");
        uc.setRequestProperty("Connection", "keep-alive");

        uc.setRequestProperty(
                "Referer",
                "http://www.hookerfurniture.com/index.cfm/furniture/furniture-catalog.show-product/American-furniture/3005-75310/spindle-back-side-chair---ebony.cfm");

        InputStream is = url.openStream();
        OutputStream os = new FileOutputStream(destinationFile);

        byte[] b = new byte[2048];
        int length;

        while ((length = is.read(b)) != -1) {
            os.write(b, 0, length);
        }

        is.close();
        os.close();
    } catch (MalformedURLException e) {
        e.printStackTrace();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }

}

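2 Answers:

Answer 0 (score: 0)

The site's developers check the Referer header to block the kind of scraping you are doing, and the Referer your request actually sends is not the one they expect. An example of a request that works: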

$ wget \
  --referer=http://www.hookerfurniture.com/index.cfm/furniture/furniture-catalog.show-product/American-furniture/3005-75310/spindle-back-side-chair---ebony.cfm \
  http://www.hookerfurniture.com/index.cfm/furniture/furniture-catalog.photo-download/photo/3005_75310.jpg


Length: unspecified [image/jpeg]
Saving to: `3005_75310.jpg'

    [  <=>                                                                                ] 346,125      949K/s   in 0.4s

2013-01-29 13:24:02 (949 KB/s) - `3005_75310.jpg' saved [346125]

Answer 1 (score: 0)

For what it's worth, it looks like the only header that matters is the "Referer" header.

This fails:

curl "http://www.hookerfurniture.com/index.cfm/furniture/furniture-catalog.photo-download/photo/3005_75310.jpg" > /test/3005_75310.jpg

This works:

curl -H "Referer: http://www.hookerfurniture.com/index.cfm/furniture/furniture-catalog.show-product/American-furniture/3005-75310/spindle-back-side-chair---ebony.cfm" "http://www.hookerfurniture.com/index.cfm/furniture/furniture-catalog.photo-download/photo/3005_75310.jpg" > /test/3005_75310.jpg

To pull the image data in Java, I have had the most success using the readFully() method of DataInputStream.
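For reference, here is a minimal Java sketch of the curl command that works. One thing worth noting about the code in the question: the headers are set on `uc`, but the data is read from `url.openStream()`, which opens a second, separate connection that carries none of those headers. That may well be why the file comes out empty. The sketch below reads from the same connection the Referer was set on. The class name `RefererDownload` and the method `download` are made up for illustration, and it copies the stream with `Files.copy` instead of `readFully()`, since the server does not report a content length.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class RefererDownload {

    /**
     * Downloads imageUrl to dest, sending the given Referer header.
     * Returns the number of bytes written to dest.
     * (Hypothetical helper, not part of any library.)
     */
    public static long download(String imageUrl, String referer, Path dest) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(imageUrl).openConnection();
        // Set the header on the connection that will actually be used for reading.
        conn.setRequestProperty("Referer", referer);
        // Accept-Encoding is deliberately left unset so the body is not
        // returned gzip-compressed and written to disk in that form.

        int status = conn.getResponseCode();
        if (status != HttpURLConnection.HTTP_OK) {
            conn.disconnect();
            throw new IOException("Unexpected HTTP status " + status + " for " + imageUrl);
        }

        // Read from conn, not from url.openStream(), which would open a new
        // connection without the Referer header.
        try (InputStream in = conn.getInputStream()) {
            return Files.copy(in, dest, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: RefererDownload <imageUrl> <referer> <destinationFile>");
            return;
        }
        long bytes = download(args[0], args[1], Paths.get(args[2]));
        System.out.println("Saved " + bytes + " bytes to " + args[2]);
    }
}
```

To try it, pass the photo-download URL, the product page URL as the Referer, and a destination file name as the three arguments. Whether the site still accepts this request today is not something the sketch can guarantee.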