Question

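I use FileUtils.copyURLToFile(URL, File) (part of Apache Commons IO 2.4) to download files and save them on my computer. The problem is that some websites refuse the connection when the request carries no referrer and user agent data.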

My questions:

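Is there a way to specify the user agent and referrer for the copyURLToFile method?
Or should I download the file some other way and then save the resulting InputStream to a file?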

Answer 1

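I re-implemented this functionality with HttpComponents instead of Commons-IO. The code below downloads a file from a URL in Java and saves it at a given destination.

Final code: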

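import java.io.File;
import java.io.IOException;
import java.net.URL;

import org.apache.commons.io.FileUtils;
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

public static boolean saveFile(URL imgURL, String imgSavePath) {
    HttpGet httpGet = new HttpGet(imgURL.toString());
    // Some sites refuse requests that lack these two headers.
    httpGet.addHeader("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.11 Safari/537.36");
    httpGet.addHeader("Referer", "https://www.google.com");

    // try-with-resources closes the response and the client even on failure,
    // which also releases the underlying connection.
    try (CloseableHttpClient httpClient = HttpClients.createDefault();
         CloseableHttpResponse httpResponse = httpClient.execute(httpGet)) {
        HttpEntity imageEntity = httpResponse.getEntity();
        if (imageEntity != null) {
            // Writes the response body to the target file and closes the stream.
            FileUtils.copyInputStreamToFile(imageEntity.getContent(), new File(imgSavePath));
        }
        return true;
    } catch (IOException e) {
        return false;
    }
}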

Of course, the code above takes up more space than the one-liner:

FileUtils.copyURLToFile(imgURL, new File(imgSavePath),
                        URLS_FETCH_TIMEOUT, URLS_FETCH_TIMEOUT);

But it gives you more control over the process: besides the timeouts, you can also set the User-Agent and Referer values, which matter for many websites.

Answer 2

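Probably not, unless you can get hold of the underlying mechanism that opens the URL.

I would recommend the https://hc.apache.org/ library. It has plenty of features for working with headers and the like.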
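
To illustrate what getting hold of the underlying mechanism could look like without any extra library, here is a minimal sketch, assuming plain java.net.URLConnection is acceptable. The class name HeaderDownload, the method download, and its parameters are hypothetical names chosen for this example, and Files.copy stands in for FileUtils.copyInputStreamToFile so that the snippet needs only the JDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class HeaderDownload {

    // Hypothetical helper: opens the connection manually so that request
    // headers and timeouts can be set before any data is read.
    public static void download(URL url, Path target, String userAgent,
                                String referer, int timeoutMillis) throws IOException {
        URLConnection connection = url.openConnection();
        connection.setRequestProperty("User-Agent", userAgent);
        connection.setRequestProperty("Referer", referer);
        connection.setConnectTimeout(timeoutMillis);
        connection.setReadTimeout(timeoutMillis);

        // For HTTP URLs, getInputStream() throws an IOException when the
        // server answers with an error status (400 and above).
        try (InputStream in = connection.getInputStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

A call might look like HeaderDownload.download(imgURL, Paths.get(imgSavePath), userAgent, "https://www.google.com", URLS_FETCH_TIMEOUT). Whether a given site accepts those header values is up to that site.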

Answer 3

To complete the accepted answer with how to handle timeouts:

If you want to set timeouts, you have to create the CloseableHttpClient like this:

RequestConfig config = RequestConfig.custom()
                 .setConnectTimeout(connectionTimeout)
                 .setConnectionRequestTimeout(readDataTimeout)
                 .setSocketTimeout(readDataTimeout)
                 .build();

CloseableHttpClient httpClient = HttpClientBuilder
                 .create()
                 .setDefaultRequestConfig(config)
                 .build();

Also, it is probably a good idea to create your CloseableHttpClient in a try-with-resources statement so that it gets closed for you:

try (CloseableHttpClient httpClient = HttpClientBuilder.create().setDefaultRequestConfig(config).build()) {
  // ... rest of the code using httpClient
}

How to specify user agent and referer in the FileUtils.copyURLToFile(URL, File) method?
