HTTP response code 429 when reading HTML

Time: 2018-09-28 08:19:01

Tags: java instagram http-status-codes rate-limiting http-status-code-429

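In Java, I want to read and save all of the HTML from a URL (Instagram), but I get error 429 (Too Many Requests). I think this is because the number of lines I am trying to read exceeds the request limit.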

StringBuilder contentBuilder = new StringBuilder();
try {
    URL url = new URL("https://www.instagram.com/username");
    URLConnection con = url.openConnection();
    InputStream is =con.getInputStream();
    BufferedReader in = new BufferedReader(new InputStreamReader(is));
    String str;
    while ((str = in.readLine()) != null) {
        contentBuilder.append(str);
    }
    in.close();
} catch (IOException e) {
    log.warn("Could not connect", e);
}
String html = contentBuilder.toString();

The error is:

Could not connect
java.io.IOException: Server returned HTTP response code: 429 for URL: https://www.instagram.com/username/

It also shows that the error occurs at this line:

InputStream is =con.getInputStream();

Does anyone know why I am getting this error and/or how to resolve it?

1 answer:

Answer 0 (score: 1)

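The problem may be caused by connections not being closed/disconnected. try-with-resources is useful here because it closes the input automatically, even on an exception or a return. Also, you construct an InputStreamReader that uses the default encoding of the machine the application runs on, whereas you need the charset of the URL's content. readLine returns each line without its line ending (which is usually quite useful), so append one yourself.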

StringBuilder contentBuilder = new StringBuilder();
try {
    URL url = new URL("https://www.instagram.com/username");
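    // disconnect() is declared on java.net.HttpURLConnection, not URLConnection, hence the cast.
    HttpURLConnection con = (HttpURLConnection) url.openConnection();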
    try (BufferedReader in = new BufferedReader(
                new InputStreamReader(con.getInputStream(), "UTF-8"))) {
        String line;
        while ((line = in.readLine()) != null) {
            contentBuilder.append(line).append("\r\n");
        }
    } finally {
        // try-with-resources has already closed in at this point.
        con.disconnect();
    }
} catch (IOException e) {
    log.warn("Could not connect", e);
}
String html = contentBuilder.toString();
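
Status 429 means the server is rate limiting the client, so closing connections may not be enough on its own. A server may also send a Retry-After header with a 429 response. The following is a minimal sketch, not part of the original answer, of waiting and retrying. The class name RateLimitedFetch, the attempt limit, and the backoff values are assumptions for illustration, and it is not confirmed that Instagram sends Retry-After or will allow the request after a wait.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: retries a GET when the server answers 429.
class RateLimitedFetch {

    static final int TOO_MANY_REQUESTS = 429;
    // Assumed policy values; tune them for the real service.
    static final int MAX_ATTEMPTS = 4;
    static final long BASE_DELAY_MILLIS = 1_000L;

    // Uses Retry-After when it is a plain number of seconds,
    // otherwise falls back to exponential backoff (1 s, 2 s, 4 s, ...).
    static long retryDelayMillis(String retryAfter, int attempt) {
        if (retryAfter != null) {
            try {
                long seconds = Long.parseLong(retryAfter.trim());
                if (seconds >= 0) {
                    return seconds * 1000L;
                }
            } catch (NumberFormatException ignored) {
                // Retry-After may also be an HTTP date; this sketch does not parse it.
            }
        }
        return BASE_DELAY_MILLIS << attempt;
    }

    static String fetch(String address) throws IOException, InterruptedException {
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            long delay;
            HttpURLConnection con = (HttpURLConnection) new URL(address).openConnection();
            try {
                int status = con.getResponseCode();
                if (status != TOO_MANY_REQUESTS) {
                    if (status != HttpURLConnection.HTTP_OK) {
                        throw new IOException("Unexpected HTTP status " + status);
                    }
                    StringBuilder content = new StringBuilder();
                    try (BufferedReader in = new BufferedReader(
                            new InputStreamReader(con.getInputStream(), StandardCharsets.UTF_8))) {
                        String line;
                        while ((line = in.readLine()) != null) {
                            content.append(line).append("\r\n");
                        }
                    }
                    return content.toString();
                }
                delay = retryDelayMillis(con.getHeaderField("Retry-After"), attempt);
            } finally {
                con.disconnect();
            }
            // Wait outside the try block so the connection is released first.
            Thread.sleep(delay);
        }
        throw new IOException("Still rate limited after " + MAX_ATTEMPTS + " attempts");
    }
}
```

Calling RateLimitedFetch.fetch("https://www.instagram.com/username") would replace the manual loop above. Checking getResponseCode() before getInputStream() lets the code tell a 429 apart from other failures instead of catching a generic IOException.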