GC pauses gradually increase when downloading a large file (over 250 MB)

Time: 2018-11-04 17:42:49

Tags: java http hadoop url garbage-collection

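I have an application that downloads JSON output from a URL and stores it in the Hadoop file system (the file is larger than 250 MB). The code snippet below downloads the file correctly and stores it in the Hadoop file system, but the GC pauses gradually spike to more than 10 seconds. If I read from a file instead of a URL, the file is stored in Hadoop without this problem. I am not sure why the GC warnings appear only with HttpURLConnection.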

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    // `out` is the FSDataOutputStream from the Hadoop API; its creation is not shown here.
    BufferedReader is = null;
    URL connectionUrl = <sample url>
    HttpURLConnection connection = (HttpURLConnection) connectionUrl.openConnection();
    connection.setRequestMethod("GET");
    connection.setUseCaches(false);
    connection.setRequestProperty("Accept", "application/json");
    connection.setConnectTimeout(5000);
    connection.connect();
    if (connection.getResponseCode() == HttpURLConnection.HTTP_OK)
    {
        System.out.println("HTTP connection established successfully");
    }
    else
    {
        System.out.println("Connection to URL failed with response code: " + connection.getResponseCode());
    }
    try
    {
        is = new BufferedReader(new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8));

        int charsRead;
        char[] buffer = new char[5 * 1024 * 1024];
        while ((charsRead = is.read(buffer)) != -1)
        {
            // Encode only the chars read in this pass; read() returns a count of chars, not bytes.
            byte[] array = new String(buffer, 0, charsRead).getBytes(StandardCharsets.UTF_8);
            out.write(array, 0, array.length);
            // hsync() on the FSDataOutputStream from the Hadoop API persists the data to storage.
            out.hsync();
        }
    }
    finally
    {
        if (is != null)
        {
            is.close();
        }
        if (connection != null)
        {
            connection.disconnect();
            connection = null;
        }
        if (out != null)
        {
            out.close();
        }
    }

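I am using hsync to persist the data to the file system immediately, in order to avoid an OOM error in the JVM. I use a 5 MB buffer; even when I reduced the buffer size to 4 KB, the GC spikes still grew gradually.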
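For comparison, the copy loop can also be written without any char decoding, so that each pass reuses one byte buffer instead of creating a new String and byte array. The following is only a minimal sketch, assuming the JSON can be stored byte-for-byte as received; the class name `StreamCopy` and its `copy` method are hypothetical, and in-memory streams stand in for `connection.getInputStream()` and the Hadoop output stream. Whether this changes the GC behaviour described here is not established.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original code.
final class StreamCopy {

    private StreamCopy() {}

    /**
     * Copies raw bytes from in to out using a single reusable buffer.
     * Returns the number of bytes copied.
     */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize]; // allocated once, reused for every chunk
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // write only the bytes actually read
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins for the HTTP input stream and the Hadoop output stream.
        byte[] payload = "{\"key\":\"v\u00e4rde\"}".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(payload), sink, 8 * 1024);
        System.out.println(copied + " bytes copied");
    }
}
```

Because `FSDataOutputStream` is a `java.io.OutputStream`, a helper like this could be handed `out` directly; when and how often to call `hsync()` is left out of the sketch.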

GC log:

    2018-11-04T18:37:12.215+0100: 928.845: [GC (Allocation Failure) 2018-11-04T18:37:12.215+0100: 928.845: [ParNew
    Desired survivor size 69664768 bytes, new threshold 6 (max 6)
    - age   1:     195144 bytes,     195144 total
    - age   2:     128160 bytes,     323304 total
    - age   3:     127712 bytes,     451016 total
    - age   4:     121688 bytes,     572704 total
    - age   5:     120536 bytes,     693240 total
    - age   6:     118248 bytes,     811488 total
    : 1090265K->1023K(1224832K), 0.2579451 secs] 4588219K->3499091K(8029312K), 0.2584335 secs] [Times: user=1.02 sys=0.00, real=0.26 secs]
    2018-11-04T18:38:11.090+0100: 987.720: [GC (Allocation Failure) 2018-11-04T18:38:11.090+0100: 987.720: [ParNew
    Desired survivor size 69664768 bytes, new threshold 6 (max 6)
    - age   1:     194016 bytes,     194016 total
    - age   2:     123784 bytes,     317800 total
    - age   3:     123728 bytes,     441528 total
    - age   4:     125544 bytes,     567072 total
    - age   5:     121048 bytes,     688120 total
    - age   6:     120208 bytes,     808328 total
    : 1089791K->1082K(1224832K), 0.3944354 secs] 4587859K->3499264K(8029312K), 0.3949189 secs] [Times: user=1.31 sys=0.00, real=0.40 secs]
    2018-11-04T18:39:10.393+0100: 1047.023: [GC (Allocation Failure) 2018-11-04T18:39:10.394+0100: 1047.024: [ParNew
    Desired survivor size 69664768 bytes, new threshold 6 (max 6)
    - age   1:     192424 bytes,     192424 total
    - age   2:     124584 bytes,     317008 total
    - age   3:     119288 bytes,     436296 total
    - age   4:     121336 bytes,     557632 total
    - age   5:     124616 bytes,     682248 total
    - age   6:     120152 bytes,     802400 total
    : 1089850K->1351K(1224832K), 0.3512605 secs] 4588032K->3499651K(8029312K), 0.3517526 secs] [Times: user=1.39 sys=0.00, real=0.35 secs]
    2018-11-04T18:40:08.942+0100: 1105.572: [GC (Allocation Failure) 2018-11-04T18:40:08.942+0100: 1105.572: [ParNew
    Desired survivor size 69664768 bytes, new threshold 6 (max 6)
    - age   1:     254416 bytes,     254416 total
    - age   2:     127632 bytes,     382048 total
    - age   3:     119896 bytes,     501944 total
    - age   4:     117304 bytes,     619248 total
    - age   5:     120536 bytes,     739784 total
    - age   6:     123168 bytes,     862952 total
    : 1090119K->1155K(1224832K), 0.5644555 secs] 4588419K->3499572K(8029312K), 0.5649258 secs] [Times: user=2.24 sys=0.00, real=0.56 secs]
    2018-11-04T18:41:08.304+0100: 1164.934: [GC (Allocation Failure) 2018-11-04T18:41:08.304+0100: 1164.934: [ParNew
    Desired survivor size 69664768 bytes, new threshold 6 (max 6)
    - age   1:     192576 bytes,     192576 total
    - age   2:     125192 bytes,     317768 total
    - age   3:     122584 bytes,     440352 total
    - age   4:     117696 bytes,     558048 total
    - age   5:     116792 bytes,     674840 total
    - age   6:     120280 bytes,     795120 total
    : 1089923K->1240K(1224832K), 0.7697664 secs] 4588340K->3499777K(8029312K), 0.7702585 secs] [Times: user=3.06 sys=0.00, real=0.77 secs]
    2018-11-04T18:42:08.677+0100: 1225.307: [GC (Allocation Failure) 2018-11-04T18:42:08.677+0100: 1225.307: [ParNew
    Desired survivor size 69664768 bytes, new threshold 6 (max 6)
    - age   1:     200880 bytes,     200880 total
    - age   2:     127888 bytes,     328768 total
    - age   3:     120464 bytes,     449232 total
    - age   4:     120608 bytes,     569840 total
    - age   5:     116992 bytes,     686832 total
    - age   6:     115840 bytes,     802672 total
    : 1090008K->1330K(1224832K), 1.0533365 secs] 4588545K->3499984K(8029312K), 1.0538119 secs] [Times: user=4.19 sys=0.00, real=1.05 secs]

Now, after running for a while, a GC takes 0.3 seconds.
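To make the trend in a log like the one above easier to follow, the wall-clock pause of each collection (the `real=` value in the `[Times: ...]` suffix) can be pulled out with a regular expression. This is a small sketch only: the class name `GcPauseParser` is hypothetical, and it assumes the log format shown in this question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading pause times out of GC log lines.
final class GcPauseParser {

    // Matches the wall-clock time in "[Times: user=... sys=..., real=0.26 secs]".
    private static final Pattern REAL_TIME = Pattern.compile("real=(\\d+\\.\\d+) secs");

    private GcPauseParser() {}

    /** Returns every real= pause (in seconds) found in the given lines, in order. */
    static List<Double> realPauses(List<String> logLines) {
        List<Double> pauses = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = REAL_TIME.matcher(line);
            while (m.find()) {
                pauses.add(Double.parseDouble(m.group(1)));
            }
        }
        return pauses;
    }

    public static void main(String[] args) {
        // The first and last summary lines from the log in this question.
        List<String> sample = Arrays.asList(
            ": 1090265K->1023K(1224832K), 0.2579451 secs] 4588219K->3499091K(8029312K), 0.2584335 secs] [Times: user=1.02 sys=0.00, real=0.26 secs]",
            ": 1090008K->1330K(1224832K), 1.0533365 secs] 4588545K->3499984K(8029312K), 1.0538119 secs] [Times: user=4.19 sys=0.00, real=1.05 secs]");
        System.out.println(realPauses(sample)); // prints [0.26, 1.05]
    }
}
```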

0 answers:

No answers yet