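I have several small servers that run stably for about two weeks and then start crashing with an OutOfMemoryError in Metaspace. Why does this happen? Why can't the GC reclaim Metaspace properly? The load on the servers is low.
GC log: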
Java HotSpot(TM) 64-Bit Server VM (25.101-b13) for linux-amd64 JRE (1.8.0_101-b13), built on Jun 22 2016 02:59:44 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 8007052k(1853764k free), swap 4194300k(3926836k free)
CommandLine flags: -XX:+AlwaysPreTouch -XX:+CITime -XX:+CMSClassUnloadingEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:+DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError -XX:InitialHeapSize=3221225472 -XX:+ManagementServer -XX:MaxHeapSize=3221225472 -XX:MaxMetaspaceSize=209715200 -XX:MaxNewSize=2147483648 -XX:NewSize=2147483648 -XX:OldPLABSize=16 -XX:-OmitStackTraceInFastThrow -XX:+OptimizeStringConcat -XX:+PrintCommandLineFlags -XX:+PrintConcurrentLocks -XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:-PrintTenuringDistribution -XX:SurvivorRatio=10 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseLargePages -XX:+UseParNewGC
2018-07-21T09:08:37.065+0300: 2.182: [GC (CMS Initial Mark) [1 CMS-initial-mark: 0K(1048576K)] 454408K(2971008K), 0.0796923 secs] [Times: user=0.10 sys=0.00, real=0.08 secs]
...
2018-08-06T19:36:36.399+0300: 1420080.148: [Full GC (Metadata GC Threshold) 2018-08-06T19:36:36.399+0300: 1420080.148: [CMS: 122326K->119501K(1048576K), 0.9307875 secs] 1738518K->119501K(2971008K), [Metaspace: 162731K->162731K(1230848K)], 0.9365048 secs] [Times: user=1.06 sys=0.00, real=0.93 secs]
2018-08-06T19:36:37.336+0300: 1420081.085: [Full GC (Last ditch collection) 2018-08-06T19:36:37.336+0300: 1420081.085: [CMS: 119501K->116331K(1048576K), 0.4379265 secs] 119501K->116331K(2971008K), [Metaspace: 162284K->162284K(1230848K)], 0.4383180 secs] [Times: user=0.76 sys=0.00, real=0.44 secs]
2018-08-06T19:36:37.777+0300: 1420081.526: [GC (CMS Initial Mark) [1 CMS-initial-mark: 116331K(1048576K)] 116340K(2971008K), 0.0028328 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
That is all; after the crash the server is restarted.
OOM information from the heap dump:
"C3P0PooledConnectionPoolManager[identityToken->mainDbPool, dataSourceName->mainDb]-HelperThread-#1" daemon prio=5 tid=33 RUNNABLE
at java.lang.OutOfMemoryError.<init>(OutOfMemoryError.java:48)
at com.mchange.v2.c3p0.impl.NewPooledConnection.carefulCheckReadOnly(NewPooledConnection.java:171)
at com.mchange.v2.c3p0.impl.NewPooledConnection.<init>(NewPooledConnection.java:123)
Local Variable: com.mchange.v2.c3p0.impl.DefaultConnectionTester#1
Local Variable: com.mchange.v2.c3p0.impl.NewPooledConnection#52
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:240)
Local Variable: com.mchange.v2.c3p0.DriverManagerDataSource#1
Local Variable: org.postgresql.jdbc.PgConnection#57
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:206)
Local Variable: com.mchange.v2.c3p0.WrapperConnectionPoolDataSource#2
Local Variable: java.lang.String#6941
Local Variable: java.lang.String#6943
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:203)
Local Variable: com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager#1
at com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1138)
at com.mchange.v2.resourcepool.BasicResourcePool.doAcquireAndDecrementPendingAcquiresWithinLockOnSuccess(BasicResourcePool.java:1125)
at com.mchange.v2.resourcepool.BasicResourcePool.access$700(BasicResourcePool.java:44)
at com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask.run(BasicResourcePool.java:1870)
Local Variable: com.mchange.v2.resourcepool.BasicResourcePool#1
at com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:696)
Local Variable: com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask#1
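
The Full GC lines in the log show the Metaspace figure unchanged before and after collection, which might mean classes keep being loaded and are never unloaded. One way to check this could be to log Metaspace usage together with the class loading counters at regular intervals. The following is only a minimal sketch: the class name `MetaspaceProbe` and its methods are made up for illustration, and looking the pool up by the name "Metaspace" assumes a HotSpot JVM on Java 8 or later.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of the application or of c3p0.
public class MetaspaceProbe {

    // Returns the usage of the memory pool named "Metaspace",
    // or null if this JVM does not expose a pool with that name (assumption: HotSpot).
    static MemoryUsage metaspaceUsage() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                return pool.getUsage();
            }
        }
        return null;
    }

    // Builds one log line with Metaspace usage and class loading counters.
    static String snapshot() {
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
        MemoryUsage usage = metaspaceUsage();
        String meta = (usage == null)
                ? "n/a"
                : (usage.getUsed() / 1024) + "K used/" + (usage.getCommitted() / 1024) + "K committed";
        return "metaspace=" + meta
                + " loaded=" + classes.getLoadedClassCount()
                + " totalLoaded=" + classes.getTotalLoadedClassCount()
                + " unloaded=" + classes.getUnloadedClassCount();
    }

    public static void main(String[] args) {
        // Prints a single snapshot; in the server this would be called periodically,
        // for example from a scheduled task.
        System.out.println(snapshot());
    }
}
```

If `totalLoaded` keeps growing over the two weeks while `unloaded` stays near zero, that would point to something generating or reloading classes whose class loaders remain reachable, rather than to the GC itself.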