So we have a medium-sized JVM-based application, and for the past week or two it has been getting OOM-killed regularly by Docker. I have read everything I could find on Java 8 memory consumption in containers: the experimental cgroup flag, MaxRAM, controlling non-heap size, optimizing the GC, and so on. But there is no way to get the JVM to throw its own OutOfMemoryError in our case. It is always Docker that kills the container with exit code 137.
E.g. when giving 2000M of memory to the container and setting MaxRAM to 80% of that:
-XX:MaxRAM=1600M -XX:MaxRAMFraction=2
which means the heap will grow up to 800M, the result is still an OOM-kill by Docker. We started out with -Xmx between 800M and 1600M - same result.
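To double-check what heap cap the JVM actually derives from those flags, a minimal class like the following (the name HeapCheck is just illustrative) can be run inside the container with the same options:

    // HeapCheck.java - print the heap limit the JVM settled on.
    // Launch with the same -XX:MaxRAM=1600M -XX:MaxRAMFraction=2 flags inside the container.
    public class HeapCheck {
        public static void main(String[] args) {
            long maxHeap = Runtime.getRuntime().maxMemory();      // effective heap cap (~ -Xmx)
            long committed = Runtime.getRuntime().totalMemory();  // heap committed right now
            System.out.printf("max heap:       %d MiB%n", maxHeap / (1024 * 1024));
            System.out.printf("committed heap: %d MiB%n", committed / (1024 * 1024));
        }
    }

If that prints something close to 800 MiB, the flags are doing what we expect and the extra memory is going somewhere outside the heap.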
When controlling the non-heap size (assuming a max of 100 threads):
-XX:MaxRAM=1050M -XX:MaxRAMFraction=1 -Xss1M -XX:ReservedCodeCacheSize=128M -XX:MaxDirectMemorySize=64M -XX:CompressedClassSpaceSize=128M -XX:MaxMetaspaceSize=128M
and arriving at (100 * Xss) + 128M + 64M + 128M + 128M = 548M for the entire non-heap part of the JVM's memory requirements, we take the 2000M of container memory, minus a margin of 20%, minus the 548M of non-heap, giving us -XX:MaxRAM=1050M - and still we get OOM-killed.
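As a diagnostic (my own sketch, not something we already run), the part of that 548M the JVM can account for itself can be dumped via the platform MXBeans. Note this will not show thread stacks or other native allocations, which is often exactly what pushes the container's RSS over the limit:

    import java.lang.management.BufferPoolMXBean;
    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;
    import java.lang.management.MemoryPoolMXBean;

    // MemoryBreakdown: log where the JVM thinks its memory is going.
    public class MemoryBreakdown {
        public static void main(String[] args) {
            MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
            System.out.printf("heap used/committed:     %d / %d MiB%n",
                    mem.getHeapMemoryUsage().getUsed() >> 20,
                    mem.getHeapMemoryUsage().getCommitted() >> 20);
            System.out.printf("non-heap used/committed: %d / %d MiB%n",
                    mem.getNonHeapMemoryUsage().getUsed() >> 20,
                    mem.getNonHeapMemoryUsage().getCommitted() >> 20);

            // Individual pools: Metaspace, Compressed Class Space, Code Cache, ...
            for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                System.out.printf("%-30s %d MiB committed%n",
                        pool.getName(), pool.getUsage().getCommitted() >> 20);
            }

            // Direct and mapped byte buffers (the part -XX:MaxDirectMemorySize caps).
            for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
                System.out.printf("buffer pool %-18s %d MiB%n",
                        pool.getName(), pool.getMemoryUsed() >> 20);
            }
        }
    }

Comparing those numbers with the container's actual RSS (e.g. from docker stats) shows how much memory is unaccounted for by the budget above.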
Not sure if it matters, but we run a DC/OS cluster and it's Marathon that reports the task kills due to OOM. My understanding, though, is that what gets reported is the underlying Docker engine's behaviour.
Answer 0 (score: 0)
Please check which version of OpenJDK 8 you are using. Oracle backported support for cgroup limits to OpenJDK 8u131. See this Oracle blog article and the shorter explanation. The latter provides some useful code snippets for checking whether the JVM sets the heap size correctly inside a container.
If the JVM does set the heap size correctly in your case, then I would look for memory leaks in the application.
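A minimal sketch of that kind of in-container check (the cgroup file path below is the cgroup v1 location and is my assumption, not taken from the linked articles) would be to print the JDK version together with the memory limit the container actually sees:

    import java.nio.file.Files;
    import java.nio.file.Paths;

    // CgroupCheck: print the JDK version and the cgroup memory limit visible in the container.
    public class CgroupCheck {
        public static void main(String[] args) throws Exception {
            // cgroup-limit support was backported to OpenJDK 8u131 (behind experimental flags).
            System.out.println("java.version = " + System.getProperty("java.version"));

            // cgroup v1 location; under cgroup v2 the file is /sys/fs/cgroup/memory.max instead.
            String limit = Files.readAllLines(
                    Paths.get("/sys/fs/cgroup/memory/memory.limit_in_bytes")).get(0).trim();
            System.out.println("cgroup memory limit (bytes) = " + limit);
        }
    }

If the version is older than 8u131, or the JVM's heap limit ignores that cgroup value, the container-awareness support is not in effect.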