OOM during EhCache replication

Asked: 2012-09-06 12:44:22

Tags: jboss ehcache jgroups

We ran into an OOM problem while using EhCache replication. A heap dump shows JGroups-related objects at the top of the instance counts:

Instance Counts for All Classes (excluding platform)
464012 instances of class org.jgroups.util.Headers
463718 instances of class org.jgroups.protocols.pbcast.NakAckHeader
463512 instances of class [Lorg.jgroups.Header;
462136 instances of class org.jgroups.Message
173509 instances of class org.jgroups.protocols.TpHeader
63301 instances of class com.mongodb.BasicDBObject

We also see the following warnings in the log:

2012-08-26 02:05:50,980 INFO  [org.jgroups.JChannel] (main) JGroups version: 2.10.0.GA
2012-08-26 02:05:51,569 WARN  [org.jgroups.stack.Configurator] (main) TCPPING property down_thread was deprecated and is ignored
2012-08-26 02:05:51,569 WARN  [org.jgroups.stack.Configurator] (main) TCPPING property up_thread was deprecated and is ignored
2012-08-26 02:05:51,576 WARN  [org.jgroups.stack.Configurator] (main) VERIFY_SUSPECT property down_thread was deprecated and is ignored
2012-08-26 02:05:51,576 WARN  [org.jgroups.stack.Configurator] (main) VERIFY_SUSPECT property up_thread was deprecated and is ignored
2012-08-26 02:05:51,584 WARN  [org.jgroups.stack.Configurator] (main) NAKACK property down_thread was deprecated and is ignored
2012-08-26 02:05:51,584 WARN  [org.jgroups.stack.Configurator] (main) NAKACK property up_thread was deprecated and is ignored
2012-08-26 02:05:51,629 WARN  [org.jgroups.stack.Configurator] (main) GMS property join_retry_timeout was deprecated and is ignored
2012-08-26 02:05:51,629 WARN  [org.jgroups.stack.Configurator] (main) GMS property shun was deprecated and is ignored
2012-08-26 02:05:51,629 WARN  [org.jgroups.stack.Configurator] (main) GMS property down_thread was deprecated and is ignored
2012-08-26 02:05:51,629 WARN  [org.jgroups.stack.Configurator] (main) GMS property up_thread was deprecated and is ignored
2012-08-26 02:05:51,734 WARN  [org.jgroups.protocols.pbcast.NAKACK] (main) use_mcast_xmit should not be used because the transport (TCP) does not support IP multicasting; setting use_mcast_xmit to false
2012-08-26 02:05:58,539 WARN  [org.jgroups.protocols.pbcast.GMS] (main) join(host_x-17490) sent to host_x-5955 timed out (after 5000 ms), retrying
2012-08-26 02:06:01,601 INFO  [net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProvider] (main) JGroups Replication started for 'EH_CACHE'. JChannel: local_addr=host_x-17490
cluster_name=EH_CACHE
my_view=[host_x-17490|0] [host_x-17490]

Environment:

CentOS release 5.4 (Final)
JBoss-4.2.3 GA
Java: 1.6.0_21
RAM: 8 GB
Hosts (machines): host_x, host_y

Library versions we use:

jgroups-2.10.0.GA.jar
ehcache-jgroupsreplication-1.5.jar
ehcache-core-2.5.0.jar

EhCache configuration (ehcache.xml):

<ehcache>
    <cacheManagerPeerProviderFactory class="net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProviderFactory"
                properties="connect=TCP(bind_port=7800):
                    TCPPING(initial_hosts=host_x[7800],host_y[7800];port_range=5;timeout=3000;
                    num_initial_members=3;up_thread=true;down_thread=true):
                    VERIFY_SUSPECT(timeout=1500;down_thread=false;up_thread=false):
                    pbcast.NAKACK(down_thread=true;up_thread=true;gc_lag=100;retransmit_timeout=3000):
                    pbcast.GMS(join_timeout=5000;join_retry_timeout=2000;shun=false;
                    print_local_addr=false;down_thread=true;up_thread=true)"
                propertySeparator="::" />

    <cache name="RECORD_CACHE" maxElementsInMemory="25000" eternal="false"
           overflowToDisk="false" memoryStoreEvictionPolicy="LFU" timeToLiveSeconds="900" >
        <cacheEventListenerFactory
                class="net.sf.ehcache.distribution.jgroups.JGroupsCacheReplicatorFactory"
                properties="replicateAsynchronously=true, replicatePuts=false, replicateUpdates=false,
                    replicateUpdatesViaCopy=false, replicateRemovals=true" />
    </cache>
</ehcache>

We have checked (via telnet) that port 7800 on host_y is reachable from host_x, and vice versa.

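Could you help us find the root cause of the OOM? We suspect the replication is misconfigured, but so far we have not been able to pin down where the mistake is.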

Thanks for any advice or suggestions!

1 Answer:

Answer 0: (score: 2)

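Your JGroups config is completely off!

First, it was probably copied from a very old version. Second, STABLE is missing, which means messages are never garbage collected! I suggest starting from the tcp.xml or udp.xml shipped with JGroups 2.10 (the version you are using).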
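
As a rough sketch of the second point, the `connect` string from the question could be extended with `pbcast.STABLE` between `pbcast.NAKACK` and `pbcast.GMS`, with the properties that the log reports as deprecated and ignored dropped. The STABLE parameter values below are assumptions recalled from the stock tcp.xml, so check them against the tcp.xml inside your jgroups-2.10.0.GA.jar. This is not a complete stack: the stock file also contains other protocols (failure detection, flow control, fragmentation) that are left out here.

```xml
<!-- Sketch only: the pbcast.STABLE values are assumptions; verify them
     against the tcp.xml shipped in jgroups-2.10.0.GA.jar.
     Removed: up_thread, down_thread, join_retry_timeout, shun
     (all reported as "deprecated and is ignored" in the log above). -->
<cacheManagerPeerProviderFactory
        class="net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProviderFactory"
        properties="connect=TCP(bind_port=7800):
            TCPPING(initial_hosts=host_x[7800],host_y[7800];port_range=5;timeout=3000;
            num_initial_members=3):
            VERIFY_SUSPECT(timeout=1500):
            pbcast.NAKACK(use_mcast_xmit=false;gc_lag=100;retransmit_timeout=3000):
            pbcast.STABLE(stability_delay=1000;desired_avg_gossip=50000;max_bytes=400000):
            pbcast.GMS(join_timeout=5000;print_local_addr=false)"
        propertySeparator="::" />
```

With STABLE in the stack, members periodically agree on which messages everyone has received, so NAKACK can purge them from its retransmission table instead of holding on to every `Message` and its headers, which is what the heap dump in the question shows.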