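EhCache + JGroups gives "Exception on flushing of replication queue: null"

Time: 2012-02-10 13:28:52

Tags: java replication ehcache jgroups

I am trying to configure EhCache with JGroups-based replication, but as soon as the first element is added to the cache, the following exception is logged: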

12061 [Replication Thread] ERROR net.sf.ehcache.distribution.RMIAsynchronousCacheReplicator - Exception on flushing of replication queue: null. Continuing...
java.lang.NullPointerException
    at net.sf.ehcache.distribution.RMISynchronousCacheReplicator.listRemoteCachePeers(RMISynchronousCacheReplicator.java:335)
    at net.sf.ehcache.distribution.RMIAsynchronousCacheReplicator.flushReplicationQueue(RMIAsynchronousCacheReplicator.java:299)
    at net.sf.ehcache.distribution.RMIAsynchronousCacheReplicator.replicationThreadMain(RMIAsynchronousCacheReplicator.java:119)
    at net.sf.ehcache.distribution.RMIAsynchronousCacheReplicator.access$100(RMIAsynchronousCacheReplicator.java:57)
    at net.sf.ehcache.distribution.RMIAsynchronousCacheReplicator$ReplicationThread.run(RMIAsynchronousCacheReplicator.java:371)

ehcache.xml looks like this:

<?xml version="1.0" encoding="UTF-8"?>       
<ehcache 
  updateCheck="true" 
  monitoring="autodetect"
  defaultTransactionTimeoutInSeconds="30" 
  dynamicConfig="true">

  <cacheManagerPeerProviderFactory
    class="net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProviderFactory"
    properties="jgroups.xml"
  />

  <defaultCache 
    maxElementsInMemory="200"
    eternal="false"
    statistics="true"
    timeToIdleSeconds="86400"
    timeToLiveSeconds="86400"    
    overflowToDisk="false">    
    <cacheEventListenerFactory
      class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"
      properties="replicateAsynchronously=true, replicatePuts=true, replicateUpdates=true, replicateUpdatesViaCopy=true, replicateRemovals=true"
    />
    <bootstrapCacheLoaderFactory class="net.sf.ehcache.distribution.RMIBootstrapCacheLoaderFactory" />    
  </defaultCache>

</ehcache>

jgroups.xml looks like this:

<?xml version='1.0'?>
<config>
    <TCP start_port="7800" />
    <TCPPING 
       timeout="3000" 
       initial_hosts="localhost[7800],localhost[7800]"
       port_range="10" 
       num_initial_members="2" />
    <VERIFY_SUSPECT timeout="1500" />
    <pbcast.NAKACK 
       use_mcast_xmit="false"
       gc_lag="100"
       retransmit_timeout="300,600,1200,2400,4800"
       discard_delivered_msgs="true" />
    <pbcast.STABLE
       stability_delay="1000"
       desired_avg_gossip="50000"
       max_bytes="400000" />
    <pbcast.GMS
       print_local_addr="true"
       join_timeout="5000"
       shun="false"
       view_bundling="true" />
</config>

I am using jgroups 2.8.1.GA, ehcache-core 2.5.1 and ehcache-jgroupsreplication 1.5.

What am I doing wrong?

Update: when I change to replicateAsynchronously=false, I get the following exception:

Exception in thread "main" java.lang.NullPointerException
    at net.sf.ehcache.distribution.RMISynchronousCacheReplicator.listRemoteCachePeers(RMISynchronousCacheReplicator.java:335)
    at net.sf.ehcache.distribution.RMISynchronousCacheReplicator.replicatePutNotification(RMISynchronousCacheReplicator.java:145)
    at net.sf.ehcache.distribution.RMISynchronousCacheReplicator.notifyElementPut(RMISynchronousCacheReplicator.java:132)
    at net.sf.ehcache.event.RegisteredEventListeners.notifyListener(RegisteredEventListeners.java:294)
    at net.sf.ehcache.event.RegisteredEventListeners.invokeListener(RegisteredEventListeners.java:284)
    at net.sf.ehcache.event.RegisteredEventListeners.internalNotifyElementPut(RegisteredEventListeners.java:144)
    at net.sf.ehcache.event.RegisteredEventListeners.notifyElementPut(RegisteredEventListeners.java:122)
    at net.sf.ehcache.Cache.notifyPutInternalListeners(Cache.java:1515)
    at net.sf.ehcache.Cache.putInternal(Cache.java:1490)
    at net.sf.ehcache.Cache.put(Cache.java:1417)
    at net.sf.ehcache.Cache.put(Cache.java:1382)

Update 2: an issue was created in Terracotta's JIRA: https://jira.terracotta.org/jira/browse/EHC-927

2 answers:

Answer 0 (score: 2)

As Chris pointed out in EHC-927, I was using the wrong cacheEventListenerFactory class. It should be net.sf.ehcache.distribution.jgroups.JGroupsCacheReplicatorFactory instead of net.sf.ehcache.distribution.RMICacheReplicatorFactory.
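
The stack trace fits this explanation. As far as I can tell, the RMI replicator asks the CacheManager for a peer provider registered under the "RMI" scheme, and when only a JGroups provider is configured that lookup comes back null. Below is a minimal, self-contained sketch of that failure mode. It is not Ehcache's actual code: PeerProvider, register, find and listPeersUnchecked are hypothetical names used only for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SchemeLookupSketch {

    /** Hypothetical stand-in for a cache manager peer provider. */
    interface PeerProvider {
        List<String> listRemotePeers();
    }

    // Providers keyed by replication scheme, e.g. "RMI" or "JGroups" (assumed keys).
    private final Map<String, PeerProvider> providers = new HashMap<>();

    void register(String scheme, PeerProvider provider) {
        providers.put(scheme, provider);
    }

    /** Returns null when no provider is registered for the given scheme. */
    PeerProvider find(String scheme) {
        return providers.get(scheme);
    }

    /** Mimics a replicator that assumes its provider always exists. */
    List<String> listPeersUnchecked(String scheme) {
        // Throws NullPointerException if the scheme was never registered.
        return find(scheme).listRemotePeers();
    }

    public static void main(String[] args) {
        SchemeLookupSketch manager = new SchemeLookupSketch();
        // Only a JGroups provider is registered, as in the ehcache.xml from the question.
        manager.register("JGroups", () -> Collections.singletonList("nodeB"));

        System.out.println(manager.listPeersUnchecked("JGroups"));
        try {
            manager.listPeersUnchecked("RMI");
        } catch (NullPointerException e) {
            System.out.println("No provider registered for scheme RMI");
        }
    }
}
```

Under that assumption, any RMI-specific factory left in a JGroups setup would hit the same missing lookup, so it may be worth checking the other RMI class names in the configuration as well.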

Answer 1 (score: 1)

I have checked the source code of the RMIAsynchronousCacheReplicator class:

http://www.jarvana.com/jarvana/view/net/sf/ehcache/ehcache-core/2.1.0/ehcache-core-2.1.0-sources.jar!/net/sf/ehcache/distribution/RMIAsynchronousCacheReplicator.java?format=ok

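Something is not right when flushReplicationQueue() is called; it should also check replicationQueue != null, not only replicationQueue.size() == 0, just as the while loop tests the thread's alive().

You cannot flush an object that does not exist or has not been initialized. And how would the code know whether the object is empty if it does not exist or is uninitialized? Simply catching the NullPointerException is not a good way to tell the user about it!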

/**
 * RemoteDebugger method for the replicationQueue thread.
 * <p/>
 * Note that the replicationQueue thread locks the cache for the entire time it is writing elements to the disk.
 */
private void replicationThreadMain() {
    while (true) {
        // Wait for elements in the replicationQueue
        while (alive() && replicationQueue != null && replicationQueue.size() == 0) {
            try {
                Thread.sleep(asynchronousReplicationInterval);
            } catch (InterruptedException e) {
                LOG.debug("Spool Thread interrupted.");
                return;
            }
        }
        if (notAlive()) {
            return;
        }
        try {
            if (replicationQueue.size() != 0) {
                flushReplicationQueue();
            }
        } catch (Throwable e) {
            LOG.error("Exception on flushing of replication queue: " + e.getMessage() + ". Continuing...", e);
        }
    }
}
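
The null check argued for above could look roughly like the following. This is a simplified, self-contained sketch and not a patch for Ehcache: GuardedFlushSketch and its flush() method are hypothetical, and the queue holds plain strings instead of replication events.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

public class GuardedFlushSketch {

    // May be null if initialization failed (assumption made for this sketch).
    private final Queue<String> replicationQueue;

    GuardedFlushSketch(Queue<String> replicationQueue) {
        this.replicationQueue = replicationQueue;
    }

    /**
     * Drains the queue and returns the number of drained entries.
     * Fails with a descriptive message instead of a bare NullPointerException.
     */
    int flush() {
        if (replicationQueue == null) {
            throw new IllegalStateException("Replication queue was never initialized");
        }
        int flushed = 0;
        while (replicationQueue.poll() != null) {
            flushed++;
        }
        return flushed;
    }

    public static void main(String[] args) {
        Queue<String> queue = new ArrayDeque<String>(Arrays.asList("put:a", "put:b"));
        System.out.println("Flushed " + new GuardedFlushSketch(queue).flush() + " entries");

        try {
            new GuardedFlushSketch(null).flush();
        } catch (IllegalStateException e) {
            System.out.println("Refused to flush: " + e.getMessage());
        }
    }
}
```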

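The sleep in that loop is there only to keep CPU usage from jumping to around 50% while the thread spins in the while loop doing nothing; an increase of that size could lead users to think something is wrong with Ehcache all the time...

You may need to add the asynchronousReplicationInterval property with a smaller value (100 ms to 150 ms) so that the replication queue can be built. Append it as follows: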

properties="replicateAsynchronously=true, 
replicatePuts=true, 
replicateUpdates=true, 
replicateUpdatesViaCopy=true, 
replicateRemovals=true, 
asynchronousReplicationInterval=100"

It is probably required by the RMIAsynchronousCacheReplicator constructor below:

/**
 * Constructor for internal and subclass use
 */
public RMIAsynchronousCacheReplicator(
        boolean replicatePuts,
        boolean replicatePutsViaCopy,
        boolean replicateUpdates,
        boolean replicateUpdatesViaCopy,
        boolean replicateRemovals,
        int asynchronousReplicationInterval) {
    super(replicatePuts,
            replicatePutsViaCopy,
            replicateUpdates,
            replicateUpdatesViaCopy,
            replicateRemovals);
    this.asynchronousReplicationInterval = asynchronousReplicationInterval;
    status = Status.STATUS_ALIVE;
    replicationThread.start();
}

Perhaps you can ignore the problem for now and let someone else report the bug, if it is even considered a bug... I wonder why it says "Continuing..." afterwards...