Question

Background

I have a large data map (a HashMap) that is kept in memory and updated incrementally by a background thread, based on incoming messages:

<KEY> => <VALUE>
...

End users then query it through a REST API:

GET /lookup?key=<KEY>

Updates are not applied immediately. They become visible only once a special control message is received, i.e.:

MESSAGE: "Add A" 

A=<VALUE>   //Not visible yet

MESSAGE: "Add B"

B=<VALUE>   //Not visible yet

MESSAGE: "Commit"

//Updates are now visible to the end-users
A=<VALUE>
B=<VALUE>

The architecture I came up with is as follows:

volatile Map<String,Object> passiveCopy = new HashMap<>();
volatile Map<String,Object> activeCopy = new HashMap<>();

Map<String,Object> pendingUpdates = new HashMap<>();

//Interactive requests (REST API)
Object lookup(String key) {
     return activeCopy.get(key);
}

//Background thread processing the incoming messages.
//Messages are processed strictly sequentially
//i.e. no other message will be processed, until
//current handleMessage() invocation is completed
//(that is guaranteed by the message processing framework itself)
void handleMessage(Message msg) throws InterruptedException {

   //New updates go to the pending updates temporary map
   if(msg.type() == ADD) {
      pendingUpdates.put(msg.getKey(),msg.getValue()); 
   }


   if(msg.type() == COMMIT) {     
      //Apply updates to the passive copy of the map
      passiveCopy.putAll(pendingUpdates);

      //Swap active and passive map copies
      Map old = activeCopy; 
      activeCopy = passiveCopy;
      passiveCopy = old;

      //Grace period, wait for on-the-air requests to complete
      //REST API has a hard timeout of 100ms, so no client
      //will wait for the response longer than that 
      Thread.sleep(1000);

      //Re-apply updates to the now-passive (ex-active) copy of the map
      passiveCopy.putAll(pendingUpdates);

      //Reset the pendingUpdates map
      pendingUpdates.clear();
   }

}

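Question

Assuming that a write followed by a read of a volatile field creates a happens-before edge:

A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.

https://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4.5

and that the grace period is chosen correctly, I expect that all updates applied to passiveCopy (via putAll()) will become visible to end-user requests (all at once) after the swap.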

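Is that really the case, or is there any corner case that will make this approach fail?

Note

I know that creating a copy of the Map (so that a new Map instance is assigned to activeCopy each time) would be safe, but I don't want to do that (as it is really large).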

Answer 1

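Apart from your inconsistent use of activeMap and activeCopy (just remove activeCopy and swap only between activeMap and passiveCopy), your approach is sensible.

This answer quotes the JLS:

If x and y are actions of the same thread and x comes before y in program order, then hb(x, y) [x "happens-before" y].

An example is also given in this answer.

From this I take it that accesses to a volatile variable/field are basically sequence points. In your case, because the swap comes after the modification of the map in program order, the modification of the map is guaranteed to be complete before the access to the volatile field is actually performed. So there is no race condition here.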
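The happens-before argument above can be illustrated with a minimal sketch. The class and member names (VolatilePublishDemo, published, publishAndRead) are hypothetical and not taken from the question. A writer thread fills a plain HashMap and only then publishes it through a volatile field; a reader that observes the published reference should also observe every entry written before the publication.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the question's code.
class VolatilePublishDemo {
    // The volatile write/read pair on this field creates the happens-before edge.
    static volatile Map<String, Integer> published;

    // Fills a map with n entries on a writer thread, publishes it, and
    // returns the number of entries the calling (reader) thread observes.
    static int publishAndRead(int n) {
        published = null;
        Thread writer = new Thread(() -> {
            Map<String, Integer> map = new HashMap<>();
            for (int i = 0; i < n; i++) {
                map.put("key" + i, i);   // plain writes, before the volatile write
            }
            published = map;             // volatile write: publishes the filled map
        });
        writer.start();

        Map<String, Integer> seen;
        while ((seen = published) == null) {  // volatile read
            Thread.yield();
        }
        // The map is never modified after publication, so reading it is safe.
        int size = seen.size();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return size;
    }
}
```

Note that this sketch only covers writes made before the swap. Modifying the old map after the swap, as the question does, still depends on the grace period being long enough for in-flight readers to finish.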

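However, in most cases you should use synchronized or explicit locks to synchronize concurrent execution. The only reason to code around them is a need for high performance, i.e. massive parallelism, where either it is unacceptable for threads to block on a lock, or the desired degree of parallelism is so high that threads begin to starve.

That said, I believe you really should "invest" in proper mutual exclusion, preferably using a ReadWriteLock. Since locking and unlocking a ReadWriteLock has the same memory-visibility effects as synchronized (that is, it implies a memory barrier), you no longer need volatile.

For example: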

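import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
final Lock readLock = rwLock.readLock();
final Lock writeLock = rwLock.writeLock();

Map<String,Object> passiveCopy = new HashMap<>();
Map<String,Object> activeMap = new HashMap<>();

final Map<String,Object> pendingUpdates = new HashMap<>();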

//Interactive requests (REST API)
Object lookup(String key) {

  readLock.lock();
  try {
     return activeMap.get(key);
  } finally {
    readLock.unlock();
  }
}

//Background thread processing the incoming messages.
//Messages are processed strictly sequentially
//i.e. no other message will be processed, until
//current handleMessage() invocation is completed
//(that is guaranteed by the message processing framework itself)
void handleMessage(Message msg) {

   //New updates go to the pending updates temporary map
   if(msg.type() == ADD) {
      pendingUpdates.put(msg.getKey(),msg.getValue()); 
   }


   if(msg.type() == COMMIT) {     
      //Apply updates to the passive copy of the map
      passiveCopy.putAll(pendingUpdates);

      final Map tempMap = passiveCopy;    

      writeLock.lock();

      try {
        passiveCopy = activeMap;
        activeMap = tempMap;
      } finally {
        writeLock.unlock();
      }

      // Update the now-passive copy to the same state as the active map:
      passiveCopy.putAll(pendingUpdates);
      pendingUpdates.clear();
   }

}

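However, from your code I gather that "readers" should see a consistent version of the map during their "lifetime", which the code above does not guarantee: for example, if a single "reader" accesses the map twice, it may see two different maps. This can be solved by having each reader acquire the read lock itself before its first access to the map and release it after its last access. This may or may not work in your case, because if readers hold the lock for a long time, or there are many reader threads, they may block/starve the writer thread trying to commit updates.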
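As a rough sketch of that idea, the class below performs several lookups under a single read lock, so that all of them are answered from the same version of the map. The class name ConsistentReadStore and the methods add, commit and lookupAll are hypothetical; they only mirror the structure of the example above. It assumes, as in the question, that add and commit are called from a single background thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical class; names are illustrative, not from the original code.
class ConsistentReadStore {
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
    private Map<String, Object> activeMap = new HashMap<>();
    private Map<String, Object> passiveCopy = new HashMap<>();
    private final Map<String, Object> pendingUpdates = new HashMap<>();

    // Called only by the single background thread.
    void add(String key, Object value) {
        pendingUpdates.put(key, value);
    }

    // Called only by the single background thread.
    void commit() {
        passiveCopy.putAll(pendingUpdates);
        rwLock.writeLock().lock();
        try {
            // Swap the two copies while no reader holds the read lock.
            Map<String, Object> tmp = passiveCopy;
            passiveCopy = activeMap;
            activeMap = tmp;
        } finally {
            rwLock.writeLock().unlock();
        }
        // Bring the now-passive copy up to date as well.
        passiveCopy.putAll(pendingUpdates);
        pendingUpdates.clear();
    }

    // Looks up all keys under one read lock, so no swap can happen in between.
    Map<String, Object> lookupAll(String... keys) {
        rwLock.readLock().lock();
        try {
            Map<String, Object> result = new HashMap<>();
            for (String key : keys) {
                Object value = activeMap.get(key);
                if (value != null) {
                    result.put(key, value);
                }
            }
            return result;
        } finally {
            rwLock.readLock().unlock();
        }
    }
}
```

The longer a reader holds the read lock, the longer a pending commit has to wait for the write lock, which is the starvation trade-off mentioned above.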

Answer 2

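A volatile Map will be a problem if you need to add new entries atomically, so that a user never sees a state in which only some of the entries, rather than all of them, have been added.

The problem is that in Java a volatile reference only ensures the following:

It is guaranteed that the reference is always up to date and that all changes to it are visible from any thread
It is NOT guaranteed that the contents of the referenced object are always up to date

(found in this book)

I also checked the implementation of the HashMap class (assuming you are using a HashMap), where you can see that the method putAll(Map) simply calls the method putMapEntries(Map, boolean), which is implemented like this: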

/**
 * Implements Map.putAll and Map constructor
 *
 * @param m the map
 * @param evict false when initially constructing this map, else
 * true (relayed to method afterNodeInsertion).
 */
final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {
    int s = m.size();
    if (s > 0) {
        if (table == null) { // pre-size
            float ft = ((float)s / loadFactor) + 1.0F;
            int t = ((ft < (float)MAXIMUM_CAPACITY) ?
                     (int)ft : MAXIMUM_CAPACITY);
            if (t > threshold)
                threshold = tableSizeFor(t);
        }
        else if (s > threshold)
            resize();
        for (Map.Entry<? extends K, ? extends V> e : m.entrySet()) {
            K key = e.getKey();
            V value = e.getValue();
            putVal(hash(key), key, value, false, evict);
        }
    }
}

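So you can see that the method simply calls putVal(int, K, V, boolean, boolean) in a for loop, which is not an atomic update. This means there is no real difference between adding all entries with putAll(Map) and adding them one by one with put(K, V) in a for loop.

Conclusion: If you need to make sure that there is no situation in which a user can read a map with only some of the new elements added, volatile can NOT be used here. So (as you already mentioned) creating a copy of the map and swapping it in would be better (and safe). Although it uses twice as much memory, it will be faster, because volatile variables are usually really slow.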
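A minimal sketch of the copy-and-swap variant might look like the following. The class name CopyOnCommitStore and its methods are hypothetical and chosen only for illustration. Each commit builds a fresh map from the current one plus the pending updates and then assigns it to the reference in a single step, so readers see either the old map or the complete new one. The reference is still declared volatile here so that the newly built map is safely published to reader threads. It assumes add and commit are called from a single background thread.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical class; names are illustrative, not from the original code.
class CopyOnCommitStore {
    // Readers only ever see a fully built, never-modified map through this reference.
    private volatile Map<String, Object> active = Collections.emptyMap();
    private final Map<String, Object> pendingUpdates = new HashMap<>();

    // Interactive requests (REST API).
    Object lookup(String key) {
        return active.get(key);
    }

    // Called only by the single background thread.
    void add(String key, Object value) {
        pendingUpdates.put(key, value);
    }

    // Called only by the single background thread.
    void commit() {
        // Build the next version off to the side; readers cannot see it yet.
        Map<String, Object> next = new HashMap<>(active);
        next.putAll(pendingUpdates);
        // Publish the complete new version with a single reference assignment.
        active = Collections.unmodifiableMap(next);
        pendingUpdates.clear();
    }
}
```

The cost is a full copy of the map on every commit, which is exactly what the question wanted to avoid for a very large map.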

Updating and swapping a HashMap with volatile
