Question

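I am aggregating multiple values for keys in a multi-threaded environment. The keys are not known in advance. I thought I would do something like this: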

class Aggregator {
    protected ConcurrentHashMap<String, List<String>> entries =
                            new ConcurrentHashMap<String, List<String>>();
    public Aggregator() {}

    public void record(String key, String value) {
        List<String> newList =
                    Collections.synchronizedList(new ArrayList<String>());
        List<String> existingList = entries.putIfAbsent(key, newList);
        List<String> values = existingList == null ? newList : existingList;
        values.add(value);
    }
}

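The problem I see is that every time this method runs, I need to create a new instance of an ArrayList, which I then throw away (in most cases). This seems like unjustified abuse of the garbage collector. Is there a better, thread-safe way of initializing this kind of structure without having to synchronize the record method? I am somewhat surprised by the decision to have the putIfAbsent method not return the newly created element, and by the lack of a way to defer instantiation unless it is called for (so to speak).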

Answer 1

Java 8 introduced an API that caters for this exact problem, making a one-line solution possible:

public void record(String key, String value) {
    entries.computeIfAbsent(key, k -> Collections.synchronizedList(new ArrayList<String>())).add(value);
}

For Java 7:

public void record(String key, String value) {
    List<String> values = entries.get(key);
    if (values == null) {
        entries.putIfAbsent(key, Collections.synchronizedList(new ArrayList<String>()));
        // At this point, there will definitely be a list for the key.
        // We don't know or care which thread's new object is in there, so:
        values = entries.get(key);
    }
    values.add(value);
}

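This is the standard code pattern when populating a ConcurrentHashMap.

The special method putIfAbsent(K, V) will either put your value object in or, if another thread got there before you, ignore your value object. Either way, after the call to putIfAbsent(K, V), get(key) is guaranteed to be consistent between threads, and therefore the above code is thread-safe.

The only wasted overhead occurs when some other thread adds a new entry for the same key at the same time: you may end up throwing away the newly created value. That only happens when there is not already an entry and there is a race that your thread loses, which is typically rare.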
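To make the return-value behaviour of putIfAbsent concrete, here is a minimal single-threaded sketch. The class name PutIfAbsentDemo and the helper winner are hypothetical names chosen for illustration; only ConcurrentHashMap.putIfAbsent itself comes from the JDK.

```java
import java.util.concurrent.ConcurrentHashMap;

class PutIfAbsentDemo {
    /** Returns the value that ends up mapped to the key: the candidate if it was inserted, otherwise the existing value. */
    static String winner(ConcurrentHashMap<String, String> map, String key, String candidate) {
        // putIfAbsent returns null when the candidate was inserted,
        // or the previously mapped value when the key was already present
        // (in which case the candidate is ignored).
        String previous = map.putIfAbsent(key, candidate);
        return previous == null ? candidate : previous;
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        System.out.println(winner(map, "k", "first"));   // prints "first"
        System.out.println(winner(map, "k", "second"));  // prints "first"
    }
}
```

The second call shows the "lost race" case: the candidate object is created, ignored by the map, and left for the garbage collector.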

Answer 2

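Since Java 8 you can create multimaps using the following pattern:

public void record(String key, String value) {
    entries.computeIfAbsent(key, k -> Collections.synchronizedList(new ArrayList<String>()))
           .add(value);
}

The ConcurrentHashMap documentation (not the general contract) specifies that the ArrayList will be created only once per key, at the slight initial cost of delaying other updates while the ArrayList is being created for a new key: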

http://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html#computeIfAbsent-K-java.util.function.Function-
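The "created only once per key" guarantee can be checked with a small experiment. The sketch below is an illustration only: the class ComputeOnceDemo, the method run, and the counter are hypothetical names, and a latch is used merely to make the threads contend for the same key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class ComputeOnceDemo {
    /** Starts the given number of threads, each recording one value under the same key.
     *  Returns {number of lists created, number of values stored}. */
    static int[] run(int threads) {
        ConcurrentHashMap<String, List<String>> entries = new ConcurrentHashMap<>();
        AtomicInteger created = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final String value = "v" + i;
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // release all threads at roughly the same time
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                entries.computeIfAbsent("key", k -> {
                    created.incrementAndGet(); // counts invocations of the mapping function
                    return Collections.synchronizedList(new ArrayList<String>());
                }).add(value);
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { created.get(), entries.get("key").size() };
    }

    public static void main(String[] args) {
        int[] result = run(8);
        System.out.println("lists created: " + result[0] + ", values stored: " + result[1]);
    }
}
```

With ConcurrentHashMap the mapping function runs at most once per key, so the first number should be 1 regardless of the thread count, while the second equals the number of threads.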

Answer 3

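In the end, I implemented a slight modification of @Bohemian's answer. His proposed solution overwrites the values variable with the result of the putIfAbsent call, which creates the same problem I had before. The code that seems to work looks like this: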

    public void record(String key, String value) {
        List<String> values = entries.get(key);
        if (values == null) {
            values = Collections.synchronizedList(new ArrayList<String>());
            List<String> values2 = entries.putIfAbsent(key, values);
            if (values2 != null)
                values = values2;
        }
        values.add(value);
    }

It's not as elegant as I'd like, but it's better than the original, which creates a new ArrayList instance on every call.

Answer 4

I created two versions based on Gene's answer:

public  static <K,V> void putIfAbsetMultiValue(ConcurrentHashMap<K,List<V>> entries, K key, V value) {
    List<V> values = entries.get(key);
    if (values == null) {
        values = Collections.synchronizedList(new ArrayList<V>());
        List<V> values2 = entries.putIfAbsent(key, values);
        if (values2 != null)
            values = values2;
    }
    values.add(value);
}

public  static <K,V> void putIfAbsetMultiValueSet(ConcurrentMap<K,Set<V>> entries, K key, V value) {
    Set<V> values = entries.get(key);
    if (values == null) {
        values = Collections.synchronizedSet(new HashSet<V>());
        Set<V> values2 = entries.putIfAbsent(key, values);
        if (values2 != null)
            values = values2;
    }
    values.add(value);
}

Works well.

Answer 5

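This is a problem I was also looking for an answer to. The putIfAbsent method does not actually solve the extra object creation problem; it only makes sure that one of those objects does not replace another. Race conditions among threads can still cause multiple objects to be instantiated. I could find 3 solutions to this problem (and I would follow this order of preference):

1- If you are on Java 8, the best way to achieve this is probably the new computeIfAbsent method of ConcurrentMap. You just need to give it a computation function, which will be executed synchronously (at least for the ConcurrentHashMap implementation). For example: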

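private final ConcurrentMap<String, List<String>> entries =
        new ConcurrentHashMap<String, List<String>>();

public void method1(String key, String value) {
    // The list is wrapped because a bare ArrayList is not safe for
    // concurrent add() calls from several threads on the same key.
    entries.computeIfAbsent(key, s -> Collections.synchronizedList(new ArrayList<String>()))
            .add(value);
}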

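This is from the Javadoc of ConcurrentHashMap.computeIfAbsent:

"If the specified key is not already associated with a value, attempts to compute its value using the given mapping function and enters it into this map unless null. The entire method invocation is performed atomically, so the function is applied at most once per key. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this map."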

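2- If you can't use Java 8, you can use Guava's LoadingCache, which is thread-safe. You define a load function for it (just like the compute function above), and you can be sure that it will be called synchronously. For example:

// Requires Guava: com.google.common.cache.{CacheBuilder, CacheLoader, LoadingCache}
private final LoadingCache<String, List<String>> entries = CacheBuilder.newBuilder()
        .build(new CacheLoader<String, List<String>>() {
            @Override
            public List<String> load(String s) throws Exception {
                // Wrapped so that concurrent add() calls on the same list are safe.
                return Collections.synchronizedList(new ArrayList<String>());
            }
        });

public void method2(String key, String value) {
    entries.getUnchecked(key).add(value);
}

3- If you can't use Guava either, you can always synchronize manually and do double-checked locking.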
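A rough sketch of what manual synchronization with double-checked locking could look like for this aggregator. This is not the answer author's original code; the class name LockedAggregator, the lock field, and the listsCreated counter are hypothetical and exist only to illustrate the idea.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class LockedAggregator {
    private final ConcurrentMap<String, List<String>> entries = new ConcurrentHashMap<>();
    private final Object lock = new Object();
    private final AtomicInteger listsCreated = new AtomicInteger(); // for demonstration only

    void record(String key, String value) {
        List<String> values = entries.get(key);      // first check, without locking
        if (values == null) {
            synchronized (lock) {                    // lock is taken only when the key looks missing
                values = entries.get(key);           // second check, under the lock
                if (values == null) {
                    values = Collections.synchronizedList(new ArrayList<String>());
                    listsCreated.incrementAndGet();
                    entries.put(key, values);
                }
            }
        }
        values.add(value);
    }

    int listsCreated() {
        return listsCreated.get();
    }

    List<String> valuesFor(String key) {
        return entries.get(key);
    }
}
```

Because the list is only ever created while holding the lock and after the second check, at most one list is instantiated per key. The trade-off is a single coarse lock shared by all keys on the slow path.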

I made sample implementations of all 3 of these methods, plus the non-synchronized method that causes the extra object creation: http://pastebin.com/qZ4DUjTr

Answer 6

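The waste of memory (and the GC pressure, etc.) caused by creating empty ArrayLists is handled as of Java 1.7.40. Don't worry about creating an empty ArrayList. Reference: http://javarevisited.blogspot.com.tr/2014/07/java-optimization-empty-arraylist-and-Hashmap-cost-less-memory-jdk-17040-update.html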

Answer 7

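The approach using putIfAbsent has the fastest execution time: it is 2 to 50 times faster than the "lambda" approach in environments with high contention. The lambda itself is not the reason behind this "power loss"; the issue is the compulsory synchronization inside computeIfAbsent prior to the Java 9 optimizations.

The benchmark: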

import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class ConcurrentHashMapTest {
    private final static int numberOfRuns = 1000000;
    private final static int numberOfThreads = Runtime.getRuntime().availableProcessors();
    private final static int keysSize = 10;
    private final static String[] strings = new String[keysSize];
    static {
        for (int n = 0; n < keysSize; n++) {
            strings[n] = "" + (char) ('A' + n);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        for (int n = 0; n < 20; n++) {
            testPutIfAbsent();
            testComputeIfAbsentLamda();
        }
    }

    private static void testPutIfAbsent() throws InterruptedException {
        final AtomicLong totalTime = new AtomicLong();
        final ConcurrentHashMap<String, AtomicInteger> map = new ConcurrentHashMap<String, AtomicInteger>();
        final Random random = new Random();
        ExecutorService executorService = Executors.newFixedThreadPool(numberOfThreads);

        for (int i = 0; i < numberOfThreads; i++) {
            executorService.execute(new Runnable() {
                @Override
                public void run() {
                    long start, end;
                    for (int n = 0; n < numberOfRuns; n++) {
                        String s = strings[random.nextInt(strings.length)];
                        start = System.nanoTime();

                        AtomicInteger count = map.get(s);
                        if (count == null) {
                            count = new AtomicInteger(0);
                            AtomicInteger prevCount = map.putIfAbsent(s, count);
                            if (prevCount != null) {
                                count = prevCount;
                            }
                        }
                        count.incrementAndGet();
                        end = System.nanoTime();
                        totalTime.addAndGet(end - start);
                    }
                }
            });
        }
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
        System.out.println("Test " + Thread.currentThread().getStackTrace()[1].getMethodName()
                + " average time per run: " + (double) totalTime.get() / numberOfThreads / numberOfRuns + " ns");
    }

    private static void testComputeIfAbsentLamda() throws InterruptedException {
        final AtomicLong totalTime = new AtomicLong();
        final ConcurrentHashMap<String, AtomicInteger> map = new ConcurrentHashMap<String, AtomicInteger>();
        final Random random = new Random();
        ExecutorService executorService = Executors.newFixedThreadPool(numberOfThreads);
        for (int i = 0; i < numberOfThreads; i++) {
            executorService.execute(new Runnable() {
                @Override
                public void run() {
                    long start, end;
                    for (int n = 0; n < numberOfRuns; n++) {
                        String s = strings[random.nextInt(strings.length)];
                        start = System.nanoTime();

                        AtomicInteger count = map.computeIfAbsent(s, (k) -> new AtomicInteger(0));
                        count.incrementAndGet();

                        end = System.nanoTime();
                        totalTime.addAndGet(end - start);
                    }
                }
            });
        }
        executorService.shutdown();
        executorService.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
        System.out.println("Test " + Thread.currentThread().getStackTrace()[1].getMethodName()
                + " average time per run: " + (double) totalTime.get() / numberOfThreads / numberOfRuns + " ns");
    }

}

The results:

Test testPutIfAbsent average time per run: 115.756501 ns
Test testComputeIfAbsentLamda average time per run: 276.9667055 ns
Test testPutIfAbsent average time per run: 134.2332435 ns
Test testComputeIfAbsentLamda average time per run: 223.222063625 ns
Test testPutIfAbsent average time per run: 119.968893625 ns
Test testComputeIfAbsentLamda average time per run: 216.707419875 ns
Test testPutIfAbsent average time per run: 116.173902375 ns
Test testComputeIfAbsentLamda average time per run: 215.632467375 ns
Test testPutIfAbsent average time per run: 112.21422775 ns
Test testComputeIfAbsentLamda average time per run: 210.29563725 ns
Test testPutIfAbsent average time per run: 120.50643475 ns
Test testComputeIfAbsentLamda average time per run: 200.79536475 ns
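Given the synchronization cost described in this answer, one workaround sometimes used on Java 8 is to try a plain get first and fall back to computeIfAbsent only when the key is missing. The sketch below is an assumption-laden illustration, not part of the benchmark above and not measured here; the class name FastPathCounter and the method increment are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class FastPathCounter {
    private final ConcurrentHashMap<String, AtomicInteger> counts = new ConcurrentHashMap<>();

    /** Increments the counter for the key and returns the new count. */
    int increment(String key) {
        // Fast path: a plain read, which succeeds once the key exists.
        AtomicInteger count = counts.get(key);
        if (count == null) {
            // Slow path: only reached for keys not seen yet; still creates
            // at most one AtomicInteger per key.
            count = counts.computeIfAbsent(key, k -> new AtomicInteger());
        }
        return count.incrementAndGet();
    }

    public static void main(String[] args) {
        FastPathCounter counter = new FastPathCounter();
        counter.increment("A");
        System.out.println(counter.increment("A")); // prints 2
    }
}
```

This keeps the at-most-once creation guarantee of computeIfAbsent while avoiding its locking for keys that are already present, which is the common case when there are few distinct keys.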

ConcurrentHashMap: avoid extra object creation with "putIfAbsent"?

7 answers: