How can I efficiently sum two TreeMaps in Java?

Asked: 2015-01-08 13:12:16

Tags: java hadoop treemap

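I have several TreeMaps and want to combine them efficiently into a single TreeMap, summing the values of entries that share the same key. Something like: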

   TreeMap<String,Long> sum(TreeMap<String,Long> tm1,TreeMap<String,Long> tm2);

I tried the following, but I cannot convert the merged list back into a TreeMap, and keys that are equal appear twice in it:

    TreeMap<String,Long> tm1=new TreeMap<String, Long>();
    ...
    TreeMap<String,Long> tm2=new TreeMap<String, Long>();
    ...       
    List<Map.Entry<String,Long>> first = new ArrayList<Map.Entry<String,Long>>(tm1.entrySet());
    List<Map.Entry<String,Long>> second = new ArrayList<Map.Entry<String,Long>>(tm2.entrySet());
    Iterable<Map.Entry<String,Long>> all = Iterables.mergeSorted(
            ImmutableList.of(first, second), new Ordering<Map.Entry<String, Long>>() {
        @Override
        public int compare(java.util.Map.Entry<String, Long> stringLongEntry, java.util.Map.Entry<String, Long> stringLongEntry2) {
            return stringLongEntry.getKey().compareTo(stringLongEntry2.getKey());
        }
    });
    TreeMap<String,Long> mappedMovies = Maps.uniqueIndex(... ??)

Edit: I cannot use Java 8, because the program runs as a Hadoop job on Amazon Web Services, which supports only Java 1.7.

2 answers:

Answer 0 (score: 3)

You can use Java 8 Streams for this. Consider the following code:

// Testdata - The first map
Map<String, Long> m1 = new TreeMap<>();
m1.put("A", 1L);
m1.put("B", 1L);
m1.put("C", 1L);

// Testdata - The second map
Map<String, Long> m2 = new TreeMap<>();
m2.put("C", 2L);
m2.put("D", 2L);
m2.put("E", 2L);

// Summarize using streams
final Map<String, Long> summarized =
        Stream.concat(m1.entrySet().stream(), m2.entrySet().stream())     // Stream both maps
              .collect(Collectors.groupingBy(                             // Collect the map
                          Map.Entry::getKey,                              // Group by key
                          Collectors.summingLong(Map.Entry::getValue)));  // Value is the sum

System.out.println("Summarized: " + summarized);                          // Print the output

The resulting Map is grouped by key, and the values for each key are summed. The output is:

Summarized: {A=1, B=1, C=3, D=2, E=2}

If you want to put this in a function, simply do the following:

public Map<String, Long> summarize(
        final Map<String, Long> m1, 
        final Map<String, Long> m2) {

    return Stream.concat(m1.entrySet().stream(), m2.entrySet().stream())
                 .collect(groupingBy(
                          Map.Entry::getKey,
                          summingLong(Map.Entry::getValue)));
}

To read more about Java 8 streams, see the Oracle docs.
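One caveat about the stream version: the two-argument `Collectors.groupingBy` makes no guarantee about the type of Map it returns, so the result is not necessarily a TreeMap and its key order should not be relied on. The following is a minimal sketch, not part of the original answer, that uses the three-argument overload of `groupingBy` with `TreeMap::new` as the map factory. The class name `SortedSummarize` and the method name `summarizeSorted` are hypothetical, chosen only for illustration.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class SortedSummarize {

    // Hypothetical helper: same grouping and summing as summarize(...),
    // but the map factory TreeMap::new makes the collector build a TreeMap,
    // so the result is sorted by key.
    static TreeMap<String, Long> summarizeSorted(
            Map<String, Long> m1,
            Map<String, Long> m2) {

        return Stream.concat(m1.entrySet().stream(), m2.entrySet().stream())
                     .collect(Collectors.groupingBy(
                              Map.Entry::getKey,                              // Group by key
                              TreeMap::new,                                   // Collect into a TreeMap
                              Collectors.summingLong(Map.Entry::getValue)));  // Value is the sum
    }
}
```

Like the rest of this answer, the sketch needs Java 8, so it does not help on a Java 1.7 runtime.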

Answer 1 (score: 1)

The following function computes the sum:

public static TreeMap<String, Long> sum(TreeMap<String, Long> first,
        TreeMap<String, Long> second) {
    TreeMap<String, Long> result = new TreeMap<String, Long>(first);

    for (Entry<String, Long> e : second.entrySet()) {
        Long l = result.get(e.getKey());
        result.put(e.getKey(), e.getValue() + (l == null ? 0 : l));
    }

    return result;
}

Test code:

TreeMap<String, Long> first = new TreeMap<String, Long>();
TreeMap<String, Long> second = new TreeMap<String, Long>();

first.put("x", 1L);
first.put("y", 5L);

second.put("x", 2L);
second.put("y", 3L);
second.put("z", 5L);

System.out.println(sum(first, second));

Output:

{x=3, y=8, z=5}
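Since the question mentions more than two TreeMaps, the same lookup-and-put loop can be applied to a whole list of maps. This is a rough sketch under that assumption, not part of the original answer. The class name `TreeMapSums` and the method name `sumAll` are hypothetical, and the code uses only Java 1.7 features.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TreeMapSums {

    // Hypothetical helper: sums any number of maps key by key.
    // Uses only Java 1.7 features (no streams, no Map.merge).
    static TreeMap<String, Long> sumAll(List<TreeMap<String, Long>> maps) {
        TreeMap<String, Long> result = new TreeMap<String, Long>();

        for (TreeMap<String, Long> map : maps) {
            for (Map.Entry<String, Long> e : map.entrySet()) {
                // A key not seen so far contributes 0 to the running sum.
                Long current = result.get(e.getKey());
                result.put(e.getKey(), e.getValue() + (current == null ? 0L : current));
            }
        }

        return result;
    }
}
```

Calling `sumAll(Arrays.asList(first, second))` with the test maps above should give the same result as `sum(first, second)`.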

Edit

A small optimization is to copy the larger TreeMap and iterate over the smaller one. This reduces the number of lookups/insertions:

public static TreeMap<String, Long> sum(TreeMap<String, Long> first,
        TreeMap<String, Long> second) {
    // optimization (copy the largest tree map and iterate over the
    // smallest)
    if (first.size() < second.size()) {
        TreeMap<String, Long> t = first;
        first = second;
        second = t;
    }

    TreeMap<String, Long> result = new TreeMap<String, Long>(first);

    for (Entry<String, Long> e : second.entrySet()) {
        Long l = result.get(e.getKey());
        result.put(e.getKey(), e.getValue() + (l == null ? 0 : l));
    }

    return result;
}