Group Doubles into a Map by arbitrary intervals using Java 8

Date: 2016-07-14 15:17:18

Tags: java functional-programming java-8

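I have data represented as a list of positive doubles, and a list containing the intervals that will be used to group the data. The intervals are always sorted. I tried to group the data with the following implementation: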

    List<Integer> group = group(); // sorted interval lower bounds, built once
    List<Double> data = DoubleStream.generate(new Random()::nextDouble).limit(10).map(d -> new Random().nextInt(30) * d).boxed().collect(Collectors.toList());
    HashMap<Integer, List<Double>> groupped = new HashMap<Integer, List<Double>>();
    data.stream().forEach(d -> {
        // Append d to the list stored under the lower bound of its interval
        groupped.merge(getGroup(d, group), new ArrayList<Double>(Arrays.asList(d)), (l1, l2) -> {
            l1.addAll(l2);
            return l1;
        });
    });

    // Returns the lower bound of the interval that contains data
    public static Integer getGroup(double data, List<Integer> group) {
        for (int i = 1; i < group.size(); i++) {
            if (group.get(i) > data) {
                return group.get(i - 1);
            }
        }
        return group.get(group.size() - 1);
    }

    public static List<Integer> group() {
        List<Integer> groups = new LinkedList<Integer>();
        // can be arbitrary grouping
        groups.add(0);
        groups.add(6);
        groups.add(11);
        groups.add(16);
        groups.add(21);
        groups.add(26);
        return groups;
    }

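Is it possible to perform this kind of grouping/reduction with a collector, so that the grouping logic is applied directly inside the collect step?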

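Also, regarding the complexity of the process, this should cost n^2, since we iterate over two lists (or streams). Right now it is not parallel, but I think getGroup() could be executed in parallel. Any insight on whether a TreeSet or a List should be used for better performance?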

2 Answers:

Answer 0: (score: 5)

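You can improve your code quite a bit. Random supports the Stream API, so there is no need to generate your own DoubleStream. Next, you should build the set of boundaries only once. Finally, there is Collectors.groupingBy, which can help you here.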

import java.util.List;
import java.util.Map;
import java.util.NavigableSet;
import java.util.Random;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class Test {

  public static void main(String... args) {
    Random r = new Random();
    List<Double> data = r.doubles(10).map(d -> r.nextInt(30) * d).peek(System.out::println).boxed()
        .collect(Collectors.toList());
    NavigableSet<Integer> groups = group();
    Map<Integer, List<Double>> groupped = data.stream()
        .collect(Collectors.groupingBy(d -> groups.floor(d.intValue()), TreeMap::new, Collectors.toList()));
    System.out.println(groupped);
  }

  public static NavigableSet<Integer> group() {
    NavigableSet<Integer> groups = new TreeSet<>();
    groups.add(0);
    groups.add(6);
    groups.add(11);
    groups.add(16);
    groups.add(21);
    groups.add(26);
    return groups;
  }
}
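
On the performance part of the question: TreeSet.floor() runs in O(log m) for m boundaries, so the grouping above costs roughly O(n log m) for n values rather than n^2. If the boundaries have to stay in a sorted List, a binary search gives the same bound. The following is only a minimal sketch under that assumption; the class BinarySearchGrouping and the helper lowerBound are hypothetical names, not part of the code above. Like getGroup() in the question, it maps values below the first boundary to the first boundary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class BinarySearchGrouping {

  // Hypothetical helper: returns the largest boundary <= value,
  // or the first boundary when value is smaller than all of them.
  public static int lowerBound(List<Integer> sortedBounds, double value) {
    int lo = 0;
    int hi = sortedBounds.size() - 1;
    int idx = 0;
    while (lo <= hi) {
      int mid = (lo + hi) >>> 1;
      if (sortedBounds.get(mid) <= value) {
        idx = mid;      // candidate found, look for a larger one to the right
        lo = mid + 1;
      } else {
        hi = mid - 1;
      }
    }
    return sortedBounds.get(idx);
  }

  public static Map<Integer, List<Double>> group(List<Double> data, List<Integer> sortedBounds) {
    // Copy into an ArrayList so get(i) is O(1) even if a LinkedList was passed in
    List<Integer> bounds = new ArrayList<>(sortedBounds);
    return data.stream()
        .collect(Collectors.groupingBy(d -> lowerBound(bounds, d), TreeMap::new, Collectors.toList()));
  }

  public static void main(String[] args) {
    List<Integer> bounds = Arrays.asList(0, 6, 11, 16, 21, 26);
    List<Double> data = Arrays.asList(0.5, 5.99, 6.0, 12.3, 29.9);
    // prints {0=[0.5, 5.99], 6=[6.0], 11=[12.3], 26=[29.9]}
    System.out.println(group(data, bounds));
  }
}
```

Note that calling get(i) on a LinkedList is itself linear in the list size, which is why the sketch copies the boundaries into an ArrayList before searching.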

Answer 1: (score: 2)

Use TreeSet.ceiling() to find the "group" boundary value.

Example

TreeSet<Double> groups = new TreeSet<>();
groups.add( 5d);                      // [-Inf,    5]
groups.add(10d);                      // ]   5,   10]
groups.add(15d);                      // ]  10,   15]
groups.add(20d);                      // ]  15,   20]
groups.add(25d);                      // ]  20,   25]
groups.add(30d);                      // ]  25,   30]
groups.add(Double.POSITIVE_INFINITY); // ]  30, +Inf]

Random rnd = new Random();
for (double value = 0d; value <= 30d; value += 5d) {
    double down = Math.nextDown(value);
    double up = Math.nextUp(value);
    System.out.printf("%-18s -> %4s     %-4s -> %4s     %-18s -> %4s%n",
                      down, groups.ceiling(down),
                      value, groups.ceiling(value),
                      up, groups.ceiling(up));
}
for (int i = 0; i < 10; i++) {
    double value = rnd.nextDouble() * 30d;
    double group = groups.ceiling(value);
    System.out.printf("%-18s -> %4s%n", value, group);
}

Output

-4.9E-324          ->  5.0     0.0  ->  5.0     4.9E-324           ->  5.0
4.999999999999999  ->  5.0     5.0  ->  5.0     5.000000000000001  -> 10.0
9.999999999999998  -> 10.0     10.0 -> 10.0     10.000000000000002 -> 15.0
14.999999999999998 -> 15.0     15.0 -> 15.0     15.000000000000002 -> 20.0
19.999999999999996 -> 20.0     20.0 -> 20.0     20.000000000000004 -> 25.0
24.999999999999996 -> 25.0     25.0 -> 25.0     25.000000000000004 -> 30.0
29.999999999999996 -> 30.0     30.0 -> 30.0     30.000000000000004 -> Infinity
3.7159199611763514 ->  5.0
7.685306184937567  -> 10.0
2.6949924484301633 ->  5.0
17.594251973883363 -> 20.0
24.005899441664994 -> 25.0
7.720531186142164  -> 10.0
22.82402791692674  -> 25.0
22.68288732263466  -> 25.0
13.056624829892243 -> 15.0
8.504511505971251  -> 10.0
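
The ceiling() lookup can also serve as the classifier of a groupingBy collector, which is what the question asks for. The following is a minimal sketch rather than a definitive implementation; the class CeilingGrouping and the method groupByUpperBound are hypothetical names. It assumes the boundary set contains Double.POSITIVE_INFINITY, so that ceiling() never returns null (groupingBy rejects null keys).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class CeilingGrouping {

  // Hypothetical helper: each key is the inclusive upper bound of an interval
  public static Map<Double, List<Double>> groupByUpperBound(List<Double> data,
                                                            NavigableSet<Double> upperBounds) {
    return data.stream()
        .collect(Collectors.groupingBy(d -> upperBounds.ceiling(d), TreeMap::new, Collectors.toList()));
  }

  public static void main(String[] args) {
    NavigableSet<Double> upperBounds = new TreeSet<>(
        Arrays.asList(5d, 10d, 15d, 20d, 25d, 30d, Double.POSITIVE_INFINITY));
    List<Double> data = Arrays.asList(3.7, 5.0, 7.6, 17.5, 31.0);
    // prints {5.0=[3.7, 5.0], 10.0=[7.6], 20.0=[17.5], Infinity=[31.0]}
    System.out.println(groupByUpperBound(data, upperBounds));
  }
}
```

Because the keys are upper bounds, a value that sits exactly on a boundary (5.0 here) lands in the interval ending at that boundary, matching the output above.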