Question

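I'm practicing some entry-level Java 8 lambda functionality.

Given a list of messages, each containing a message offset, where all offsets must form a consecutive list of integers, I'm trying to find gaps to warn about. I feel like this should all be doable with a nice lambda, but I can't get my head around it.

So, there's this working snippet: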

private void warnAboutMessageGaps(final List<Message> messages) {

    final List<Long> offsets = messages.stream()
            .sorted(comparingLong(Message::getOffset))
            .map(Message::getOffset)
            .collect(toList())
            ;

    for (int i = 0; i < offsets.size() - 1; i++) {
        final long currentOffset = offsets.get(i);
        final long expectedNextOffset = offsets.get(i) + 1;
        final long actualNextOffset = offsets.get(i + 1);
        if (actualNextOffset != expectedNextOffset) {
            LOG.error("Missing offset(s) found in messages: missing from {} to {}", currentOffset + 1, actualNextOffset - 1);
        }
    }
}

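What I can't figure out is how to do the "compare with the previous/next object" part in a lambda. Any pointers would be appreciated.

/edit: Suggestions about StreamEx and other third-party solutions, while appreciated, are not what I'm looking for.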

Answer 1

You can do this with the pairMap method of StreamEx:

StreamEx.of(messages)
        .sorted(Comparator.comparingLong(Message::getOffset))
        .pairMap((prev, next) -> new Message[] {prev, next})
        .forEach(prevNext -> {
            long currentOffset = prevNext[0].getOffset();
            long expectedNextOffset = prevNext[0].getOffset() + 1;
            long actualNextOffset = prevNext[1].getOffset();
            if (actualNextOffset != expectedNextOffset) {
                LOG.error(
                    "Missing offset(s) found in messages: missing from {} to {}",
                    currentOffset + 1, actualNextOffset - 1);
            }
        });

Answer 2

How about:

        List<Long> offsets = messages.stream()
                .sorted(comparingLong(Message::getOffset))
                .map(Message::getOffset)
                .collect(toList());

        IntStream.range(1, offsets.size())
                .mapToObj(i -> new Pair<>(offsets.get(i - 1), offsets.get(i)))
                .forEach(pair -> {
                    final long currentOffset = pair.getKey();
                    final long expectedNextOffset = pair.getKey() + 1;
                    final long actualNextOffset = pair.getValue();
                    if (actualNextOffset != expectedNextOffset) {
                        LOG.error("Missing offset(s) found in messages: missing from {} to {}", currentOffset + 1, actualNextOffset - 1);
                    }
                });

Answer 3

Sometimes, trying to do everything with lambda expressions makes a solution more complicated. You can use:

messages.stream()
    .mapToLong(Message::getOffset)
    .sorted()
    .forEachOrdered(new LongConsumer() {
        boolean first=true;
        long expected;
        public void accept(long value) {
            if(first) first=false;
            else if(value!=expected)
                LOG.error("Missing offset(s) found in messages: missing from {} to {}",
                          expected, value);
            expected=value+1;
        }
    });

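But note that, however fluent the stream chain looks, sorted() is a stateful intermediate operation that creates and uses a backing array behind the scenes. You lose nothing if you use that array explicitly: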

long[] l = messages.stream().mapToLong(Message::getOffset).toArray();
Arrays.sort(l);
for(int ix=1; ix<l.length; ix++) {
    long value = l[ix], expected = l[ix-1]+1;
    if(value!=expected)
        LOG.error("Missing offset(s) found in messages: missing from {} to {}",
                  expected, value);
}
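
To make the point about sorted() being stateful a bit more concrete, here is a small sketch (the class and method names are made up for illustration, they are not from the answer). It records when each element enters and leaves sorted() in a sequential stream:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.LongStream;

// Hypothetical demo class, not part of the original answer.
final class SortedIsStateful {

    // Records the order in which elements enter and leave sorted().
    static List<String> trace(long... offsets) {
        List<String> events = new ArrayList<>();
        LongStream.of(offsets)
                .peek(v -> events.add("in:" + v))              // upstream of sorted()
                .sorted()
                .forEachOrdered(v -> events.add("out:" + v));  // downstream of sorted()
        return events;
    }

    public static void main(String[] args) {
        // Every "in" event is recorded before the first "out" event,
        // because sorted() has to buffer the whole input before emitting anything.
        System.out.println(trace(3, 1, 2));
        // prints [in:3, in:1, in:2, out:1, out:2, out:3]
    }
}
```

The explicit array version above does the same buffering by hand, just without the stream machinery around it.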

It's hard to find a simpler solution. But if you want to reduce the amount of memory needed, you can use a BitSet instead of an array:

OptionalLong optMin = messages.stream().mapToLong(Message::getOffset).min();
if(!optMin.isPresent()) return;
long min = optMin.getAsLong();
BitSet bset = messages.stream()
    .mapToLong(Message::getOffset)
    .collect(BitSet::new, (bs,l) -> bs.set((int)(l-min)), BitSet::or);
for(int set=0, clear; set>=0; ) {
    clear = bset.nextClearBit(set);
    set = bset.nextSetBit(clear);
    if(set >= 0)
        LOG.error("Missing offset(s) found in messages: missing from {} to {}",
                  min+clear, min+set);
}

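When there are no gaps, or the gaps are small compared to the value range of the offsets, this significantly reduces the memory used. It fails when the distance between the smallest and the largest offset is greater than Integer.MAX_VALUE.

You can check for that beforehand, which also gives you a shortcut for the case where there are no gaps at all: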

LongSummaryStatistics stat = messages.stream()
    .mapToLong(Message::getOffset).summaryStatistics();
if(stat.getCount()==0 ||
   // all solutions assume that there are no duplicates; under that
   // assumption, the following test proves that there are no gaps:
   stat.getMax()-stat.getMin()==messages.size()-1) {
    return;
}

if(stat.getMax()-stat.getMin()>Integer.MAX_VALUE) {
    // proceed with array based test
    …
}
else {
    long min = stat.getMin();
    // proceed with BitSet based test
    …
}
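
For reference, here is one way the pieces of this answer could fit together. It is only a sketch under some assumptions of mine: the class OffsetGaps and its methods are hypothetical, it takes a plain long[] instead of a List<Message>, it returns each gap as an inclusive {firstMissing, lastMissing} pair instead of logging it, and it conservatively switches to the array path when the span reaches Integer.MAX_VALUE or the subtraction overflows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.LongSummaryStatistics;

// Hypothetical helper, not from the original answer.
final class OffsetGaps {

    static List<long[]> find(long[] offsets) {
        LongSummaryStatistics stat = Arrays.stream(offsets).summaryStatistics();
        if (stat.getCount() == 0) return new ArrayList<>();
        long span = stat.getMax() - stat.getMin(); // negative if the subtraction overflowed
        // Shortcut; like the answer, this assumes that there are no duplicates.
        if (span == stat.getCount() - 1) return new ArrayList<>();
        return (span < 0 || span >= Integer.MAX_VALUE)
                ? viaSortedArray(offsets)
                : viaBitSet(offsets, stat.getMin());
    }

    static List<long[]> viaSortedArray(long[] offsets) {
        long[] sorted = offsets.clone();
        Arrays.sort(sorted);
        List<long[]> gaps = new ArrayList<>();
        for (int ix = 1; ix < sorted.length; ix++) {
            long prev = sorted[ix - 1], value = sorted[ix];
            // value > prev skips duplicates and rules out overflow in value - 1
            if (value > prev && value - 1 != prev) {
                gaps.add(new long[] {prev + 1, value - 1});
            }
        }
        return gaps;
    }

    static List<long[]> viaBitSet(long[] offsets, long min) {
        BitSet seen = new BitSet();
        for (long offset : offsets) seen.set((int) (offset - min));
        List<long[]> gaps = new ArrayList<>();
        for (int set = 0, clear; set >= 0; ) {
            clear = seen.nextClearBit(set);  // first missing offset after a run
            set = seen.nextSetBit(clear);    // next present offset, -1 at the end
            if (set >= 0) gaps.add(new long[] {min + clear, min + set - 1});
        }
        return gaps;
    }

    public static void main(String[] args) {
        for (long[] gap : find(new long[] {1, 2, 3, 5, 7})) {
            System.out.println("missing from " + gap[0] + " to " + gap[1]);
        }
    }
}
```

With the offsets 1, 2, 3, 5, 7 this takes the BitSet path and reports the two gaps 4..4 and 6..6. Note that the upper bound here is the last missing offset, whereas the snippets above log the next present offset.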

Answer 4

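For the sake of learning the Java 8 API, you can use a collector that visits each member of the stream in turn and uses an accumulator class, BadPairs, to keep track of any gaps in the sequence of offsets.

To help you understand the relationship between the supplier, accumulator, and combiner lambdas, I have written this more verbosely than it needs to be.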

public class PairedStreamTest {

    private BiConsumer<BadPairs,BadPairs> combiner = (bad1,bad2) -> bad1.add(bad2);

    private Supplier<BadPairs> supplier = BadPairs::new;

    private BiConsumer<BadPairs,Message> accumulator = (bad,msg) -> bad.add(msg);

    @Test
    public void returnsTwoBadPairs_givenInputStreamIsMissingOffsets_forFourAndSix() throws Exception {

        BadPairs badPairs = Stream.of(new Message(1), new Message(2), new Message(3), new Message(5), new Message(7))
                .sorted(comparingLong(Message::getOffset))
                .collect(supplier, accumulator, combiner);

        badPairs.pairs.forEach(pair ->
                LOG.error("Missing offset(s) found in messages: missing from {} to {}", pair.first.offset, pair.second.offset));

        assertTrue(badPairs.pairs.size() == 2);
    }

    // supporting classes for the above test code

    private final Logger LOG = LoggerFactory.getLogger(PairedStreamTest.class);

    class Message {
        public int offset;
        public Message(int i) {
            this.offset = i;
        }
        public Integer getOffset() {
            return this.offset;
        }
    }

    class Pair {
        private Message first;
        private Message second;
        public Pair(Message smaller, Message larger) {
            this.first = smaller;
            this.second = larger;
        }
    }

    class BadPairs {
        public Message previous;
        public Set<Pair> pairs = new HashSet<>();
        public void add(BadPairs other) {
            this.pairs.addAll(other.pairs);
        }
        public void add(Message msg) {
            if(previous != null && previous.offset != msg.offset-1) {
                this.pairs.add(new Pair(previous, msg));
            }
            this.previous = msg;
        }
    }
}

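Please excuse the inappropriate use of public member variables and the layout of this test class. My intent was to draw the reader's attention to the @Test case first, rather than to the supporting classes.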
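
If it helps, the same supplier/accumulator/combiner idea can be boiled down to something that runs without JUnit or a logger. This is a sketch, not the code above: GapTracker and findGaps are names I made up, and it works on plain Long offsets instead of Message objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical container, loosely modelled on BadPairs; not from the original answer.
final class GapTracker {
    Long previous;                               // last offset seen, null before the first
    final List<long[]> gaps = new ArrayList<>(); // {previous, next} pairs that are not adjacent

    // Accumulator: folds one element into the container.
    void add(Long offset) {
        if (previous != null && previous + 1 != offset) {
            gaps.add(new long[] {previous, offset});
        }
        previous = offset;
    }

    // Combiner: merges two partial containers. It only matters for parallel streams,
    // and this naive version never compares the offsets on either side of a chunk boundary.
    void merge(GapTracker other) {
        gaps.addAll(other.gaps);
    }

    static List<long[]> findGaps(Stream<Long> sortedOffsets) {
        return sortedOffsets
                .sequential() // the accumulator relies on seeing neighbours in order
                .collect(GapTracker::new,    // supplier: creates an empty container
                         GapTracker::add,    // accumulator
                         GapTracker::merge)  // combiner
                .gaps;
    }

    public static void main(String[] args) {
        for (long[] pair : findGaps(Stream.of(1L, 2L, 3L, 5L, 7L))) {
            System.out.println("gap between " + pair[0] + " and " + pair[1]);
        }
    }
}
```

The call to sequential() is deliberate: since the combiner only concatenates the two lists, a gap that falls exactly between two parallel chunks would go unnoticed.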

Answer 5

How about:

final List<Long> offsets = messages.stream().map(Message::getOffset).sorted().collect(toList());
IntStream.range(0, offsets.size() - 1).forEach(i -> {
    long currentOffset = offsets.get(i);
    if (offsets.get(i + 1) != currentOffset + 1) {
        LOG.error("Missing offset(s) found in messages: missing from {} to {}", currentOffset + 1, offsets.get(i + 1) - 1);
    }
});

Or, all in one statement with StreamEx:

StreamEx.of(messages).mapToLong(Message::getOffset).sorted().boxed()
          .pairMap((i, j) -> new long[] { i, j }).filter(a -> a[1] - a[0] > 1)
          .forEach(a -> LOG.error("Missing offset(s) found in messages: missing from {} to {}", a[0] + 1, a[1] - 1));

Or, all in one statement with AbacusUtil:

Stream.of(messages).mapToLong(Message::getOffset).sorted()
          .sliding0(2).filter(e -> e.size() == 2 && e.get(1) - e.get(0) > 1)
          .forEach(e -> LOG.error("Missing offset(s) found in messages: missing from {} to {}", e.get(0) + 1, e.get(1) - 1));

Answer 6

For the problem at hand, this approach seems more suitable:

messages.stream().sorted( Comparator.comparingLong( Message::getOffset ) )
  .reduce( (m1, m2) -> {
    if( m1.getOffset() + 1 != m2.getOffset() )
      LOG.error( "Missing offset(s) found in messages: missing from {} to {}", m1.getOffset(), m2.getOffset() );
    return( m2 );
  } );

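This solution uses reduce for something other than its intended purpose. It only relies on the ability of reduce to walk over all pairs in a stream.
The result of reduce is not used. (It is not possible to use the result for anything further, because that would require a mutable reduction.)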
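
To show what "walking over all pairs" means here, the following sketch collects the pairs that reduce hands to the lambda instead of logging them. The class and method names are hypothetical, and it works on plain long offsets. It assumes a sequential stream: in a parallel stream the same function is also used to combine the partial results of separate chunks, so the lambda would be called with elements that are not neighbours.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.LongStream;

// Hypothetical demo class, not from the original answer.
final class ReducePairs {

    // Returns the {previous, next} pairs passed to the reduce lambda
    // whose offsets are not adjacent.
    static List<long[]> nonAdjacentPairs(long... offsets) {
        List<long[]> pairs = new ArrayList<>();
        LongStream.of(offsets)
                .sorted()
                .reduce((prev, next) -> {
                    if (prev + 1 != next) {
                        pairs.add(new long[] {prev, next}); // side effect, not a real reduction
                    }
                    return next; // becomes "prev" for the following element
                });
        return pairs; // the OptionalLong returned by reduce is deliberately ignored
    }

    public static void main(String[] args) {
        for (long[] pair : nonAdjacentPairs(7, 1, 3, 2, 5)) {
            System.out.println("gap between " + pair[0] + " and " + pair[1]);
        }
    }
}
```

For the offsets 7, 1, 3, 2, 5 the lambda sees (1,2), (2,3), (3,5), (5,7) after sorting, and the last two pairs are the ones recorded.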

Java 8 lambda: iterate over stream objects and use the previous/next object in the stream
