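Java 8 Streams do not allow reuse. This creates a puzzle: when building a sliding-window Flux, how can a stream be reused to compute a relationship such as x(i) * x(i-1)?
The following code is based on the idea of a shift operator. I shift the first stream with skip(1) to create a second stream.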
Flux<Integer> primary = Flux.fromStream(IntStream.range(1, 10).boxed());
Flux<Integer> secondary = primary.skip(1);
primary.zipWith(secondary)
    .map(t -> t.getT1() * t.getT2())
    .subscribe(System.out::println);
Here is a visual representation of the code above:
1 2 3 4 5 6 7 8 9 10
v v v v v v v v v v skip(1)
2 3 4 5 6 7 8 9 10
v v v v v v v v v v zipWith
1 2, 2 3, 3 4, 4 5, 5 6, 6 7, 7 8, 8 9, 9 10 <- sliding window of length 2
v v v v v v v v v v multiples
2 6 12 20 30 42 56 72 90
Unfortunately, this code fails with:
java.lang.IllegalStateException: stream has already been operated upon or closed
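The exception comes from the underlying java.util.stream.Stream rather than from the zip itself: zipWith subscribes to primary twice (once directly and once through secondary), and each subscription tries to consume the same one-shot stream. The following is a minimal JDK-only sketch of that failure mode; the class name StreamReuseDemo and its method are hypothetical and not part of the question.

```java
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical demo class: shows that a java.util.stream.Stream is single-use.
class StreamReuseDemo {

    // Returns true if consuming the same Stream a second time is rejected.
    static boolean secondUseFails() {
        Stream<Integer> stream = IntStream.range(1, 10).boxed();
        stream.forEach(i -> { });      // first terminal operation consumes the stream
        try {
            stream.forEach(i -> { });  // second terminal operation on the same instance
            return false;
        } catch (IllegalStateException e) {
            // same exception type as reported in the question
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second use fails: " + secondUseFails());
    }
}
```

Any fix therefore has to either consume the stream exactly once or be able to recreate it for every subscriber.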
The obvious workaround is to cache the elements and make sure the cache size is greater than or equal to the stream size:
Flux<Integer> primary = Flux.fromStream(IntStream.range(1, 10).boxed()).cache(10);
or to use a Flux in place of the stream:
Flux<Integer> primary = Flux.range(0, 10);
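The second solution simply re-executes the original sequence for the skip(1) sequence.
However, an efficient solution should only need a buffer of size 2. This matters a lot if the stream happens to be a large file: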
Files.lines(Paths.get(megaFile));
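How can I buffer a stream efficiently, so that multiple subscriptions to the primary Flux neither read everything into memory nor cause the source to be re-executed?
Answer 0 (score: 3)
I finally found a solution, although it is not buffer oriented. The inspiration was to first solve the problem for a sliding window of size 2: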
Flux<Integer> primary = Flux.fromStream(IntStream.range(0, 10).boxed());
primary.flatMap(num -> Flux.just(num, num))
    .skip(1)
    .buffer(2)
    .filter(list -> list.size() == 2)
    .map(list -> Arrays.toString(list.toArray()))
    .subscribe(System.out::println);
Here is a visual representation of the process:
1 2 3 4 5 6 7 8 9
V V V V V V V V V Flux.just(num, num)
1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9
V V V V V V V V V skip(1)
1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9
V V V V V V V V V buffer(2)
1 2, 2 3, 3 4, 4 5, 5 6, 6 7, 7 8, 8 9, 9
V V V V V V V V V filter
1 2, 2 3, 3 4, 4 5, 5 6, 6 7, 7 8, 8 9
This is the output:
[0, 1]
[1, 2]
[2, 3]
[3, 4]
[4, 5]
[5, 6]
[6, 7]
[7, 8]
[8, 9]
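To make the four steps easier to trace, here is a small JDK-only sketch that applies the same duplicate / skip(1) / buffer(2) / filter sequence to a plain list. It is only an illustration of the idea, not Reactor code; the class name DuplicateSkipBuffer and the method pairs are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical JDK-only trace of the duplicate / skip(1) / buffer(2) / filter steps.
class DuplicateSkipBuffer {

    static List<List<Integer>> pairs(List<Integer> input) {
        // Step 1: emit every element twice, like flatMap(num -> Flux.just(num, num)).
        List<Integer> doubled = new ArrayList<>();
        for (Integer n : input) {
            doubled.add(n);
            doubled.add(n);
        }
        // Step 2: drop the first element, like skip(1).
        List<Integer> shifted = doubled.subList(Math.min(1, doubled.size()), doubled.size());
        // Steps 3 and 4: cut into chunks of two, like buffer(2), and keep only
        // complete chunks, like filter(list -> list.size() == 2).
        List<List<Integer>> result = new ArrayList<>();
        for (int i = 0; i + 1 < shifted.size(); i += 2) {
            result.add(Arrays.asList(shifted.get(i), shifted.get(i + 1)));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [0, 1], [1, 2] and [2, 3], one pair per line
        pairs(Arrays.asList(0, 1, 2, 3)).forEach(System.out::println);
    }
}
```

The source is traversed only once; the price is that every element is emitted twice before being regrouped.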
I then generalized the idea above to create a solution for an arbitrary sliding-window size:
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

import reactor.core.publisher.Flux;

public class SlidingWindow {

    public static void main(String[] args) {
        System.out.println("Different sliding windows for sequence 0 to 9:");
        SlidingWindow flux = new SlidingWindow();
        for (int windowSize = 1; windowSize < 5; windowSize++) {
            flux.slidingWindow(windowSize, IntStream.range(0, 10).boxed())
                .map(SlidingWindow::listToString)
                .subscribe(System.out::print);
            System.out.println();
        }

        // show stream difference: x(i)-x(i-1)
        List<Integer> sequence = Arrays.asList(10, 12, 11, 9, 13, 17, 21);
        System.out.println("Show difference 'x(i)-x(i-1)' for " + listToString(sequence));
        flux.slidingWindow(2, sequence.stream())
            .doOnNext(SlidingWindow::printlist)
            .map(list -> list.get(1) - list.get(0))
            .subscribe(System.out::println);
        System.out.println();
    }

    // Starts with windows of size 1 and widens them one element per iteration.
    public <T> Flux<List<T>> slidingWindow(int windowSize, Stream<T> stream) {
        if (windowSize > 0) {
            Flux<List<T>> flux = Flux.fromStream(stream).map(ele -> Arrays.asList(ele));
            for (int i = 1; i < windowSize; i++) {
                flux = addDepth(flux);
            }
            return flux;
        } else {
            return Flux.empty();
        }
    }

    // Pairs each window with its successor (duplicate, skip, buffer, filter)
    // and merges every pair into a window that is one element longer.
    protected <T> Flux<List<T>> addDepth(Flux<List<T>> flux) {
        return flux.flatMap(list -> Flux.just(list, list))
            .skip(1)
            .buffer(2)
            .filter(list -> list.size() == 2)
            .map(list -> flatten(list));
    }

    // Prepends the head of the first window to the second window.
    protected <T> List<T> flatten(List<List<T>> list) {
        LinkedList<T> newl = new LinkedList<>(list.get(1));
        newl.addFirst(list.get(0).get(0));
        return newl;
    }

    static String listToString(List<?> list) {
        return list.stream()
            .map(Object::toString)
            .collect(Collectors.joining(", ", "[ ", " ], "));
    }

    static void printlist(List<?> list) {
        System.out.print(listToString(list));
    }
}
The output of the code above is as follows:
Different sliding windows for sequence 0 to 9:
[ 0 ], [ 1 ], [ 2 ], [ 3 ], [ 4 ], [ 5 ], [ 6 ], [ 7 ], [ 8 ], [ 9 ],
[ 0, 1 ], [ 1, 2 ], [ 2, 3 ], [ 3, 4 ], [ 4, 5 ], [ 5, 6 ], [ 6, 7 ], [ 7, 8 ], [ 8, 9 ],
[ 0, 1, 2 ], [ 1, 2, 3 ], [ 2, 3, 4 ], [ 3, 4, 5 ], [ 4, 5, 6 ], [ 5, 6, 7 ], [ 6, 7, 8 ], [ 7, 8, 9 ],
[ 0, 1, 2, 3 ], [ 1, 2, 3, 4 ], [ 2, 3, 4, 5 ], [ 3, 4, 5, 6 ], [ 4, 5, 6, 7 ], [ 5, 6, 7, 8 ], [ 6, 7, 8, 9 ],
Show difference 'x(i)-x(i-1)' for [ 10, 12, 11, 9, 13, 17, 21 ],
[ 10, 12 ], 2
[ 12, 11 ], -1
[ 11, 9 ], -2
[ 9, 13 ], 4
[ 13, 17 ], 4
[ 17, 21 ], 4
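The generalization rests on one inductive step: given all windows of size k, pairing each window with its successor and prepending the head of the earlier window to the later one yields all windows of size k + 1. The sketch below replays that step on plain lists so it can be checked without Reactor. It is a hypothetical JDK-only illustration (the class WindowGrowth is not part of the answer), although addDepth and slidingWindow deliberately mirror the names used above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical JDK-only replay of the "add depth" induction used in the answer.
class WindowGrowth {

    // One growth step: windows of size k become windows of size k + 1.
    static <T> List<List<T>> addDepth(List<List<T>> windows) {
        List<List<T>> grown = new ArrayList<>();
        for (int i = 0; i + 1 < windows.size(); i++) {
            List<T> merged = new ArrayList<>();
            merged.add(windows.get(i).get(0));   // head of the earlier window
            merged.addAll(windows.get(i + 1));   // followed by the whole later window
            grown.add(merged);
        }
        return grown;
    }

    static <T> List<List<T>> slidingWindow(int windowSize, List<T> input) {
        if (windowSize <= 0) {
            return Collections.emptyList();
        }
        // Base case: every element is a window of size 1.
        List<List<T>> windows = new ArrayList<>();
        for (T element : input) {
            windows.add(Collections.singletonList(element));
        }
        for (int i = 1; i < windowSize; i++) {
            windows = addDepth(windows);
        }
        return windows;
    }

    public static void main(String[] args) {
        // prints [[0, 1, 2], [1, 2, 3]]
        System.out.println(slidingWindow(3, Arrays.asList(0, 1, 2, 3)));
    }
}
```

Each growth step shortens the sequence of windows by one, which is why a window size of n over m elements produces m - n + 1 windows in the output above.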
Answer 1 (score: 0)
I have implemented the following solution:
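public <T> Flux<Flux<T>> toSlidingWindow(Flux<T> source, int size) {
    return toSlidingWindow(source, deque -> {
        // drop the oldest elements until the window fits
        while (deque.size() > size) {
            deque.poll();
        }
        return Flux.fromIterable(deque);
    });
}

public <T> Flux<Flux<T>> toSlidingWindow(Flux<T> source, Function<Deque<T>, Flux<T>> dequePruneFunction) {
    // This declaration is missing from the snippet as posted; it is assumed
    // to match the one in the async variant further down.
    AtomicReference<Deque<T>> dequeAtomicReference = new AtomicReference<>(new LinkedList<>());
    return source.map(ohlc -> {
        Deque<T> deque = dequeAtomicReference.get();
        deque.offer(ohlc);
        return dequePruneFunction.apply(deque);
    });
}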
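This can be a fixed-size sliding window, or you can use a custom function to determine the extent of each window.
If any multithreading issues show up with this approach, you can copy the Deque inside the acquire and release block supported by AtomicReference. This ensures that the resulting window Flux is not changed by other threads.
Perhaps something like this: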
public <T> Flux<Flux<T>> toSlidingWindowAsync(Flux<T> source, int size) {
    return toSlidingWindowAsync(source, deque -> {
        while (deque.size() > size) {
            deque.poll();
        }
        return Flux.fromIterable(new LinkedList<>(deque));
    });
}

public <T> Flux<Flux<T>> toSlidingWindowAsync(Flux<T> source, Function<Deque<T>, Flux<T>> dequePruneFunction) {
    AtomicReference<Deque<T>> dequeAtomicReference = new AtomicReference<>(new LinkedList<>());
    return source.map(ohlc -> {
        Deque<T> deque = dequeAtomicReference.getAcquire();
        deque.offer(ohlc);
        Flux<T> windowFlux = dequePruneFunction.apply(deque);
        dequeAtomicReference.setRelease(deque);
        return windowFlux;
    });
}
This copies the Deque used for each resulting sliding window.
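The core of this answer is a bounded deque: offer the new element, poll the oldest ones until the size limit holds, then hand out a copy so that later mutations cannot alter a window that was already emitted. The following JDK-only sketch isolates that bookkeeping over a plain Iterator. It is an illustration under assumptions rather than the answer's code; the class DequeWindow and the method windows are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical JDK-only sketch of the offer / prune / copy cycle.
class DequeWindow {

    // Emits one window per source element; at most `size` elements are retained.
    // Note: ArrayDeque rejects null elements.
    static <T> List<List<T>> windows(Iterator<T> source, int size) {
        Deque<T> deque = new ArrayDeque<>();
        List<List<T>> out = new ArrayList<>();
        while (source.hasNext()) {
            deque.offer(source.next());
            while (deque.size() > size) {
                deque.poll();                  // drop the oldest element
            }
            out.add(new ArrayList<>(deque));   // snapshot, independent of later mutations
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [[1], [1, 2], [2, 3], [3, 4]]
        System.out.println(windows(Arrays.asList(1, 2, 3, 4).iterator(), 2));
    }
}
```

As in the answer, a window is produced for every incoming element, so the first size - 1 windows are shorter than the requested size; filter on the window length if only full windows are wanted.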
Answer 2 (score: 0)
If you are using Reactor Core 3 (I am not sure when this operator was released), you can simply use
Flux.fromStream(IntStream.rangeClosed(1, 10).boxed())
    .buffer(2, 1)
    .skipLast(1)
    // an identity of 1 makes reduce return an Integer instead of an Optional<Integer>
    .map(t -> t.stream().reduce(1, (a, b) -> a * b))
    .subscribe(System.out::println);
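The magic is the buffer(2, 1) part: here maxSize is 2 and skip is 1. Because maxSize is greater than skip, this creates overlapping buffers (that is, a sliding window) over the flux and emits each buffer as a list. The skipLast(1) is needed because the last buffer contains only a single element (10), so it has to be skipped.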
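To see why a trailing partial buffer shows up, the overlapping behaviour can be emulated on a plain list. The sketch below is not Reactor's implementation; it assumes, in line with what this answer describes for the size-2 case, that every buffer still open at completion is emitted as is. The class OverlappingBuffers and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical JDK-only emulation of overlapping buffers (maxSize > skip).
class OverlappingBuffers {

    // A new buffer starts every `skip` elements and holds up to `maxSize` elements.
    static <T> List<List<T>> buffer(List<T> input, int maxSize, int skip) {
        List<List<T>> out = new ArrayList<>();
        for (int start = 0; start < input.size(); start += skip) {
            int end = Math.min(start + maxSize, input.size());
            out.add(new ArrayList<>(input.subList(start, end)));
        }
        return out;
    }

    // Keeps only complete windows, whatever the window size is.
    static <T> List<List<T>> fullWindows(List<T> input, int maxSize) {
        List<List<T>> out = new ArrayList<>();
        for (List<T> candidate : buffer(input, maxSize, 1)) {
            if (candidate.size() == maxSize) {
                out.add(candidate);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [[1, 2], [2, 3], [3, 4], [4]]
        System.out.println(buffer(Arrays.asList(1, 2, 3, 4), 2, 1));
        // prints [[1, 2, 3], [2, 3, 4]]
        System.out.println(fullWindows(Arrays.asList(1, 2, 3, 4), 3));
    }
}
```

Under that assumption a window size of n leaves n - 1 partial buffers at the end, so skipLast(1) is only enough for n = 2; filtering on the buffer size, as the first answer does, works for any n.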