Given input data such as:
time, part, data
0, a, 3
1, a, 4
2, b, 10
3, b, 20
3, a, 5
and the transformation:
stream.keyBy(_.part).scan(0)((s, d) => s + d)
I would like to get:
0, a, 3
1, a, 7
2, b, 10
3, b, 30
3, a, 12
I tried partitioning the stream with groupAdjacentBy, but it got too complicated, because I had to carry non-trivial per-key state across chunks.
I am wondering whether there is something similar to Flink DataStream's keyBy, or a simpler way to implement this?
Answer 0 (score: 0)
OK, I found an interesting solution (although it does not work for flatten).
Answer 1 (score: 0)
As mentioned above, the problem can be solved by "partitioning" the scan operation itself:
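A minimal plain-Scala sketch of such a partitioned scan: a single `scanLeft` over the whole stream whose state is a map from key to that key's running sum, so each key is effectively scanned independently. The `Element` case class and its field names are assumptions inferred from the output below; on an fs2 `Stream`, `mapAccumulate` can thread the same map-shaped state.

```scala
// Hypothetical element type, inferred from the printed output.
case class Element(time: Int, part: String, data: Int)

val input = List(
  Element(0, "a", 3),
  Element(1, "a", 4),
  Element(2, "b", 10),
  Element(3, "b", 20),
  Element(3, "a", 5))

// One logical scan over the whole stream; the state is a map
// from key (part) to that key's running sum, so each key gets
// its own "partition" of the scan state.
val result = input
  .scanLeft((Map.empty[String, Int], Option.empty[Element])) {
    case ((sums, _), e) =>
      val total = sums.getOrElse(e.part, 0) + e.data
      (sums.updated(e.part, total), Some(e.copy(data = total)))
  }
  .flatMap { case (_, out) => out } // drop the seed, keep the emitted elements

result.foreach(println)
```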
Output:
Element(0,'a,3)
Element(1,'a,7)
Element(2,'b,10)
Element(3,'b,30)
Element(3,'a,12)
Answer 2 (score: 0)
I did something like this: split first, then merge. I don't yet know how to return two separate streams; I only know how to process them in one place and then merge them back together.
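The split-then-merge idea described above can be sketched in plain Scala as follows: tag each element with its arrival position, split into one group per key, run an independent running sum over each group, then merge back in arrival order. The `Element` case class is the same assumed type as in the question's data; on fs2 streams one would replace `groupBy` with per-key filtered substreams and a merge.

```scala
// Hypothetical element type, inferred from the question's data.
case class Element(time: Int, part: String, data: Int)

val input = List(
  Element(0, "a", 3),
  Element(1, "a", 4),
  Element(2, "b", 10),
  Element(3, "b", 20),
  Element(3, "a", 5))

// Tag each element with its original position so the final merge
// can restore arrival order after the per-key scans.
val indexed = input.zipWithIndex

val merged = indexed
  .groupBy { case (e, _) => e.part } // split: one group per key
  .values
  .flatMap { group =>
    // Independent running sum within this key's group; the scan
    // state is (last emitted element with its index, running sum).
    group
      .scanLeft((Option.empty[(Element, Int)], 0)) {
        case ((_, sum), (e, i)) =>
          val total = sum + e.data
          (Some((e.copy(data = total), i)), total)
      }
      .flatMap { case (out, _) => out } // drop the seed
  }
  .toList
  .sortBy { case (_, i) => i } // merge: restore arrival order
  .map { case (e, _) => e }
```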