I am converting the following Python function, which does rainflow cycle counting:
from collections import deque


def extract_cycles(series):
    """
    Returns two lists: the first one containing full cycles and the second
    containing one-half cycles. The cycles are extracted from the iterable
    *series* according to section 5.4.4 in ASTM E1049 (2011).
    """
    points = deque()
    full, half = [], []
    for x in reversals(series):
        points.append(x)
        while len(points) >= 3:
            # Form ranges X and Y from the three most recent points
            X = abs(points[-2] - points[-1])
            Y = abs(points[-3] - points[-2])
            if X < Y:
                # Read the next point
                break
            elif len(points) == 3:
                # Y contains the starting point
                # Count Y as one-half cycle and discard the first point
                half.append(Y)
                points.popleft()
            else:
                # Count Y as one cycle and discard the peak and the valley of Y
                full.append(Y)
                last = points.pop()
                points.pop()
                points.pop()
                points.append(last)
    else:
        # Count the remaining ranges as one-half cycles
        while len(points) > 1:
            half.append(abs(points[-2] - points[-1]))
            points.pop()
    return full, half
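
For completeness, here is a minimal usage sketch. The reversals helper is referenced above but not shown, so the function below is only a simplified stand-in (first point, interior turning points, last point) so that extract_cycles can be run; it is not the real implementation.

def reversals(series):
    # Simplified stand-in for the real reversals() helper: yields the first
    # point, every interior turning point, and the last point.
    series = list(series)
    yield series[0]
    for prev, cur, nxt in zip(series, series[1:], series[2:]):
        if (cur - prev) * (nxt - cur) < 0:
            yield cur
    yield series[-1]

series = [-2, 1, -3, 5, -1, 3, -4, 4, -2]   # arbitrary load history
full, half = extract_cycles(series)
print(full)   # ranges counted as full cycles
print(half)   # ranges counted as half cycles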
However, I have been struggling to do this the Spark way. Initially I thought window functions would work, but there is no way to keep a running total that the next row can reference.
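
To illustrate, here is a rough sketch of the window attempt (the DataFrame, column names, and data are made up for illustration): lag/lead make the neighbouring points easy to reach, but nothing lets the expression carry the deque-like state that extract_cycles mutates as it counts cycles.

from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Toy data: one load series identified by "series_id", ordered by "seq".
df = spark.createDataFrame(
    [("a", i, float(v)) for i, v in enumerate([-2, 1, -3, 5, -1, 3, -4, 4, -2])],
    ["series_id", "seq", "value"],
)

w = Window.partitionBy("series_id").orderBy("seq")

# Neighbouring points are easy to reach with lag/lead...
with_neighbours = (
    df.withColumn("prev", F.lag("value").over(w))
      .withColumn("next", F.lead("value").over(w))
)

# ...but the X/Y comparison in extract_cycles depends on a deque whose contents
# change as cycles are counted, and no window expression can thread that
# running state from one row into the next.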
Should I be looking into another approach? It seems like iterating over the rows is my only option, but that defeats the purpose of Spark.
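
To make the question concrete: is something like the following grouped-pandas-UDF sketch (applyInPandas, Spark 3+) the idiomatic route, or is there a more native approach? The column names and output schema are made up, and it reuses the toy df from the sketch above.

import pandas as pd

def cycles_per_series(pdf: pd.DataFrame) -> pd.DataFrame:
    # The sequential algorithm runs once per series, not once per row,
    # so Spark still parallelises across series.
    pdf = pdf.sort_values("seq")
    full, half = extract_cycles(pdf["value"].tolist())
    return pd.DataFrame({
        "series_id": pdf["series_id"].iloc[0],
        "rng": [float(r) for r in full + half],
        "count": [1.0] * len(full) + [0.5] * len(half),
    })

result = df.groupBy("series_id").applyInPandas(
    cycles_per_series, schema="series_id string, rng double, count double"
)
result.show()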