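Given a pandas Series a, for each value a[i] I need to count how many values in a[i-window:i-1] are greater than a[i].
The code below does the job with a Python for loop, which is slow for serious computational workloads.
Does pandas provide similar functionality, perhaps backed by optimized NumPy routines?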
import numpy as np
import pandas
window = 30 # any arbitrary window
a = pandas.Series(np.random.rand(100)) # dummy variable, arbitrary length
counter = pandas.Series(data=np.nan, index=a.index)
for i in a.index[window:]:
    counter[i] = (a[i-window:i-1] < a[i]).sum()
print(counter)
Answer 0 (score: 3)
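You can use pd.rolling_apply. In current pandas, where pd.rolling_apply has been removed, the equivalent is the .rolling(...).apply(...) method shown here:
import numpy as np
import pandas as pd
window = 30
df = pd.DataFrame(np.random.randn(100), columns=['Data'])
# Each window holds window+1 values; s[-1] is the current value.
# Count how many values in the window are less than the current one.
counts = df.rolling(window + 1).apply(lambda s: (s < s[-1]).sum(), raw=True)
Be sure to add one to the window size, because each window includes the current value as its last element.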
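If the lambda is still too slow, the same count can probably be done without any per-window Python call. Below is a minimal sketch, assuming NumPy 1.20 or later (for sliding_window_view). The function name count_less_than_current is made up for illustration. It compares against the previous window values, including a[i-1], as in the rolling approach above. That is one more value than the slice in the question's loop, which stops before i-1.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def count_less_than_current(a: pd.Series, window: int) -> pd.Series:
    """Hypothetical helper: for each position i >= window, count how many
    of the previous `window` values are less than the value at i."""
    values = a.to_numpy()
    result = pd.Series(np.nan, index=a.index)
    if len(values) <= window:
        # Not enough data for even one full window.
        return result
    # Each row holds window + 1 consecutive values; the last column is
    # the current value, the other columns are the values preceding it.
    windows = sliding_window_view(values, window + 1)
    counts = (windows[:, :-1] < windows[:, -1:]).sum(axis=1)
    # The first `window` positions have no full history and stay NaN.
    result.iloc[window:] = counts
    return result


if __name__ == "__main__":
    a = pd.Series(np.random.rand(100))
    print(count_less_than_current(a, window=30))
```

To count values greater than the current one, as the question's text describes, swap the < for >. The comparison builds a temporary boolean array of shape (len(a) - window, window), so memory use grows with the window size.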