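For example, with a rolling window of 5 over the column below, the rank of the last value in each window should be: NaN, NaN, NaN, NaN, 3, 1, 5, 4, ...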
I tried:
df = pd.DataFrame({'a': [10, 8, 4, 3, 5, 1, 21, 14, 19, 20, 7, 6, 0, 4, 3, 11]})
     a
0   10
1    8
2    4
3    3
4    5
5    1
6   21
7   14
8   19
9   20
10   7
11   6
12   0
13   4
14   3
15  11
It does not give the expected output. Is there a way to do this efficiently for a large number of rows?
Answer 0 (score: 2)
Using SciPy's rankdata function:
from scipy.stats import rankdata
# Rank of the last value within each 5-row window (1 = smallest); the first 4 rows are NaN
df.rolling(5).apply(lambda x: rankdata(x)[-1])
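
Since the question asks about many rows: calling a Python lambda once per window can get slow on large frames. Below is a rough vectorized sketch of the same idea, not a drop-in replacement. The helper name rolling_rank_last is made up for illustration, and the sketch assumes NumPy 1.20 or later (for sliding_window_view), no NaN values in the column, and that ties should get the average rank, which is rankdata's default.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def rolling_rank_last(values, window):
    """Average rank (1 = smallest) of the last value in each trailing window.

    Hypothetical helper for illustration. Positions before the first full
    window are NaN. Assumes the input contains no NaN values.
    """
    arr = np.asarray(values, dtype=float)
    out = np.full(arr.shape, np.nan)
    if arr.size < window:
        return out
    # One row per full window; this is a view, so the data is not copied.
    windows = sliding_window_view(arr, window)
    last = windows[:, -1:]
    less = (windows < last).sum(axis=1)
    equal = (windows == last).sum(axis=1)  # counts the last value itself
    # Average rank among ties: less + (equal + 1) / 2
    out[window - 1:] = less + (equal + 1) / 2
    return out


df = pd.DataFrame({'a': [10, 8, 4, 3, 5, 1, 21, 14, 19, 20, 7, 6, 0, 4, 3, 11]})
result = rolling_rank_last(df['a'].to_numpy(), 5)
df['rank'] = result
print(df)
```

For the sample column this should start NaN, NaN, NaN, NaN, 3, 1, 5, 4, matching the expected output in the question. The boolean comparisons create temporary arrays of size (rows x window), so memory use grows with the window size. Newer pandas versions (1.4 and later, as far as I know) also offer a built-in df.rolling(5).rank(), which may be worth trying first.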