How do I rank each cell of a Pandas column within a rolling window of 5 rows?

Time: 2019-03-11 14:14:08

Tags: pandas

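For example, given this DataFrame:

    df = pd.DataFrame({'a': [10, 8, 4, 3, 5, 1, 21, 14, 19, 20, 7, 6, 0, 4, 3, 11]})

         a
    0   10
    1    8
    2    4
    3    3
    4    5
    5    1
    6   21
    7   14
    8   19
    9   20
    10   7
    11   6
    12   0
    13   4
    14   3
    15  11

the rank of each value within its trailing 5-row window should be: nan, nan, nan, nan, 3, 1, 5, 4, ...

What I tried did not give the expected output. Is there a way to do this that stays efficient for a large number of rows?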

1 answer:

Answer 0: (score: 2)

Use the rankdata function from scipy.stats inside a rolling apply:

    from scipy.stats import rankdata

    # Rank the newest value within each trailing 5-row window;
    # raw=True passes plain NumPy arrays to the function, which is faster.
    df.rolling(5).apply(lambda x: rankdata(x)[-1], raw=True)
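As a self-contained check, the sketch below runs the rolling apply on the question's data and compares it with a vectorized NumPy variant that may scale better to many rows. The helper name `rolling_last_rank` is hypothetical and not part of the original answer. It assumes NumPy 1.20 or later (for `sliding_window_view`), no NaN values in the input, and the "average" tie-breaking rule that `rankdata` uses by default.

```python
import numpy as np
import pandas as pd
from scipy.stats import rankdata

df = pd.DataFrame({'a': [10, 8, 4, 3, 5, 1, 21, 14, 19, 20, 7, 6, 0, 4, 3, 11]})

# Approach from the answer: rank of the newest value in each 5-row window.
via_apply = df['a'].rolling(5).apply(lambda x: rankdata(x)[-1], raw=True)


def rolling_last_rank(values, window):
    """Hypothetical helper: average-method rank of the last element of each window.

    Assumes the input contains no NaN values.
    """
    values = np.asarray(values, dtype=float)
    out = np.full(values.shape[0], np.nan)
    if values.shape[0] < window:
        return out
    # Each row of `windows` is one trailing window; no data is copied.
    windows = np.lib.stride_tricks.sliding_window_view(values, window)
    last = windows[:, -1:]
    less = (windows < last).sum(axis=1)
    equal = (windows == last).sum(axis=1)  # counts the last element itself
    # Average rank among ties: (#smaller) + (#equal + 1) / 2
    out[window - 1:] = less + (equal + 1) / 2
    return out


via_numpy = pd.Series(rolling_last_rank(df['a'].to_numpy(), 5), index=df.index)

print(via_apply.tolist())
print(via_numpy.tolist())
```

Both series start with four NaN values followed by 3, 1, 5, 4, matching the expected output in the question. The NumPy variant avoids calling a Python function once per window, which is usually the bottleneck of `rolling().apply()`. pandas 1.4 and later also provide a built-in `rolling(...).rank()`, which may be simpler where that version is available.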