Question

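I want to find the rank and percentage rank of a value compared against a pandas column. The value is not in the pandas column, and since I'm after faster processing times, going down the pandas route may not be a great idea. Here is my code: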

# assumed imports (not shown in the original post)
import numpy as np
import pandas.io.sql as psql

sqla  = ("SELECT col FROM table") #25000 rows in each table
df    = psql.read_sql(sqla, conn)

#df as np array
arr   = np.array(df['col'])

#number to rank
rank_this = 1.23456

#insert the value into arr
arr_insert = np.insert(arr,1,rank_this,axis=None)

#sort arr_insert
sorted_array = np.sort(arr_insert)

#find the position of 'rank_this' in 'sorted_array'
# (take the first match; float() cannot convert a list directly)
position = float(np.flatnonzero(sorted_array == rank_this)[0])

#find percentage rank of that value compared to list
foo = (1-(1-round(position/len(sorted_array),6)))*100

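As you might imagine, it's slow, and it needs to run a few million times. I'd rather not go into Cython territory or anything too time-consuming, but I'm definitely open to more pandas. Thanks in advance for any suggestions.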
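For reference, here is a minimal sketch of one way the steps above might be sped up, under the assumption that the same column is ranked against many times: sort the column once and use `np.searchsorted` to find where the value would land, instead of inserting and re-sorting on every call. The helper name `percent_rank` and the sample array are hypothetical and only stand in for `df['col']`. The denominator is the column length plus one, to match the original code, which counts the inserted value.

```python
import numpy as np

def percent_rank(sorted_col, value):
    """Return (position, percent rank) of value as if it were inserted into sorted_col."""
    # Index at which value would be inserted to keep sorted_col sorted;
    # this equals the number of elements strictly less than value.
    position = int(np.searchsorted(sorted_col, value, side="left"))
    # Same denominator as the original code: column length plus the inserted value.
    pct = round(position / (len(sorted_col) + 1), 6) * 100
    return position, pct

# Hypothetical stand-in for np.array(df['col'])
arr = np.array([0.5, 2.0, 1.0, 3.0])
sorted_col = np.sort(arr)  # sort once, reuse for every lookup

position, pct = percent_rank(sorted_col, 1.23456)
```

`np.searchsorted` also accepts an array of values, so many lookups against the same sorted column can be done in a single call rather than in a Python loop.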

Compare a value against a Pandas column

0 answers: