Question

I want to group my values (CPA%) by specific ranges (0,1; 0,2; etc). Right now my code looks like this:

import pandas as pd
import psycopg2

conn = psycopg2.connect("dbname=monty user=postgres host=localhost password=postgres")
cur = conn.cursor()
cur.execute("SELECT * FROM binance.zrxeth_ob_indicators;")
row = cur.fetchall()
df = pd.DataFrame(row,columns=['timestamp', 'topAsk', 'topBid', 'CPA', 'midprice', 'CPB', 'spread', 'CPA%', 'CPB%'])
pd.cut(df,0.001)

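My output looks like this:

How can I group these values into specific ranges and count them? I'm new to the pandas library and don't really understand how to use it...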

Answer 1

You don't need cut; use // together with value_counts:

(df['CPA%']//0.001).value_counts()
Out[628]: 
13.0    2
16.0    2
22.0    1
8.0     1
7.0     1
5.0     1
Name: CPA%, dtype: int64

Let's try another option:

import numpy as np 

np.floor(df['CPA%']*1000).value_counts()
Out[637]: 
13.0    2
16.0    2
22.0    1
8.0     1
7.0     1
5.0     1
Name: CPA%, dtype: int64
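
The bin numbers above (5.0, 7.0, 13.0, ...) are multiples of the bin width, so bin k covers the values from k * 0.001 up to (but not including) (k + 1) * 0.001. Here is a minimal sketch of turning those indices back into readable ranges. The sample values and the names bin_width and summary are made up for illustration and are not from the question's table.

```python
import pandas as pd

# Hypothetical sample values standing in for the real 'CPA%' column.
df = pd.DataFrame({"CPA%": [0.0055, 0.0071, 0.0132, 0.0138, 0.0164]})

bin_width = 0.001  # assumed bin size, taken from the question's 0.001 step

# Floor division gives the index of the bin each value falls into.
bin_index = (df["CPA%"] // bin_width).astype(int)

# Count rows per bin and order the result by bin index rather than by count.
counts = bin_index.value_counts().sort_index()

# Translate each bin index k into the half-open range [k * width, (k + 1) * width).
k = counts.index.to_numpy()
summary = pd.DataFrame({
    "lower": k * bin_width,
    "upper": (k + 1) * bin_width,
    "count": counts.to_numpy(),
})
print(summary)
```

Values that sit exactly on a bin edge can land in the neighbouring bin because of floating-point rounding, so it may be worth checking a few edge cases on the real data.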

Answer 2

I'm not sure whether this answers your question, but a quick solution is to create a new column of range groups in pandas. Like this:

df.loc[:,'range_group'] = np.where(df.CPA >0.75, 1, np.where(df.CPA > 0.5, 2, np.where(df.CPA> 0.25, 3, 4)))

Then you group by that column, for example to count the rows in each range group:

df.groupby('range_group').CPA.count()

Just replace count() with whatever function you need. Is this what you want?
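
As a rough illustration of swapping count() for other functions, the sketch below applies the same nested np.where grouping to a small made-up CPA column and then computes several aggregates at once with agg. The sample values and the name summary are assumptions for the example only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values standing in for the real 'CPA' column.
df = pd.DataFrame({"CPA": [0.9, 0.8, 0.6, 0.3, 0.1, 0.2]})

# Same thresholds as above: group 1 is > 0.75, group 2 is > 0.5,
# group 3 is > 0.25, and everything else falls into group 4.
df["range_group"] = np.where(
    df["CPA"] > 0.75, 1,
    np.where(df["CPA"] > 0.5, 2, np.where(df["CPA"] > 0.25, 3, 4)),
)

# agg accepts a list of function names, so count() is just one option.
summary = df.groupby("range_group")["CPA"].agg(["count", "mean", "max"])
print(summary)
```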

Based on the comments below, it seems you need this:

steps = [0,0.001, 0.002, 0.003, ....,1]
df.groupby(pd.cut(df.CPA, steps)).count()
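
One possible way to build the list of steps without typing it out is np.linspace. The sketch below assumes the values lie between 0 and 1 and uses the 'CPA%' column from the question; the sample data and the names binned and counts are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values standing in for the real 'CPA%' column.
df = pd.DataFrame({"CPA%": [0.0055, 0.0071, 0.0132, 0.0138, 0.0164]})

# 1001 evenly spaced edges from 0 to 1 give 1000 bins of width 0.001.
# The 0..1 range is an assumption; adjust it to the actual data.
steps = np.linspace(0, 1, 1001)

# pd.cut labels each value with the interval it falls into.
# Intervals are right-closed by default, e.g. (0.005, 0.006].
binned = pd.cut(df["CPA%"], bins=steps)

# observed=True keeps only the intervals that actually contain rows,
# instead of listing all 1000 bins with mostly zero counts.
counts = df.groupby(binned, observed=True).size()
print(counts)
```

Because the default intervals are open on the left, a value of exactly 0 would not be assigned to any bin; passing include_lowest=True to pd.cut covers that case.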

Pandas: count values in ranges of 0.001, i.e. count the values from 0 to 0.001, then from 0.001 to 0.002, and so on
