How to split numeric bins and find the mean of each bin

Time: 2019-08-15 13:57:28

Tags: python-3.x pandas

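I am trying to extract a single ID from a large data frame, bin its prices into ranges, count the rows in each range, and compute each bin's mean. I can't find a way to get the price range out of new_df in order to compute the bin mean. I even tried splitting and stacking the price range, but I still can't access it. My code is below. Can anyone suggest something?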

Sample data frame

Id          price    price_range                    
11111333    30.0    (0.0, 50.0]
11111333    34.0    (0.0, 50.0]
11111333    80.0    (50.0, 100.0]
11111333    25.0    (0.0, 50.0]
11111333    13.0    (0.0, 50.0]
11111333    17.0    (0.0, 50.0]
11111333    42.0    (0.0, 50.0]
11111333    20.0    (0.0, 50.0]
11111333    210.0   (200.0, 250.0]
22222111    30.0    (0.0, 50.0]
22222111    134.0   (100.0, 150.0]
22222111    1080.0  (1050.0, 1100.0]
22222111    25.0    (0.0, 50.0]
22222111    413.0   (400.0, 450.0]
22222111    117.0   (100.0, 150.0]
22222111    12.0    (0.0, 50.0]
22222111    60.0    (50.0, 100.0]
22222111    110.0   (100.0, 150.0]

import numpy as np
import pandas as pd

#generate bin range (the sample column is named "price")
x_range=np.arange(0,df["price"].max()+50,50)

#add new column price_range with values
df["price_range"]=pd.cut(df["price"],bins=x_range)

#get value counts of price_range (df is assumed to be filtered to one Id)
new_df=df["price_range"].value_counts().to_frame("range_cnt")

new_df          
            range_cnt
(0.0, 50.0]     7
(50.0, 100.0]   1
(200.0, 250.0]  1

#split price range_cnt
out=new_df["range_cnt"].str.split(',\s+', expand=True).stack()

(0.0, 50.0]    0    7
(50.0, 100.0]  0    1
(200.0, 250.0] 0    1

dtype: object

#When I try to access the first row, I get only 7 instead of (0.0, 50.0]
out[1]
0    7
dtype: object
Below is the expected format
Id          price_range         count   mean            
11111333    (0.0, 50.0]         7       25       
            (50.0, 100.0]       1       75
            (200.0, 250.0]      1       225

22222111    (0.0, 50.0]         3       25
            (50.0, 100.0]       1       75
            (100.0, 150.0]      3       125
            (400.0, 450.0]      1       425
            (1050.0, 1100.0]    1       1075

1 Answer:

Answer 0 (score: 1)

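Here is one way. The price ranges are stored in the index of new_df as Interval objects, so each bin's midpoint can be computed from its left and right edges: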

new_df['mean']=new_df.index.map(lambda  x : (x.left+x.right)/2)
new_df
Out[121]: 
            price_range   mean
(100, 150]            2  125.0
(150, 200]            1  175.0
(50, 100]             1   75.0
(0, 50]               0   25.0
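
To get the per-Id layout the question asks for, the same midpoint idea can be combined with a groupby. The following is only a sketch, not the answerer's code: it rebuilds the sample data from the question, assumes the column to bin is price, and uses the name result (chosen here for illustration) for the output frame.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (Id and price columns only).
df = pd.DataFrame({
    "Id": [11111333] * 9 + [22222111] * 9,
    "price": [30.0, 34.0, 80.0, 25.0, 13.0, 17.0, 42.0, 20.0, 210.0,
              30.0, 134.0, 1080.0, 25.0, 413.0, 117.0, 12.0, 60.0, 110.0],
})

# Bin edges in steps of 50, as in the question.
edges = np.arange(0, df["price"].max() + 50, 50)
df["price_range"] = pd.cut(df["price"], bins=edges)

# Count rows per (Id, price_range); observed=True skips bins with no rows.
result = (
    df.groupby(["Id", "price_range"], observed=True)
      .size()
      .rename("count")
      .reset_index()
)

# Each price_range value is a pandas Interval, so the bin midpoint
# comes straight from its left and right edges.
result["mean"] = [(iv.left + iv.right) / 2 for iv in result["price_range"]]

# Index by Id and price_range to mirror the expected format.
result = result.set_index(["Id", "price_range"])
print(result)
```

Note that the mean column here is the midpoint of each bin, which is what the expected output in the question shows. If the average of the actual prices falling in each bin is wanted instead, aggregating the price column with "mean" in the same groupby should give that.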