Pandas groupby: find the difference between the max and min values

Time: 2020-12-20 11:32:27

Tags: python pandas numpy

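I have a DataFrame that I aggregate as shown below. However, instead of separate max and min columns, I want their difference (max - min) per group.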

dnm=df.groupby('Type').agg({'Vehicle_Age': ['max','min']})

Expected: one value per Type equal to max - min of Vehicle_Age (the original image of the expected output is not preserved).

3 Answers:

Answer 0 (score: 5)

You can use np.ptp ("peak to peak"), which computes max - min for you:

df.groupby('Type').agg({'Vehicle_Age': np.ptp})

Or,

df.groupby('Type')['Vehicle_Age'].agg(np.ptp)

if you want a Series as the output.
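For a self-contained illustration, here is a minimal sketch on a small made-up frame (the `Type` and `Vehicle_Age` values are hypothetical, not the asker's data). It writes the range out explicitly as a lambda, and also applies `np.ptp` to each group's NumPy array, which is assumed here as a conservative way of calling it that does not depend on how a given pandas/NumPy version handles `np.ptp` on a Series.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the question's actual data is not shown.
df = pd.DataFrame({
    "Type": ["Small", "Small", "Minivan", "Minivan", "Minivan"],
    "Vehicle_Age": [2.0, 13.0, 9.0, 14.0, 11.0],
})

grouped = df.groupby("Type")["Vehicle_Age"]

# Range (max - min) per group, written out explicitly.
range_lambda = grouped.agg(lambda s: s.max() - s.min())

# Same computation via np.ptp on each group's underlying NumPy array.
range_ptp = grouped.agg(lambda s: np.ptp(s.to_numpy()))

print(range_lambda)
```

Both forms return a Series indexed by Type; wrap the column selection in a list (`df.groupby("Type")[["Vehicle_Age"]]`) if a DataFrame is preferred.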

Answer 1 (score: 3)

Just subtract one from the other:

grouping = df.groupby('Type')
dnm = grouping.max() - grouping.min()
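Note that subtracting the whole grouped frames applies to every non-key column, so it can raise an error if the frame also contains non-numeric columns. A minimal sketch, assuming a hypothetical frame with an extra string column `Model`, that restricts the subtraction to `Vehicle_Age`:

```python
import pandas as pd

# Hypothetical data with a non-numeric column alongside Vehicle_Age.
df = pd.DataFrame({
    "Type": ["Small", "Small", "Minivan", "Minivan"],
    "Model": ["A", "B", "C", "D"],
    "Vehicle_Age": [2.0, 13.0, 9.0, 14.0],
})

# Select the numeric column first, then subtract min from max per group.
ages = df.groupby("Type")["Vehicle_Age"]
dnm = ages.max() - ages.min()

print(dnm)
```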

@cs95's answer is the right way to do it, and it was also slightly faster in the timings below.

Setup:

df = pd.DataFrame({'a':np.arange(100),'Type':[1 if i %2 ==0 else 0 for i in range(100)]})

@cs95:

%timeit df.groupby('Type').agg({'a': np.ptp}) 

1.29 ms ± 39.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

versus

%%timeit  
grouping = df.groupby('Type') 
dnm = grouping.max() - grouping.min() 

1.57 ms ± 299 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
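The `%timeit` magics above only work in IPython or Jupyter. As a rough sketch of the same comparison in a plain script, the standard-library `timeit` module can be used; the repetition count `n` is an arbitrary choice, the aggregation is written as an explicit lambda rather than `np.ptp` (an assumption made for portability across versions), and absolute numbers will differ by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

# Same setup as in the answer above.
df = pd.DataFrame({
    "a": np.arange(100),
    "Type": [1 if i % 2 == 0 else 0 for i in range(100)],
})

def with_agg():
    # Per-group range via a single aggregation call.
    return df.groupby("Type")["a"].agg(lambda s: s.max() - s.min())

def with_subtraction():
    # Per-group range via two reductions and a subtraction.
    grouping = df.groupby("Type")
    return grouping.max() - grouping.min()

n = 200  # arbitrary number of repetitions
t_agg = timeit.timeit(with_agg, number=n) / n
t_sub = timeit.timeit(with_subtraction, number=n) / n
print(f"agg: {t_agg * 1e3:.3f} ms, subtraction: {t_sub * 1e3:.3f} ms per call")
```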

Answer 2 (score: 2)

You can perform a basic element-wise operation on the table's columns, like this:


import pandas as pd

# This is just setup to replicate your example
df = pd.DataFrame([[14, 7], [15, .25], [14, 9], [13, 2], [14, 4]], index=['Large SUV', 'Mid-size', 'Minivan', 'Small', 'Small SUV'], columns = ['max', 'min'])

print(df)

#             max   min
# Large SUV   14  7.00
# Mid-size    15  0.25
# Minivan     14  9.00
# Small       13  2.00
# Small SUV   14  4.00

# This is the operation that will give you the values you want
diff = df['max'] - df['min']

print(diff)

# Large SUV     7.00
# Mid-size     14.75
# Minivan       5.00
# Small        11.00
# Small SUV    10.00
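The frame above uses flat `max` and `min` columns, whereas the aggregation in the question produces two-level column labels. A minimal sketch of the same subtraction applied directly to that aggregated result, using made-up sample values:

```python
import pandas as pd

# Hypothetical sample data; the question's actual data is not shown.
df = pd.DataFrame({
    "Type": ["Small", "Small", "Minivan", "Minivan"],
    "Vehicle_Age": [2.0, 13.0, 9.0, 14.0],
})

dnm = df.groupby("Type").agg({"Vehicle_Age": ["max", "min"]})

# The columns are a MultiIndex: ('Vehicle_Age', 'max') and ('Vehicle_Age', 'min').
diff = dnm[("Vehicle_Age", "max")] - dnm[("Vehicle_Age", "min")]

print(diff)
```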
