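I want to pass lists of DataFrame column names to a groupby aggregation and, after grouping, apply a different summary function to each list of columns.
Here is a naive attempt that fails: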
import pandas as pd
import seaborn as sns
mpg = sns.load_dataset('mpg')
variables_to_mean = ['cylinders', 'displacement']
variables_to_median = ['weight', 'horsepower']
mpg.groupby(['model_year', 'origin']).agg({ variables_to_mean : 'mean', variables_to_median : 'median'})
TypeError: unhashable type: 'list'
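How can I achieve this?
Answer 0 (score: 2)
A list cannot be used as a dictionary key, which is what raises the TypeError. Instead, build one dictionary per list with dict.fromkeys, mapping each column name to its function, and merge them: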
variables_to_mean = ['cylinders', 'displacement']
variables_to_median = ['weight', 'horsepower']
d = {**dict.fromkeys(variables_to_mean, 'mean'), **dict.fromkeys(variables_to_median, 'median')}
print(d)
{'cylinders': 'mean', 'displacement': 'mean', 'weight': 'median', 'horsepower': 'median'}
df = mpg.groupby(['model_year', 'origin']).agg(d)
print(df.head())
                   cylinders  displacement  weight  horsepower
model_year origin
70         europe   4.000000    107.800000  2375.0        90.0
           japan    4.000000    105.000000  2251.0        91.5
           usa      7.636364    336.909091  3651.0       167.5
71         europe   4.000000     95.000000  2069.5        73.0
           japan    4.000000     88.250000  1951.5        78.5
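
If more than two groups of columns are involved, the same idea can be wrapped in a small helper. The following is only a minimal sketch: build_agg_spec is a hypothetical name, not a pandas function, and toy is a made-up stand-in for the mpg dataset so that the example runs without downloading anything. The helper also raises an error if a column is assigned to two functions, a case that the dictionary merge above would resolve silently by keeping the last entry.

```python
import pandas as pd


def build_agg_spec(groups):
    """Turn {function name: [column names]} into {column name: function name}.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    spec = {}
    for func, columns in groups.items():
        for column in columns:
            if column in spec:
                # A plain dict merge would silently keep the last function.
                raise ValueError(
                    f"column {column!r} is assigned to more than one function"
                )
            spec[column] = func
    return spec


# Made-up rows standing in for the mpg dataset (assumption, not real data).
toy = pd.DataFrame({
    'model_year': [70, 70, 70, 71],
    'origin': ['usa', 'usa', 'japan', 'usa'],
    'cylinders': [8, 6, 4, 8],
    'displacement': [300.0, 200.0, 100.0, 350.0],
    'weight': [3500, 3000, 2000, 4000],
    'horsepower': [150.0, 100.0, 90.0, 170.0],
})

spec = build_agg_spec({
    'mean': ['cylinders', 'displacement'],
    'median': ['weight', 'horsepower'],
})
result = toy.groupby(['model_year', 'origin']).agg(spec)
print(result)
```

With the real data, mpg.groupby(['model_year', 'origin']).agg(spec) should give the same result as the dictionary d built above, since spec and d hold the same column-to-function mapping.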