How to concatenate two values from the same column

Time: 2017-05-12 09:51:39

Tags: python pandas dataframe

I want to concatenate two values that appear in the same column. Here is my CSV file:

Date,Region,TemperatureMax,TemperatureMin,PrecipitationMax,PrecipitationMin
01/01/2016,Champagne Ardenne,12,6,2.5,0.3
02/01/2016,Champagne Ardenne,13,9,3.9,0.6
03/01/2016,Champagne Ardenne,14,7,22.5,12.5
01/01/2016,Bourgogne,9,5,0.1,0
02/01/2016,Bourgogne,11,8,16.3,4.2
03/01/2016,Bourgogne,10,5,12.2,6.3
01/01/2016,Pays de la Loire,12,6,2.5,0.3
02/01/2016,Pays de la Loire,13,9,3.9,0.6
03/01/2016,Pays de la Loire,14,7,22.5,12.5

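I want Bourgogne and Champagne Ardenne combined into a single region, "Bourgogne Champagne Ardenne", instead of being kept separate, with the mean of TemperatureMax, TemperatureMin, PrecipitationMax and PrecipitationMin computed for each date: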

01/01/2016,Bourgogne Champagne Ardenne,10.5,5.5,1.3,0.15
02/01/2016,Bourgogne Champagne Ardenne,12,8.5,10.1,2.4
03/01/2016,Bourgogne Champagne Ardenne,12,6,17.35,9.4
01/01/2016,Pays de la Loire,12,6,2.5,0.3
02/01/2016,Pays de la Loire,13,9,3.9,0.6
03/01/2016,Pays de la Loire,14,7,22.5,12.5

2 answers:

Answer 0: (score: 1)

Use the agg method of groupby:

df.groupby('Date').agg({
    'Region': lambda g: g.sort_values().str.cat(sep=' '),
    'TemperatureMax': 'mean',
    'TemperatureMin': 'mean',
    'PrecipitationMax': 'mean',
    'PrecipitationMin': 'mean'
})

Note that this concatenates the region names in alphabetical order.
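One thing to watch: grouping on Date alone merges every region that shares a date, so with the sample data Pays de la Loire would be folded in as well. The following is a minimal sketch of one way around that, not part of the original answer. It assumes you know up front which regions to merge (the `to_merge` list is a hypothetical name), aggregates only those rows, and then appends the untouched rows back.

```python
import io
import pandas as pd

# Sample data copied from the question.
CSV = """Date,Region,TemperatureMax,TemperatureMin,PrecipitationMax,PrecipitationMin
01/01/2016,Champagne Ardenne,12,6,2.5,0.3
02/01/2016,Champagne Ardenne,13,9,3.9,0.6
03/01/2016,Champagne Ardenne,14,7,22.5,12.5
01/01/2016,Bourgogne,9,5,0.1,0
02/01/2016,Bourgogne,11,8,16.3,4.2
03/01/2016,Bourgogne,10,5,12.2,6.3
01/01/2016,Pays de la Loire,12,6,2.5,0.3
02/01/2016,Pays de la Loire,13,9,3.9,0.6
03/01/2016,Pays de la Loire,14,7,22.5,12.5
"""
df = pd.read_csv(io.StringIO(CSV))

# Assumption: only these two regions should be merged.
to_merge = ['Bourgogne', 'Champagne Ardenne']
mask = df['Region'].isin(to_merge)

# Aggregate the selected rows per date: join names, average the numbers.
merged = df[mask].groupby('Date', as_index=False).agg({
    'Region': lambda g: g.sort_values().str.cat(sep=' '),
    'TemperatureMax': 'mean',
    'TemperatureMin': 'mean',
    'PrecipitationMax': 'mean',
    'PrecipitationMin': 'mean',
})

# Put the rows that were not merged back underneath.
result = pd.concat([merged, df[~mask]], ignore_index=True)
print(result)
```

The lambda sorts the names before joining them, which is why the label comes out as "Bourgogne Champagne Ardenne" regardless of row order in the file.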

Answer 1: (score: 1)

A more general solution is to first replace the region names using a dict, then groupby and aggregate with mean:

d = {'Champagne Ardenne':'Bourgogne Champagne Ardenne', 'Bourgogne':'Bourgogne Champagne Ardenne'}
df['Region'] = df['Region'].replace(d)
df1 = df.groupby(['Date', 'Region'], as_index=False, sort=False).mean()
print (df1)

         Date                       Region  TemperatureMax  TemperatureMin  \
0  01/01/2016  Bourgogne Champagne Ardenne            10.5             5.5
1  02/01/2016  Bourgogne Champagne Ardenne            12.0             8.5
2  03/01/2016  Bourgogne Champagne Ardenne            12.0             6.0
3  01/01/2016             Pays de la Loire            12.0             6.0
4  02/01/2016             Pays de la Loire            13.0             9.0
5  03/01/2016             Pays de la Loire            14.0             7.0

   PrecipitationMax  PrecipitationMin
0              1.30              0.15
1             10.10              2.40
2             17.35              9.40
3              2.50              0.30
4              3.90              0.60
5             22.50             12.50
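If there are several sets of regions to merge, writing the dict by hand gets tedious. Here is a rough sketch of generating it instead, assuming the merged label is simply the sorted names joined by a space. `build_region_map` is a hypothetical helper introduced for illustration, not a pandas function; the replace and groupby steps are the same as above.

```python
import io
import pandas as pd

# Sample data copied from the question.
CSV = """Date,Region,TemperatureMax,TemperatureMin,PrecipitationMax,PrecipitationMin
01/01/2016,Champagne Ardenne,12,6,2.5,0.3
02/01/2016,Champagne Ardenne,13,9,3.9,0.6
03/01/2016,Champagne Ardenne,14,7,22.5,12.5
01/01/2016,Bourgogne,9,5,0.1,0
02/01/2016,Bourgogne,11,8,16.3,4.2
03/01/2016,Bourgogne,10,5,12.2,6.3
01/01/2016,Pays de la Loire,12,6,2.5,0.3
02/01/2016,Pays de la Loire,13,9,3.9,0.6
03/01/2016,Pays de la Loire,14,7,22.5,12.5
"""
df = pd.read_csv(io.StringIO(CSV))


def build_region_map(groups):
    """Hypothetical helper: map each region to its merged label."""
    mapping = {}
    for group in groups:
        label = ' '.join(sorted(group))  # assumed naming convention
        for region in group:
            mapping[region] = label
    return mapping


# One group here; add more lists to merge other sets of regions.
d = build_region_map([['Champagne Ardenne', 'Bourgogne']])

# Rename first, then average rows that now share (Date, Region).
df['Region'] = df['Region'].replace(d)
df1 = df.groupby(['Date', 'Region'], as_index=False, sort=False).mean()
print(df1)
```

Regions that are not in the dict, such as Pays de la Loire, pass through `replace` unchanged, so each of them stays in its own group.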
