How do I get row sums when the column values are the same?

Asked: 2018-09-24 07:54:06

Tags: python-2.7 pandas sum grouping pandas-groupby

I have a dataset like this:

 time(secs) setup
     40     setup1
     30     setup1       
     20     setup1
     10     setup2
     20     setup2 
     10     setup1
     30     setup1
     30     setup2
     40     setup2
     10     setup3
     20     setup3

I want to sum the rows for each run of identical consecutive setup values in the pandas DataFrame:

  time(secs)  setup
    90        setup1
    30        setup2
    40        setup1
    70        setup2
    30        setup3

But when I use the groupby() function:

  df.groupby(['setup']).sum()

the result I get is:

  setup      time 

  setup1      130 
  setup2      100
  setup3       30

Please help me solve this...

Thanks!!!

1 answer:

Answer 0: (score: 1)

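Group by a helper Series and aggregate with sum and first. The helper is built by comparing the setup column with its shifted copy using Series.ne (!=) and then taking the cumsum of the result:

df1 = (df.groupby(df['setup'].ne(df['setup'].shift()).cumsum(), as_index=False)
         .agg({'time(secs)':'sum', 'setup':'first'}))
print (df1)
   time(secs)   setup
0          90  setup1
1          30  setup2
2          40  setup1
3          70  setup2
4          30  setup3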

print (df['setup'].ne(df['setup'].shift()).cumsum())
0     1
1     1
2     1
3     2
4     2
5     3
6     3
7     4
8     4
9     5
10    5
Name: setup, dtype: int32
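
To make the helper Series easier to follow, here is a minimal sketch that splits the one-liner into its three steps. The DataFrame is rebuilt from the question's sample data, and the names previous, is_new_run, run_id and steps are illustrative only, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "time(secs)": [40, 30, 20, 10, 20, 10, 30, 30, 40, 10, 20],
    "setup": ["setup1"] * 3 + ["setup2"] * 2 + ["setup1"] * 2
             + ["setup2"] * 2 + ["setup3"] * 2,
})

# Step 1: the value from the previous row (NaN for the first row).
previous = df["setup"].shift()

# Step 2: True wherever the value differs from the previous row,
# i.e. at the first row of every consecutive run.
is_new_run = df["setup"].ne(previous)

# Step 3: a running count of the True values labels each run 1, 2, 3, ...
run_id = is_new_run.cumsum()

# Show the intermediate columns side by side.
steps = pd.DataFrame({
    "setup": df["setup"],
    "previous": previous,
    "is_new_run": is_new_run,
    "run_id": run_id,
})
print(steps)
```

Because the first row is compared with NaN, it always counts as the start of a run, which is why the labels begin at 1.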

Details:

df['groups'] = df['setup'].ne(df['setup'].shift()).cumsum()
print (df)
    time(secs)   setup  groups
0           40  setup1       1
1           30  setup1       1
2           20  setup1       1
3           10  setup2       2
4           20  setup2       2
5           10  setup1       3
6           30  setup1       3
7           30  setup2       4
8           40  setup2       4
9           10  setup3       5
10          20  setup3       5

df1 = (df.groupby('groups')
         .agg({'time(secs)':'sum', 'setup':'first'})
         .reset_index(drop=True))

A similar solution using the new column:

df1 = (df.groupby(['groups', 'setup'])['time(secs)'].sum()
         .reset_index(level=0, drop=True)
         .reset_index())

print (df1)
   time(secs)   setup
0          90  setup1
1          30  setup2
2          40  setup1
3          70  setup2
4          30  setup3
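
For anyone who wants to run the whole thing end to end, the following is a self-contained sketch under a few assumptions: the sample data is retyped from the question, sum_consecutive_runs is a hypothetical helper name, and the pure-Python cross-check with itertools.groupby (which also groups only consecutive equal keys) is an addition for verification rather than part of the answer above.

```python
from itertools import groupby

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "time(secs)": [40, 30, 20, 10, 20, 10, 30, 30, 40, 10, 20],
    "setup": ["setup1"] * 3 + ["setup2"] * 2 + ["setup1"] * 2
             + ["setup2"] * 2 + ["setup3"] * 2,
})


def sum_consecutive_runs(frame, key="setup", value="time(secs)"):
    """Sum `value` over each run of consecutive equal `key` values.

    Hypothetical helper wrapping the shift/ne/cumsum technique.
    """
    # Label each consecutive run; rename so the label does not share
    # a name with the `key` column.
    run_id = frame[key].ne(frame[key].shift()).cumsum().rename("run_id")
    return (frame.groupby(run_id)
                 .agg({value: "sum", key: "first"})
                 .reset_index(drop=True))


result = sum_consecutive_runs(df)
print(result)

# Cross-check: itertools.groupby only merges adjacent equal keys,
# so it should produce the same (total, setup) pairs.
pairs = zip(df["time(secs)"], df["setup"])
expected = [(sum(t for t, _ in rows), name)
            for name, rows in groupby(pairs, key=lambda row: row[1])]
assert list(zip(result["time(secs)"], result["setup"])) == expected
```

The helper leaves the input DataFrame unchanged, so no temporary groups column has to be added or dropped afterwards.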
