Pandas: combining duplicate rows in a DataFrame

Time: 2017-04-19 09:44:37

Tags: python pandas

I have a DataFrame. Here is part of it:

   member_id           event_time                       event_path  event_duration
0    2333678  2016-12-27 04:17:16  youtube.com/watch?v=w5ZIb05NO58              12
1    2333678  2016-12-27 04:17:26  youtube.com/watch?v=w5ZIb05NO58              12
2    2333678  2016-12-27 04:17:36  youtube.com/watch?v=w5ZIb05NO58              10
3    2333678  2016-12-27 04:17:40  youtube.com/watch?v=w5ZIb05NO58              35
4    5611206  2016-12-30 17:16:01  youtube.com/watch?v=qZrQWA5IsKA              35
5    5611206  2016-12-30 17:16:10  youtube.com/watch?v=qZrQWA5IsKA              12
6    5611206  2016-12-30 17:16:27  youtube.com/watch?v=6YM5UhnElcE              10
7    5611206  2016-12-30 17:16:37  youtube.com/watch?v=6YM5UhnElcE              10
8    5611206  2016-12-30 17:16:47  youtube.com/watch?v=6YM5UhnElcE              10

Desired output:

   member_id           event_time                       event_path  event_duration
0    2333678  2016-12-27 04:17:16  youtube.com/watch?v=w5ZIb05NO58              69
4    5611206  2016-12-30 17:16:01  youtube.com/watch?v=qZrQWA5IsKA              47
6    5611206  2016-12-30 17:16:27  youtube.com/watch?v=6YM5UhnElcE              30

The code I tried does not combine all of the rows.

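2 answers:

Answer 0: (score: 1)

If you want the first event_time in each group, you can use the following (event_path is handled the same way, since it is one of the grouping keys):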

>>> df.groupby([df.member_id, df.event_path]).agg({'event_duration':'sum', 'event_time': 'first'}).reset_index().reindex(columns=df.columns)

   member_id           event_time                       event_path  event_duration
0    2333678  2016-12-27 04:17:16  youtube.com/watch?v=w5ZIb05NO58              69
1    5611206  2016-12-30 17:16:27  youtube.com/watch?v=6YM5UhnElcE              30
2    5611206  2016-12-30 17:16:01  youtube.com/watch?v=qZrQWA5IsKA              47
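
One caveat worth illustrating: 'first' returns the first row of each group in the DataFrame's current row order, which is not necessarily the earliest timestamp. The sketch below uses a few hypothetical rows (not the asker's data) that are deliberately out of chronological order, and shows that sorting by event_time before grouping makes 'first' chronological. It assumes event_time is already a datetime column.

```python
import pandas as pd

# Hypothetical rows, deliberately out of chronological order.
df = pd.DataFrame({
    "member_id": [1, 1, 1],
    "event_path": ["a", "a", "a"],
    "event_time": pd.to_datetime([
        "2016-12-27 04:17:26",
        "2016-12-27 04:17:16",
        "2016-12-27 04:17:36",
    ]),
    "event_duration": [12, 12, 10],
})

agg_spec = {"event_duration": "sum", "event_time": "first"}
keys = ["member_id", "event_path"]

# 'first' follows row order, so here it picks 04:17:26.
unsorted_result = df.groupby(keys).agg(agg_spec).reset_index()

# Sorting first makes 'first' pick the earliest timestamp, 04:17:16.
sorted_result = df.sort_values("event_time").groupby(keys).agg(agg_spec).reset_index()

print(unsorted_result)
print(sorted_result)
```

If the data is already sorted by time within each group, as in the question's sample, both variants give the same result.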

Answer 1: (score: 1)

    df.groupby(['member_id','event_path']).agg({'event_time':'min','event_duration':'sum'}).reset_index()

Output:

   member_id                       event_path           event_time  event_duration
0    2333678  youtube.com/watch?v=w5ZIb05NO58  2016-12-27 04:17:16              69
1    5611206  youtube.com/watch?v=6YM5UhnElcE  2016-12-30 17:16:27              30
2    5611206  youtube.com/watch?v=qZrQWA5IsKA  2016-12-30 17:16:01              47
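
Note that groupby sorts the group keys by default, so the 6YM5UhnElcE row comes before the qZrQWA5IsKA row, unlike the desired output in the question. As a rough sketch (not part of either answer), passing sort=False keeps groups in order of first appearance. The DataFrame is rebuilt from the question's sample rows; named aggregation is used here and assumes pandas 0.25 or later. The original index labels (0, 4, 6) are not preserved by this approach.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "member_id": [2333678] * 4 + [5611206] * 5,
    "event_time": pd.to_datetime([
        "2016-12-27 04:17:16", "2016-12-27 04:17:26",
        "2016-12-27 04:17:36", "2016-12-27 04:17:40",
        "2016-12-30 17:16:01", "2016-12-30 17:16:10",
        "2016-12-30 17:16:27", "2016-12-30 17:16:37",
        "2016-12-30 17:16:47",
    ]),
    "event_path": (
        ["youtube.com/watch?v=w5ZIb05NO58"] * 4
        + ["youtube.com/watch?v=qZrQWA5IsKA"] * 2
        + ["youtube.com/watch?v=6YM5UhnElcE"] * 3
    ),
    "event_duration": [12, 12, 10, 35, 35, 12, 10, 10, 10],
})

result = (
    # sort=False keeps groups in order of first appearance.
    df.groupby(["member_id", "event_path"], sort=False)
      .agg(
          event_time=("event_time", "min"),             # earliest timestamp per group
          event_duration=("event_duration", "sum"),     # total duration per group
      )
      .reset_index()
      .reindex(columns=df.columns)                      # restore the original column order
)
print(result)
```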