How to use the first 3 values of a column in each chunk of a dataframe to label each group

Time: 2019-04-16 23:27:06

Tags: pandas

I have the dataframe shown below:

 import pandas as pd

 df = pd.DataFrame({'type': [1]*15, 'key': ['a','b','c','d','e'] * 3,
                    'value': ['Tom', 'car', 'truck', 7, 7, 'Steve', 'car', 'truck', 4, 6,
                              'Jason', 'car', 'truck', 2, 10]})

Here is the input:

    type    key         value       
     1      a           Tom      
     1      b           car      
     1      c           truck      
     1      d             7                
     1      e             7
     1      a           Steve
     1      b           car
     1      c           truck
     1      d             4
     1      e             6
     1      a           Jason
     1      b           car
     1      c           truck
     1      d             2
     1      e             10

I want to transform this dataframe into the following:

   type    concatenated_first3_value    d     e
    1      Tom_car_truck                7     7
    1      Steve_car_truck              4     6
    1      Jason_car_truck              2     10

How should I do this?

1 answer:

Answer 0 (score: 1)

I use cumsum to create an additional key that helps with the pivot. Here I use unstack, but you could also use pivot or pivot_table.

df['New Key'] = df.key.eq('a').cumsum()  # each 'a' row starts a new chunk
# Note: this assumes there is only one type. If there are several, create the key with
# df.key.eq('a').groupby(df['type']).cumsum() instead.
s = df.set_index(['type', 'New Key', 'key'])['value'].unstack()
# s = df.pivot_table(index=['type', 'New Key'], columns='key', values='value', aggfunc='first')

s['New col'] = s[list('abc')].apply('_'.join, axis=1)  # join the a, b, c values of each row
s.drop(columns=list('abc'), inplace=True)
s  # You can add reset_index() at the end
key           d   e          New col
type New Key                      
1    1        7   7    Tom_car_truck
     2        4   6  Steve_car_truck
     3        2  10  Jason_car_truck
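
For completeness, here is a minimal end-to-end sketch of the same idea that goes all the way to the layout asked for in the question. It is only one possible way to finish the job, not the only one. The names `chunk`, `wide` and `result` are made up for illustration, and the sketch assumes every chunk starts with a row whose key is 'a' and that the a, b and c values can be converted to strings.

```python
import pandas as pd

df = pd.DataFrame({
    'type': [1] * 15,
    'key': ['a', 'b', 'c', 'd', 'e'] * 3,
    'value': ['Tom', 'car', 'truck', 7, 7,
              'Steve', 'car', 'truck', 4, 6,
              'Jason', 'car', 'truck', 2, 10],
})

# Assumption: each 'a' row opens a new chunk, so a running count of
# 'a' rows (restarted for every type) numbers the chunks 1, 2, 3, ...
df['chunk'] = df['key'].eq('a').astype(int).groupby(df['type']).cumsum()

# One row per (type, chunk) and one column per key.
wide = df.set_index(['type', 'chunk', 'key'])['value'].unstack()

# Join the first three values; astype(str) guards against non-string entries.
wide['concatenated_first3_value'] = (
    wide[['a', 'b', 'c']].astype(str).apply('_'.join, axis=1)
)

# Drop the helper columns and flatten the index back into ordinary columns.
result = (
    wide.drop(columns=['a', 'b', 'c'])
        .reset_index()
        .drop(columns='chunk')
        .rename_axis(columns=None)
)
result = result[['type', 'concatenated_first3_value', 'd', 'e']]
print(result)
```

Note that the d and e columns keep the object dtype of the original value column. If numeric columns are needed, something like `result[['d', 'e']].astype(int)` should convert them, provided all of those values really are integers.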