pandas - changing the alignment of a DataFrame

Time: 2018-10-09 15:40:30

Tags: pandas

I created a DataFrame:

In [1]: import pandas as pd 

In [2]: import numpy as np

In [3]: df = pd.DataFrame({ 'Student_ID':['001','002','003','004','005'],
                'Amy'   : ['Amy',np.nan,np.nan,np.nan,'Amy'],
                'Brian' : [np.nan,'Brian',np.nan,np.nan,np.nan],
                'Cat'   : [np.nan,np.nan,np.nan,'Cat',np.nan]},columns=['Student_ID','Amy','Brian','Cat']) 


In [4]: df
Out[4]:
    Student_ID  Amy Brian   Cat
0          001  Amy   NaN   NaN
1          002  NaN Brian   NaN
2          003  NaN   NaN   NaN
3          004  NaN   NaN   Cat
4          005  Amy   NaN   NaN

Next, I want to reduce this to a DataFrame with only two columns, Student_ID and Name. What is the exact code to transform it into the following?

In [5]: df
Out[5]: 
  Student_ID    Name    
0        001     Amy    
1        002   Brian    
2        003     NaN    
3        004     Cat
4        005     Amy    

3 Answers:

Answer 0 (score: 4)

You can use dot (note that a row with no name comes back as an empty string rather than NaN):

df.iloc[:,1:].notna().dot(df.columns[1:])
Out[78]: 
0      Amy
1    Brian
2         
3      Cat
4      Amy
dtype: object
#df['name']=df.iloc[:,1:].notna().dot(df.columns[1:])

Or bfill along the rows, then take the first column:

df.iloc[:,1:].bfill(axis=1).iloc[:,0]
Out[82]: 
0      Amy
1    Brian
2      NaN
3      Cat
4      Amy

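For anyone who wants to run both ideas from this answer end to end, here is a minimal self-contained sketch. It rebuilds the question's frame; the variable names (names, via_dot, via_bfill, result) are my own, and converting the empty string to NaN with where is an extra step not in the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Student_ID": ["001", "002", "003", "004", "005"],
        "Amy": ["Amy", np.nan, np.nan, np.nan, "Amy"],
        "Brian": [np.nan, "Brian", np.nan, np.nan, np.nan],
        "Cat": [np.nan, np.nan, np.nan, "Cat", np.nan],
    }
)

names = df.iloc[:, 1:]  # every column except Student_ID

# dot: multiplying the boolean mask by the column labels keeps a label
# where the cell is non-NaN and contributes '' elsewhere.
via_dot = names.notna().dot(names.columns)
# Rows without any name come back as '', so turn those into NaN (added step).
via_dot = via_dot.where(via_dot != "")

# bfill: pull the first non-NaN value of each row into the first column.
via_bfill = names.bfill(axis=1).iloc[:, 0]

# Assemble the two-column frame asked for in the question.
result = pd.DataFrame({"Student_ID": df["Student_ID"], "Name": via_bfill})
print(result)
```

Both approaches assume at most one non-NaN name per row; with several, dot would concatenate the labels and bfill would keep only the leftmost one.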
Answer 1 (score: 4)

You can use groupby/first, because first selects the first non-NaN item in each group:

In [146]: df.set_index('Student_ID').unstack().groupby(level='Student_ID').first().rename('Name').reset_index()
Out[146]: 
  Student_ID   Name
0        001    Amy
1        002  Brian
2        003    NaN
3        004    Cat
4        005    Amy
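The one-liner is easier to follow when split into steps. The sketch below is the same chain with intermediate names (long, first_name, out) that I made up for illustration; it rebuilds the question's frame so it can run on its own.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Student_ID": ["001", "002", "003", "004", "005"],
        "Amy": ["Amy", np.nan, np.nan, np.nan, "Amy"],
        "Brian": [np.nan, "Brian", np.nan, np.nan, np.nan],
        "Cat": [np.nan, np.nan, np.nan, "Cat", np.nan],
    }
)

# Step 1: move the name columns into the index, giving one long Series
# indexed by (column label, Student_ID).
long = df.set_index("Student_ID").unstack()

# Step 2: group by student and keep the first non-null entry per group;
# a student with no name at all gets a missing value.
first_name = long.groupby(level="Student_ID").first()

# Step 3: name the values and turn the index back into a column.
out = first_name.rename("Name").reset_index()
print(out)
```

Note that groupby sorts by Student_ID, so the row order matches the input here only because the IDs were already sorted.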

Answer 2 (score: 2)

Use .lookup
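The code for this answer is not preserved in the text, so what follows is only a sketch of the row-wise lookup idea, not the answerer's original code. DataFrame.lookup was deprecated in pandas 1.2 and removed in 2.0, so the sketch uses NumPy positional indexing instead; all variable names here are assumptions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Student_ID": ["001", "002", "003", "004", "005"],
        "Amy": ["Amy", np.nan, np.nan, np.nan, "Amy"],
        "Brian": [np.nan, "Brian", np.nan, np.nan, np.nan],
        "Cat": [np.nan, np.nan, np.nan, "Cat", np.nan],
    }
)

names = df.iloc[:, 1:]  # every column except Student_ID

# Position of the first non-NaN cell in each row. For a row that is all
# NaN, argmax returns 0, and the cell picked there is itself NaN.
col_pos = names.notna().to_numpy().argmax(axis=1)

# Pick one cell per row: (row 0, col_pos[0]), (row 1, col_pos[1]), ...
picked = names.to_numpy(dtype=object)[np.arange(len(names)), col_pos]

looked_up = pd.DataFrame({"Student_ID": df["Student_ID"], "Name": picked})
print(looked_up)
```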