I have the following DataFrame:
df=pd.DataFrame({'seq':[0,1,2,3,4,5], 'location':['cal','cal','cal','il','il','il'],'lat':[29,29.1,28.2,15.2,15.6,14], 'lon':[-95,-98,-95.6,-88, -87.5,-88.9], 'name': ['mike', 'john', 'tyler', 'rob', 'ashley', 'john']})
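I would like to know whether there is a way to insert a new row at the beginning of the DataFrame even when some fields may be missing from the new row.
I searched SO and found a related link: add a row at top in pandas dataframe
However, my case is different because I do not have values for all the fields in the row I want to insert. The following link addresses the same problem, but in R: Inserting rows into data frame when values missing in category
How can I insert the following row into the df above? {'location': 'warehouse', 'lat': 22, 'lon': -50}
My desired output is as follows: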
   seq   location   lat    lon    name
0  NaN  warehouse  22.0  -50.0     NaN
1  0.0        cal  29.0  -95.0    mike
2  1.0        cal  29.1  -98.0    john
3  2.0        cal  28.2  -95.6   tyler
4  3.0         il  15.2  -88.0     rob
5  4.0         il  15.6  -87.5  ashley
6  5.0         il  14.0  -88.9    john
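My actual DataFrame has a very large number of columns, so inserting np.nan for every column by hand is not feasible. I am looking for a way to specify only the fields I have, along with their values, and have the remaining fields filled with NaN.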
Answer 0 (score: 3)
Try this:
import pandas as pd
import numpy as np
df=pd.DataFrame({'seq':[0,1,2,3,4,5], 'location':['cal','cal','cal','il','il','il'],'lat':[29,29.1,28.2,15.2,15.6,14], 'lon':[-95,-98,-95.6,-88, -87.5,-88.9], 'name': ['mike', 'john', 'tyler', 'rob', 'ashley', 'john']})
df_new1 = pd.DataFrame({'location' : ['warehouse'], 'lat': [22], 'lon': [-50]}) # sample data row1
df = pd.concat([df_new1, df], sort=False).reset_index(drop = True)
print(df)
df_new2 = pd.DataFrame({'location' : ['abc'], 'lat': [28], 'name': ['abcd']}) # sample data row2
df = pd.concat([df_new2, df], sort=False).reset_index(drop = True)
print(df)
Output:
    lat   location   lon    name  seq
0  22.0  warehouse -50.0     NaN  NaN
1  29.0        cal -95.0    mike  0.0
2  29.1        cal -98.0    john  1.0
3  28.2        cal -95.6   tyler  2.0
4  15.2         il -88.0     rob  3.0
5  15.6         il -87.5  ashley  4.0
6  14.0         il -88.9    john  5.0
    lat   location    name   lon  seq
0  28.0        abc    abcd   NaN  NaN
1  22.0  warehouse     NaN -50.0  NaN
2  29.0        cal    mike -95.0  0.0
3  29.1        cal    john -98.0  1.0
4  28.2        cal   tyler -95.6  2.0
5  15.2         il     rob -88.0  3.0
6  15.6         il  ashley -87.5  4.0
7  14.0         il    john -88.9  5.0
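If the original column order matters, the concat approach can be wrapped in a small helper. This is only a sketch: `prepend_row` is a hypothetical name, not a pandas function, and it assumes the partial row is a plain dict whose keys are all existing column names (any key that is not a column of `df` would be dropped by the final `reindex`).

```python
import pandas as pd


def prepend_row(df, row):
    """Return a new DataFrame with `row` (dict of column -> value) as the first row.

    Hypothetical helper for illustration; columns absent from `row` become NaN.
    """
    # A one-element list of dicts builds a single-row frame directly.
    new_row = pd.DataFrame([row])
    # ignore_index=True renumbers the result 0..n instead of repeating index 0.
    out = pd.concat([new_row, df], ignore_index=True, sort=False)
    # Restore the column order of the original frame.
    return out.reindex(columns=df.columns)


df = pd.DataFrame({
    'seq': [0, 1, 2, 3, 4, 5],
    'location': ['cal', 'cal', 'cal', 'il', 'il', 'il'],
    'lat': [29, 29.1, 28.2, 15.2, 15.6, 14],
    'lon': [-95, -98, -95.6, -88, -87.5, -88.9],
    'name': ['mike', 'john', 'tyler', 'rob', 'ashley', 'john'],
})

result = prepend_row(df, {'location': 'warehouse', 'lat': 22, 'lon': -50})
print(result)
```

Because only the supplied keys are named, this scales to frames with many columns. Note that `seq` becomes a float column, since an integer column cannot hold NaN by default.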
Answer 1 (score: 0)
You can first convert the dictionary into a dictionary of lists:
dic = {k: [v] for k, v in dic.items()}
Then:
pd.concat([pd.DataFrame(dic), df])
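For completeness, here is the dict-of-lists idea run end to end on a shortened version of the question's data. It is a minimal sketch: the name `dic_of_lists` and the use of `ignore_index=True` are additions for illustration, not part of the answer above. Without `ignore_index=True` the inserted row and the first original row would both carry index 0.

```python
import pandas as pd

# Shortened version of the question's frame, to keep the example small.
df = pd.DataFrame({
    'seq': [0, 1, 2],
    'location': ['cal', 'cal', 'il'],
    'lat': [29, 29.1, 15.2],
    'lon': [-95, -98, -88],
    'name': ['mike', 'john', 'rob'],
})

# The partial row: only the fields that are known.
dic = {'location': 'warehouse', 'lat': 22, 'lon': -50}

# Wrap each scalar in a list so DataFrame() treats it as a one-row column.
dic_of_lists = {k: [v] for k, v in dic.items()}

# Put the new row first; columns missing from it are filled with NaN.
result = pd.concat([pd.DataFrame(dic_of_lists), df], ignore_index=True)
print(result)
```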