I am trying the following:
import pandas as pd

# Police station coordinates, one row per district
PoliceStations_raw = pd.DataFrame(
    [['BAYVIEW',    37.729732, -122.397981],
     ['CENTRAL',    37.798732, -122.409919],
     ['INGLESIDE',  37.724676, -122.446215],
     ['MISSION',    37.762849, -122.422005],
     ['NORTHERN',   37.780186, -122.432467],
     ['PARK',       37.767797, -122.455287],
     ['RICHMOND',   37.779928, -122.464467],
     ['SOUTHERN',   37.772380, -122.389412],
     ['TARAVAL',    37.743733, -122.481500],
     ['TENDERLOIN', 37.783674, -122.412899]],
    columns=['PdDistrict', 'XX', 'YY'])

# Left frame whose row order should be kept
df1 = pd.DataFrame([[0, 'CENTRAL'], [1, 'TARAVAL'], [3, 'CENTRAL'], [2, 'BAYVIEW']])
df1.columns = ['Index', 'PdDistrict']
   Index PdDistrict
0      0    CENTRAL
1      1    TARAVAL
2      3    CENTRAL
3      2    BAYVIEW
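Even though I pass sort=False, the object returned by merge, while correctly joined, appears to use PdDistrict as an index and changes the row order of the original left DataFrame.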
pd.merge(df1,PoliceStations_raw,sort=False)
returns this (note that the order of PdDistrict has changed):
   Index PdDistrict         XX          YY
0      0    CENTRAL  37.798732 -122.409919
1      3    CENTRAL  37.798732 -122.409919
2      1    TARAVAL  37.743733 -122.481500
3      2    BAYVIEW  37.729732 -122.397981
Answer 0 (score: 5)
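You need to specify how the two DataFrames should be merged. By default, merge() performs an inner join. If you ask for a left join instead, the row order of df1 is preserved, so you only need to add how='left':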
>>> pd.merge(df1, PoliceStations_raw, how='left')
   Index PdDistrict         XX          YY
0      0    CENTRAL  37.798732 -122.409919
1      1    TARAVAL  37.743733 -122.481500
2      3    CENTRAL  37.798732 -122.409919
3      2    BAYVIEW  37.729732 -122.397981
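If you want to confirm the ordering in code rather than by eye, a minimal sketch along these lines may help. It uses a cut-down copy of the station table from the question, and the names `stations`, `merged` and `order_preserved` are only illustrative. Passing `on="PdDistrict"` is optional here, since it is the only shared column.

```python
import pandas as pd

# Cut-down stand-in for PoliceStations_raw (coordinates copied from the question).
stations = pd.DataFrame(
    [["BAYVIEW", 37.729732, -122.397981],
     ["CENTRAL", 37.798732, -122.409919],
     ["TARAVAL", 37.743733, -122.481500]],
    columns=["PdDistrict", "XX", "YY"],
)

# Same left frame as in the question.
df1 = pd.DataFrame(
    [[0, "CENTRAL"], [1, "TARAVAL"], [3, "CENTRAL"], [2, "BAYVIEW"]],
    columns=["Index", "PdDistrict"],
)

# Left join on the shared key column.
merged = pd.merge(df1, stations, how="left", on="PdDistrict")

# True when the rows of df1 come back in their original order.
order_preserved = merged["Index"].tolist() == df1["Index"].tolist()
print(order_preserved)
```

Each district appears only once in the station table, so the left join returns exactly one row per row of df1 and the comparison is a fair one.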
Also, sort=False is the default for merge(), so you do not need to pass it explicitly.
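To check the default yourself, and to see what the opposite setting does, something like the following rough sketch should work. The frames `left` and `right` are small made-up stand-ins for the ones in the question, and reading the default through `inspect.signature` is just one convenient way to look it up.

```python
import inspect

import pandas as pd

# Read the default value of the sort parameter from merge's signature.
default_sort = inspect.signature(pd.merge).parameters["sort"].default
print(default_sort)

# Small stand-ins for df1 and the station table.
left = pd.DataFrame(
    {"Index": [0, 1, 3, 2],
     "PdDistrict": ["CENTRAL", "TARAVAL", "CENTRAL", "BAYVIEW"]}
)
right = pd.DataFrame(
    {"PdDistrict": ["BAYVIEW", "CENTRAL", "TARAVAL"],
     "XX": [37.729732, 37.798732, 37.743733]}
)

# With sort=True the result is ordered by the join key instead of by left row order.
sorted_merge = pd.merge(left, right, how="left", on="PdDistrict", sort=True)
sorted_keys = sorted_merge["PdDistrict"].tolist()
print(sorted_keys)
```

So sort only controls whether the join keys are sorted in the result, while how decides which rows are kept.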