How can I match the values in this DataFrame, source:
car_id lat lon
0 100 10.0 15.0
1 100 12.0 10.0
2 100 09.0 08.0
3 110 23.0 12.0
4 110 18.0 32.0
5 110 21.0 16.0
5 110 12.0 02.0
and keep only the rows whose coordinates appear in a second DataFrame, coords:
lat lon
0 12.0 10.0
1 23.0 12.0
3 18.0 32.0
so that the resulting DataFrame, result, is:
car_id lat lon
1 100 12.0 10.0
3 110 23.0 12.0
4 110 18.0 32.0
I can do this iteratively with apply, but I am looking for a vectorized way. I tried the following with isin(), without success:
result = source[source[['lat', 'lon']].isin({
'lat': coords['lat'],
'lon': coords['lon']
})]
The above returns:
ValueError: ('operands could not be broadcast together with shapes (53103,) (53103,2)
Answer 0 (score: 3)
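By default, DataFrame.merge() joins on all columns that share the same name (the intersection of the two DataFrames' columns), here lat and lon: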
In [197]: source.merge(coords)
Out[197]:
car_id lat lon
0 100 12.0 10.0
1 110 23.0 12.0
2 110 18.0 32.0
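Note that the merged output is renumbered 0, 1, 2, whereas the result requested in the question keeps the original labels 1, 3, 4. A minimal sketch of one way to keep those labels, assuming the index of source is unnamed (so that reset_index() creates a column called "index") and that the data matches the sample in the question. The drop_duplicates() call is an added precaution, not part of the answer above: it prevents repeated rows in coords from duplicating matches.

```python
import pandas as pd

# Sample data copied from the question (note the repeated index label 5).
source = pd.DataFrame(
    {
        "car_id": [100, 100, 100, 110, 110, 110, 110],
        "lat": [10.0, 12.0, 9.0, 23.0, 18.0, 21.0, 12.0],
        "lon": [15.0, 10.0, 8.0, 12.0, 32.0, 16.0, 2.0],
    },
    index=[0, 1, 2, 3, 4, 5, 5],
)
coords = pd.DataFrame(
    {"lat": [12.0, 23.0, 18.0], "lon": [10.0, 12.0, 32.0]},
    index=[0, 1, 3],
)

result = (
    source.reset_index()  # original labels move into a column named "index"
    .merge(coords.drop_duplicates(), on=["lat", "lon"])  # inner join on both columns
    .set_index("index")  # restore the original labels
)
result.index.name = None  # drop the leftover index name
print(result)
```

Passing on=["lat", "lon"] explicitly gives the same join as the default here, but it keeps working if source and coords later gain other columns with matching names.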
Answer 1 (score: 3)
a = source.values
b = coords.values
out = source[(a[:,1:]==b[:,None]).all(-1).any(0)]
Sample run:
In [74]: source
Out[74]:
car_id lat lon
0 100 10.0 15.0
1 100 12.0 10.0
2 100 9.0 8.0
3 110 23.0 12.0
4 110 18.0 32.0
5 110 21.0 16.0
5 110 12.0 2.0
In [75]: coords
Out[75]:
lat lon
0 12.0 10.0
1 23.0 12.0
3 18.0 32.0
In [76]: a = source.values
...: b = coords.values
...:
In [77]: source[(a[:,1:]==b[:,None]).all(-1).any(0)]
Out[77]:
car_id lat lon
1 100 12.0 10.0
3 110 23.0 12.0
4 110 18.0 32.0
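The broadcasting approach above relies on lat and lon being the second and third columns of source. A self-contained sketch of the same idea that selects the columns by name instead, assuming the sample data from the question; the variable name mask is introduced here for illustration only.

```python
import pandas as pd

# Sample data copied from the question (note the repeated index label 5).
source = pd.DataFrame(
    {
        "car_id": [100, 100, 100, 110, 110, 110, 110],
        "lat": [10.0, 12.0, 9.0, 23.0, 18.0, 21.0, 12.0],
        "lon": [15.0, 10.0, 8.0, 12.0, 32.0, 16.0, 2.0],
    },
    index=[0, 1, 2, 3, 4, 5, 5],
)
coords = pd.DataFrame(
    {"lat": [12.0, 23.0, 18.0], "lon": [10.0, 12.0, 32.0]},
    index=[0, 1, 3],
)

a = source[["lat", "lon"]].to_numpy()  # shape (7, 2)
b = coords[["lat", "lon"]].to_numpy()  # shape (3, 2)

# b[:, None] has shape (3, 1, 2), so the comparison broadcasts to (3, 7, 2):
# every coords row is compared against every source row.
# all(-1): both lat and lon must match; any(0): a match with any coords row counts.
mask = (a == b[:, None]).all(-1).any(0)

result = source[mask]
print(result)
```

The intermediate boolean array has shape (len(coords), len(source), 2), so memory use grows with the product of the two row counts. The comparison is exact floating-point equality, which works for values that were stored identically but may miss coordinates that differ only by rounding.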