Intersection of two DataFrames without iterating

Time: 2019-04-12 15:24:32

Tags: python pandas

I have two DataFrames.

The first:

    Latitude    Longitude   Area
0   -25.66026   28.0914     HappyPlace
1   -25.67923   28.10525    SadPlace
2   -30.68456   19.21694    AveragePlace
3   -30.12345   22.34256    CoolPlace
4   -15.12546   17.12365    BadPlace

The second:

    Latitude    Longitude   Population
0   -25.66026   28.0914     5000
1   -25.14568   28.10525    1750
2   -30.68456   19.21694    6000
3   -30.65375   22.34256    8000
4   -15.90458   17.12365    5600

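I want to find the places that have the same latitude/longitude in both DataFrames, so that I know their population. Most importantly, I only need the intersection for my real project.

Desired df: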

    Latitude    Longitude   Area
0   -25.66026   28.0914     HappyPlace
2   -30.68456   19.21694    AveragePlace

I tried:

pd.merge(df1, df2, on=['LATITUDE'], how='inner')

This does not work; it returns a strange df.

set(df1['LATITUDE']).intersection(set(df2['LATITUDE']))
df1[(df1['LATITUDE'] == df2['LATITUDE'])]
df1.where(df1.LATITUDE == df2.LATITUDE)

All of these return ValueError: Can only compare identically-labeled Series objects

(The actual DataFrames are large, and both columns are floats.)
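For context, the ValueError quoted above is what pandas raises when two Series with different index labels are compared element-wise, which is the likely situation when the two real DataFrames differ in length. The sketch below uses small made-up Series (`a` and `b` are hypothetical names, not from the question) to reproduce the error and to show `Series.isin`, which tests membership by value and ignores the labels.

```python
import pandas as pd

# Hypothetical data: two latitude columns of different length.
a = pd.Series([-25.66026, -25.67923, -30.68456], index=[0, 1, 2])
b = pd.Series([-25.66026, -30.68456], index=[0, 1])

# Element-wise == requires both Series to carry identical labels.
try:
    a == b
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

# isin() compares values only, so differing labels do not matter.
mask = a.isin(b)
print(a[mask])
```

The mask keeps the original index of `a`, which matches the shape of the desired result shown in the question.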

1 answer:

Answer 0 (score: 1)

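pd.merge() fails with a KeyError because LATITUDE is the wrong key: column names are case-sensitive, and the column shown in the question is called Latitude.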
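A minimal sketch of that failure, assuming two tiny frames whose column is spelled `Latitude` as in the question's tables (the frames here are cut down for illustration):

```python
import pandas as pd

# Cut-down illustration data; column names follow the question's tables.
df1 = pd.DataFrame({"Latitude": [-25.66026, -25.67923],
                    "Area": ["HappyPlace", "SadPlace"]})
df2 = pd.DataFrame({"Latitude": [-25.66026, -25.14568],
                    "Population": [5000, 1750]})

# The upper-case key does not exist in either frame, so merge raises KeyError.
try:
    pd.merge(df1, df2, on=["LATITUDE"], how="inner")
    raised = False
except KeyError as exc:
    raised = True
    print("KeyError:", exc)

# With the exact column name the inner join succeeds.
merged = pd.merge(df1, df2, on=["Latitude"], how="inner")
print(merged)
```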

The following MCVE works.

import pandas as pd
print(pd.__version__)

df1_string = """-25.66026   28.0914     HappyPlace
-25.67923   28.10525    SadPlace
-30.68456   19.21694    AveragePlace
-30.12345   22.34256    CoolPlace
-15.12546   17.12365    BadPlace"""

df2_string = """-25.66026   28.0914     5000
-25.14568   28.10525    1750
-30.68456   19.21694    6000
-30.65375   22.34256    8000
-15.90458   17.12365    5600"""

# str.split() yields strings, so every column here has dtype object, not float.
# df1's third column holds area names; it is labeled 'Population' here,
# which is why the merged output shows Population_x and Population_y.
df1 = pd.DataFrame([x.split() for x in df1_string.split('\n')], columns=['Latitude', 'Longitude', 'Population'])
df2 = pd.DataFrame([x.split() for x in df2_string.split('\n')], columns=['Latitude', 'Longitude', 'Population'])
# The key must match the column name exactly ('Latitude', not 'LATITUDE').
result = pd.merge(df1, df2, on=['Latitude'], how='inner')
print(set(df1['Latitude']).intersection(set(df2['Latitude'])))
# The element-wise comparisons below work only because df1 and df2 share the same index (0 to 4).
print(df1[(df1['Latitude'] == df2['Latitude'])])
print(df1.where(df1.Latitude == df2.Latitude))

print(result)

which produces

0.24.2
{'-25.66026', '-30.68456'}
    Latitude Longitude    Population
0  -25.66026   28.0914    HappyPlace
2  -30.68456  19.21694  AveragePlace
    Latitude Longitude    Population
0  -25.66026   28.0914    HappyPlace
1        NaN       NaN           NaN
2  -30.68456  19.21694  AveragePlace
3        NaN       NaN           NaN
4        NaN       NaN           NaN
    Latitude Longitude_x  Population_x Longitude_y Population_y
0  -25.66026     28.0914    HappyPlace     28.0914         5000
1  -30.68456    19.21694  AveragePlace    19.21694         6000
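
Since the question asks for rows where both latitude and longitude agree, one possible variation is to join on both columns. The sketch below is not part of the original answer: it rebuilds the question's tables with float columns and shows two options, an inner merge on both keys (which adds Population) and a boolean mask built from a MultiIndex (which keeps df1's original index, as in the desired output). It relies on exact float equality, so it assumes the coordinates in both frames are stored identically.

```python
import pandas as pd

# The question's two tables, with float coordinates.
df1 = pd.DataFrame({
    "Latitude": [-25.66026, -25.67923, -30.68456, -30.12345, -15.12546],
    "Longitude": [28.0914, 28.10525, 19.21694, 22.34256, 17.12365],
    "Area": ["HappyPlace", "SadPlace", "AveragePlace", "CoolPlace", "BadPlace"],
})
df2 = pd.DataFrame({
    "Latitude": [-25.66026, -25.14568, -30.68456, -30.65375, -15.90458],
    "Longitude": [28.0914, 28.10525, 19.21694, 22.34256, 17.12365],
    "Population": [5000, 1750, 6000, 8000, 5600],
})

keys = ["Latitude", "Longitude"]

# Option 1: inner join on both keys; the result gets a fresh 0..n-1 index.
both = pd.merge(df1, df2, on=keys, how="inner")
print(both)

# Option 2: keep only the df1 rows whose (Latitude, Longitude) pair
# also occurs in df2, preserving df1's original index labels.
pairs1 = pd.MultiIndex.from_frame(df1[keys])
pairs2 = pd.MultiIndex.from_frame(df2[keys])
kept = df1[pairs1.isin(pairs2)]
print(kept)
```

Neither option iterates over rows in Python. `MultiIndex.from_frame` needs pandas 0.24 or newer, which matches the version printed in the answer.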