Question

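I have two dataframes. When the 'CommonName' column of the first dataframe matches the 'Name' column of the second, I want to append the 'lat' and 'lon' column values to the second dataframe.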

The first dataframe is 'AllBusStops' and takes the following form:

AllBusStops = {'CommonName': ['Cambuslang Road', 'Hillsborough Road'],'lon': [-4.17351, -4.12914], 'lat': [55.82932, 55.85388]}

The second dataframe is 'SixtyOne' and takes the following form:

SixtyOne = {'Name': ['Canonbie Street', 'Hillsborough Road']}

So in the example above, the 'lat' and 'lon' values from the AllBusStops dataframe would be appended to the SixtyOne dataframe for Hillsborough Road.

So far the code looks like this:

for i in range(len(AllBusStops)):
   for j in range(len(SixtyOne)):
        if AllBusStops[['CommonName']][i] == SixtyOne[['Name']][j]:
           Lat = AllBusStops[['Lat']][i]
           Lon = AllBusStops[['Lon']][i]

When I run this, I get the following message:

KeyError: 0

During handling of the above exception, another exception occurred

Answer 1

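I'm not entirely clear on the structure of your data, but it sounds like you want to merge the data from the two dataframes. See the DataFrame.merge function.

This code returns a dataframe like SixtyOne with the 'Lat' and 'Lon' columns added.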

# The value of the 'how' parameter depends on your needs; 
# see documentation for 'merge'

combined = SixtyOne.merge(AllBusStops[['CommonName', 'Lat', 'Lon']],
                          left_on='Name',
                          right_on='CommonName',
                          how='left')

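As noted in the code comment, you should read up on the how parameter that merge takes; if it is confusing, you can search online for phrases such as "SQL left outer join".

The code above uses a left join, which differs slightly from your snippet. But I suspect you actually want a left join in this case, for example so that you can see which records in SixtyOne have no latitude and longitude values after the merge.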
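To make the difference between the join types concrete, here is a minimal sketch built from the dictionaries in the question. It assumes the lowercase 'lat' and 'lon' column names from that sample data (rather than 'Lat' and 'Lon'), and it uses merge's indicator option to flag the rows that found no match; the variable names left and inner are only illustrative.

```python
import pandas as pd

# Sample frames built from the dictionaries in the question
# (assumption: column names are lowercase 'lat'/'lon', as in that sample).
AllBusStops = pd.DataFrame({
    'CommonName': ['Cambuslang Road', 'Hillsborough Road'],
    'lon': [-4.17351, -4.12914],
    'lat': [55.82932, 55.85388],
})
SixtyOne = pd.DataFrame({'Name': ['Canonbie Street', 'Hillsborough Road']})

# Left join: keep every row of SixtyOne; unmatched rows get NaN coordinates.
# indicator=True adds a '_merge' column saying where each row came from.
left = SixtyOne.merge(AllBusStops, left_on='Name', right_on='CommonName',
                      how='left', indicator=True)

# Inner join: keep only the stops that appear in both frames.
inner = SixtyOne.merge(AllBusStops, left_on='Name', right_on='CommonName',
                       how='inner')

print(left[['Name', 'lat', 'lon', '_merge']])
print(inner[['Name', 'lat', 'lon']])
```

With the left join, Canonbie Street stays in the result with missing coordinates; with the inner join it is dropped, which is closer to what the nested loop in the question would collect.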

Answer 2

Suppose your DataFrame looks like this:

import pandas as pd

d = {'one' : pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
     'two' : pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)

>>> df
   one  two
a  1.0  1.0
b  2.0  2.0
c  3.0  3.0
d  NaN  4.0
>>>

When you access a column like this (AllBusStops[['CommonName']]), it produces a DataFrame (you probably wanted a Series).

>>> z = df[['one']]
>>> type(z)
<class 'pandas.core.frame.DataFrame'>
>>> z
   one
a  1.0
b  2.0
c  3.0
d  NaN
>>>

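You then try to get the first item with an integer index (AllBusStops[['CommonName']][i]), which raises a KeyError: indexing a DataFrame with [] expects a column label, and there is no column named 0.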

>>> z[0]
Traceback (most recent call last):
  File "C:\Python36\lib\site-packages\pandas\core\indexes\base.py", line 2442, in get_loc
...
KeyError: 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<pyshell#288>", line 1, in <module>
    z[0]
...
KeyError: 0
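The same behaviour can be reproduced in a short script. This is only a sketch on a cut-down version of the frame above: it shows that an integer key is looked up as a column label, and that .iloc is the positional alternative. The names raised, first_row and first_value are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'one': [1.0, 2.0, 3.0]}, index=['a', 'b', 'c'])

z = df[['one']]          # a list of labels selects a one-column DataFrame

try:
    z[0]                 # looked up as a column label; no column is named 0
    raised = False
except KeyError:
    raised = True

first_row = z.iloc[0]        # first row by position (returned as a Series)
first_value = z.iloc[0, 0]   # first row, first column by position (a scalar)

print(raised, first_value)
```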

You can retrieve a Series by accessing the column like this:

>>> q = df['one']
>>> type(q)
<class 'pandas.core.series.Series'>
>>> q
a    1.0
b    2.0
c    3.0
d    NaN
Name: one, dtype: float64
>>>

Then retrieve the first item in the Series:

>>> q[0]
1.0
>>>
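Applied to the frames in the question, the loop could be repaired roughly as follows. This is a sketch under two assumptions: the columns are named 'lat' and 'lon' as in the question's sample data, and positional access goes through .iloc, since relying on q[0] to mean "first item" on a Series with a non-integer index is, as far as I know, deprecated in recent pandas versions. The matches list and the matched frame are illustrative names.

```python
import pandas as pd

AllBusStops = pd.DataFrame({
    'CommonName': ['Cambuslang Road', 'Hillsborough Road'],
    'lon': [-4.17351, -4.12914],
    'lat': [55.82932, 55.85388],
})
SixtyOne = pd.DataFrame({'Name': ['Canonbie Street', 'Hillsborough Road']})

names = AllBusStops['CommonName']   # a single label selects a Series

matches = []
for i in range(len(AllBusStops)):
    for j in range(len(SixtyOne)):
        # .iloc picks items by position, independent of the index labels
        if names.iloc[i] == SixtyOne['Name'].iloc[j]:
            matches.append({
                'Name': SixtyOne['Name'].iloc[j],
                'lat': AllBusStops['lat'].iloc[i],
                'lon': AllBusStops['lon'].iloc[i],
            })

matched = pd.DataFrame(matches)
print(matched)
```

This works for small frames, but the nested loop compares every pair of rows, so a merge is usually the better fit for this kind of lookup.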

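Note that I created a simple DataFrame and tried to mimic the steps of your program to see whether I could reproduce the problem. This is a Minimal, Complete, and Verifiable Example (mcve); you should read about it. Sometimes making an mcve for yourself (or for posting here) will solve the problem for you or make it easier to figure out on your own.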
How to debug small programs

As @NichloasM's answer mentions, you may want to explore merging/joining your data. pandas has excellent documentation: Merge, Join, and concatenate

Python: search two dataframes and append to a new dataframe when the data matches
