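I am trying to merge two DataFrames with pd.merge. If a company name exists in both DataFrames, I want to add the 'Phone' column from the second DataFrame. Every time I run the code I get KeyError: 'Company', even though that is the name of the column I am merging on.
Things I have tried: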
pd.merge(northstar,julie['Phone'], on='Company')
northstar.merge(julie['Phone'], on='Company')
pd.merge(northstar['Company','Title','First Name','Last Name','Address1','Address2','City','State','Zip','Mail Stop','Industry','Service1','Service2','Service3','Service4','P2','Fax or Ext.','Email','Secondary Contact','Secondary Email','Call Appt','Notes'],julie['Phone'],on='Company')
I get KeyError: 'Company' every time.
import pandas as pd
northstar= pd.read_csv('/home/amypeiper/Downloads/northstar_dw_2018_q12019.csv')
Company Title ... Call Appt Notes
0 24M TECHNOLOGIES MECH ENGINEER ... NaN NaN
1 3D SYSTEMS INC COMMODITY MGR ... NaN NaN
2 3M ENG ... NaN NaN
3 A & E INC PROD ENGR ... NaN NaN
4 A. W. CHESTERTON COMPANY PROCUREMENT MGR ... NaN NaN
5 ABB SR MFG ENGINEER ... NaN NaN
6 ABBOTT LABORATORIES CALIBRATION ENGR ... NaN NaN
7 ABBOTT MACHINE CO BUYER/DRAFTSMAN ... NaN NaN
julie= pd.read_csv('/home/amypeiper/Downloads/from_julie.csv')
[1457 rows x 24 columns]
company Title ... Service4 Priority
0 24M TECHNOLOGIES MECH ENGINEER ... 99 NaN
1 3M ENG ... 95 NaN
2 4D DESIGN LLC DESIGN ENGINEER ... 37 NaN
3 A & E INC PROD ENGR ... 41 NaN
4 ABB SR MFG ENGINEER ... 52 NaN
northstar['Company'].isin(julie['Company']).value_counts()
result = pd.merge(northstar['Company','Title','First Name','Last Name','Address1',
'Address2','City','State','Zip','Mail Stop','Industry','Service1','Service2','Service3','Service4',
'P2','Fax or Ext.','Email','Secondary Contact','Secondary Email','Call Appt','Notes'],julie['Phone'],on='Company')
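I expect to get a DataFrame named result that contains all columns from the northstar DataFrame plus the 'Phone' column from the julie DataFrame. I keep getting the same error: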
Traceback (most recent call last):
File "<ipython-input-11-e230d033a0e2>", line 8, in <module>
northstar['Company'].isin(julie['Company']).value_counts()
File "/home/amypeiper/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 2927, in __getitem__
indexer = self.columns.get_loc(key)
File "/home/amypeiper/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2659, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Company'
Answer 0 (score: 0)
julie['Phone'] gives you only the Phone column of julie. You also need to include the Company column:
pd.merge(northstar, julie[['Phone', 'Company']], on='Company')
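A minimal sketch of why the key column has to be part of the right-hand selection. The two small frames below are invented stand-ins for northstar and julie (the phone numbers are made up for illustration), not the real CSV data.

```python
import pandas as pd

# Toy stand-ins for the real CSV files; all values are invented.
northstar = pd.DataFrame({
    "Company": ["24M TECHNOLOGIES", "3D SYSTEMS INC", "3M"],
    "Title": ["MECH ENGINEER", "COMMODITY MGR", "ENG"],
})
julie = pd.DataFrame({
    "Company": ["24M TECHNOLOGIES", "3M", "4D DESIGN LLC"],
    "Phone": ["555-0100", "555-0101", "555-0102"],
})

# Single brackets return a Series holding only the Phone values,
# so there is no Company key left to merge on.
phone_only = julie["Phone"]
print(type(phone_only).__name__)  # Series

# Double brackets return a DataFrame that keeps the key column.
right = julie[["Company", "Phone"]]
merged = pd.merge(northstar, right, on="Company")
print(merged)
```

With the default inner join, only the companies found in both toy frames survive the merge.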
Answer 1 (score: 0)
If you want to merge on the Company column, both DataFrames involved must have this column.
So the merge should look something like:
pd.merge(northstar[['Company', ...]], julie[['Company', 'Phone']], on='Company',
         how='left')
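Note that the default merge mode (how) is inner. Change it to how='left' if you want to keep every row of northstar; otherwise the result will contain only the companies present in both DataFrames.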
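The difference between the two modes can be sketched on a pair of small frames. The data here is invented for illustration; only the `how` parameter of `pd.merge` is the point.

```python
import pandas as pd

# Invented toy data: 'ABB' and '3M' appear in both frames,
# '3D SYSTEMS INC' only on the left, '4D DESIGN LLC' only on the right.
left = pd.DataFrame({
    "Company": ["3D SYSTEMS INC", "3M", "ABB"],
    "Title": ["COMMODITY MGR", "ENG", "SR MFG ENGINEER"],
})
right = pd.DataFrame({
    "Company": ["3M", "ABB", "4D DESIGN LLC"],
    "Phone": ["555-0101", "555-0103", "555-0102"],
})

# Default mode: only companies present in both frames are kept.
inner = pd.merge(left, right, on="Company")

# Left mode: every row of the left frame is kept; Phone is NaN where
# the right frame has no matching company.
left_join = pd.merge(left, right, on="Company", how="left")

print(len(inner))                        # 2
print(len(left_join))                    # 3
print(left_join["Phone"].isna().sum())   # 1
```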
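If you want the result to contain all columns from northstar, you can omit the column list for that DataFrame. On the other hand, I see that your code includes a column list (it seems you want to leave out some of the existing columns).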
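A sketch of the variant without a column list on the northstar side, again on invented toy data. It also includes one check that is an assumption rather than something stated in either answer: the printed julie frame in the question appears to show its first column as `company` in lower case. If that is the real header, `julie['Company']` would raise this KeyError on its own, and the rename step below would be needed; if the header is already `Company`, that step can be dropped.

```python
import pandas as pd

# Invented toy data standing in for the real CSV files.
northstar = pd.DataFrame({
    "Company": ["24M TECHNOLOGIES", "3D SYSTEMS INC"],
    "Title": ["MECH ENGINEER", "COMMODITY MGR"],
    "Notes": [None, None],
})
# Assumption: the key is spelled 'company' here, as the printed output
# in the question suggests.
julie = pd.DataFrame({
    "company": ["24M TECHNOLOGIES", "4D DESIGN LLC"],
    "Phone": ["555-0100", "555-0102"],
})

# Inspect the real header names before merging.
print(julie.columns.tolist())

# Give the key the same name in both frames.
julie = julie.rename(columns={"company": "Company"})

# Passing northstar whole keeps all of its columns; only julie is narrowed
# to the key plus the column to add.
result = northstar.merge(julie[["Company", "Phone"]], on="Company", how="left")
print(result.columns.tolist())
```

An alternative to renaming is to pass `left_on='Company', right_on='company'` to the merge, at the cost of carrying both key columns in the result.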