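I am trying to concatenate new observations onto an existing DataFrame. I get a result that I believe is correct, but the system still comes back with "ValueError: Can only compare identically-labeled DataFrame objects". Can anyone tell me why I get this ValueError even though my result looks right?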
Here is the problem:
Suppose the DataFrame Employee is as follows:
      Department      Title  Year Education Sex
Name
Bob           IT    analyst     1  Bachelor   M
Sam        Trade  associate     3       PHD   M
Peter         HR         VP     8    Master   M
Jake          IT    analyst     2    Master   M
and another DataFrame, new_observations, is:
         Department Education Sex      Title  Year
Mary             IT             F         VP   9.0
Amy               ?       PHD   F  associate   5.0
Jennifer      Trade    Master   F  associate   NaN
John             HR    Master   M    analyst   2.0
Judy             HR  Bachelor   F    analyst   2.0
Update Employee with these new observations.
Here is my code:
import pandas as pd

Employee = pd.DataFrame({"Name": ["Bob", "Sam", "Peter", "Jake"],
                         "Education": ["Bachelor", "PHD", "Master", "Master"],
                         "Sex": ["M", "M", "M", "M"],
                         "Year": [1, 3, 8, 2],
                         "Department": ["IT", "Trade", "HR", "IT"],
                         "Title": ["analyst", "associate", "VP", "analyst"]})
Employee = Employee.set_index('Name')

new_observations = pd.DataFrame({
    "Name": ["Mary", "Amy", "Jennifer", "John", "Judy"],
    "Department": ["IT", "?", "Trade", "HR", "HR"],
    "Education": ["", "PHD", "Master", "Master", "Bachelor"],
    "Sex": ["F", "F", "F", "M", "F"],
    "Title": ["VP", "associate", "associate", "analyst", "analyst"],
    "Year": [9.0, 5.0, "NaN", 2.0, 2.0]},
    columns=["Name", "Department", "Education", "Sex", "Title", "Year"])
new_observations = new_observations.set_index('Name')

Employee = Employee.append(new_observations, sort=False)
Here is my result:
I also tried:
Employee = pd.concat([Employee, new_observations], axis = 1, sort=False)
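Answer 0 (score: 0)
Use axis=0 with pd.concat so that the new rows are stacked below the existing ones. This is the default, so you do not need to pass axis at all: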
pd.concat([Employee, new_observations], sort=False)
Output:
         Education Sex Year Department      Title
Name
Bob       Bachelor   M    1         IT    analyst
Sam            PHD   M    3      Trade  associate
Peter       Master   M    8         HR         VP
Jake        Master   M    2         IT    analyst
Mary                 F    9         IT         VP
Amy            PHD   F    5          ?  associate
Jennifer    Master   F  NaN      Trade  associate
John        Master   M    2         HR    analyst
Judy      Bachelor   F    2         HR    analyst
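
As a rough sketch of what may be going on: axis=1 places the two frames side by side instead of stacking rows, and the == operator on two DataFrames raises a ValueError whenever the index or column labels differ, including when they are merely in a different order. The grader's check is not shown in the question, so the idea that it compares with == is an assumption; the small frames and the `expected` frame below are hypothetical stand-ins, not the real data. Note that the output above has the columns ordered Education, Sex, Year, Department, Title, while the table in the problem statement uses Department, Title, Year, Education, Sex.

```python
import pandas as pd

# Small stand-ins for the frames in the question (values abbreviated).
employee = pd.DataFrame(
    {"Department": ["IT", "Trade"], "Year": [1, 3]},
    index=pd.Index(["Bob", "Sam"], name="Name"),
)
new_observations = pd.DataFrame(
    {"Department": ["IT", "HR"], "Year": [9.0, 2.0]},
    index=pd.Index(["Mary", "John"], name="Name"),
)

# axis=0 (the default) stacks rows; axis=1 places the frames side by side.
stacked = pd.concat([employee, new_observations], sort=False)
side_by_side = pd.concat([employee, new_observations], axis=1, sort=False)
print(stacked.shape)
print(side_by_side.shape)

# Hypothetical "expected" frame: same data, different column order.
expected = stacked[["Year", "Department"]]

# == requires identical labels in identical order, otherwise it raises.
try:
    stacked == expected
    labels_match = True
except ValueError:
    labels_match = False
print(labels_match)

# Reordering the columns to match the expected frame makes them comparable.
aligned = stacked[expected.columns]
same = aligned.equals(expected)
print(same)
```

If the grader does compare this way, reordering the columns of the result to match the table in the problem statement would be one thing to try. It may also be worth checking whether the missing values ("" for Mary's Education and the string "NaN" for Jennifer's Year) should be real missing values rather than strings.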