Missing data in Pandas merge

Time: 2017-10-17 21:40:15

Tags: python pandas

I am simply merging two DataFrames on a common column.


When I do this, I lose data. How can I avoid that?

1 Answer:

Answer 0 (score: 3)

pd.merge(df1,df2,on='accountname',how='left')

pd.merge(df1,df2,on='accountname',how='inner')
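
To illustrate the difference between the two calls above, here is a minimal sketch. The sample frames and the `email` and `ip` columns are hypothetical, not the asker's data. It also passes `indicator=True`, which adds a `_merge` column showing which rows found a match.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration only.
df1 = pd.DataFrame({
    "accountname": ["a", "b", "c"],
    "email": ["a@x.com", "b@x.com", "c@x.com"],
})
df2 = pd.DataFrame({
    "accountname": ["a", "b"],
    "ip": ["1.1.1.1", "2.2.2.2"],
})

# how='left' keeps every row of df1; unmatched rows get NaN in df2's columns.
left = pd.merge(df1, df2, on="accountname", how="left", indicator=True)

# how='inner' keeps only rows whose key appears in both frames.
inner = pd.merge(df1, df2, on="accountname", how="inner")

print(left)
print(inner)
```

Rows marked `left_only` in `_merge` are the ones an inner merge would drop.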

Edit: Let's look at your sample data. You are merging a str column with an int column, which is why every merged value is NaN.

df1.applymap(type)
Out[96]: 
           email        account
0  <class 'str'>  <class 'str'>
1  <class 'str'>  <class 'str'>
2  <class 'str'>  <class 'str'>
3  <class 'str'>  <class 'str'>
df2.applymap(type)
Out[97]: 
               account
ip                    
1.1.1.1  <class 'int'>
2.2.2.2  <class 'int'>
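
A quicker way to spot this kind of mismatch is to compare the dtypes of the key columns before merging. The sketch below rebuilds frames shaped like the ones shown above. The exact values are assumptions inferred from the printed output, and `keys_match` is a name made up for this example.

```python
import pandas as pd

# Frames reconstructed from the output above; the values are assumptions.
df1 = pd.DataFrame({
    "email": ["555@i555.com", "666@666.com", "777@666.com", "888@666.com"],
    "account": ["555", "666", "Nan", "999"],  # strings
})
df2 = pd.DataFrame(
    {"account": [555, 666]},  # integers
    index=pd.Index(["1.1.1.1", "2.2.2.2"], name="ip"),
)

left_dtype = df1["account"].dtype
right_dtype = df2["account"].dtype
print(left_dtype, right_dtype)

# If the key dtypes differ, the keys cannot match as they are.
keys_match = left_dtype == right_dtype
print("key dtypes match:", keys_match)
```

As far as I know, newer pandas versions refuse to merge a string key with an integer key and raise a ValueError instead of returning NaN, so the exact symptom may depend on the installed version.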

How to fix this:

Option 1

Convert the str column to numeric using pd.to_numeric:
df1.account=pd.to_numeric(df1.account,errors ='coerce')
df1.applymap(type)
Out[99]: 
           email          account
0  <class 'str'>  <class 'float'>
1  <class 'str'>  <class 'float'>
2  <class 'str'>  <class 'float'>
3  <class 'str'>  <class 'float'>

df1.merge(df2.reset_index(),on=['account'],how='left')


Out[101]: 
          email account       ip
0  555@i555.com     555  1.1.1.1
1   666@666.com     666  2.2.2.2
2   777@666.com     NaN      NaN
3   888@666.com     999      NaN
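
Put together, Option 1 can be sketched as a self-contained script. The sample values are assumptions reconstructed from the output above. Note that `errors='coerce'` turns anything that cannot be parsed as a number into NaN, so such rows stay in a left merge but cannot match.

```python
import pandas as pd

# Frames reconstructed from the output above; the values are assumptions.
df1 = pd.DataFrame({
    "email": ["555@i555.com", "666@666.com", "777@666.com", "888@666.com"],
    "account": ["555", "666", "Nan", "999"],
})
df2 = pd.DataFrame(
    {"account": [555, 666]},
    index=pd.Index(["1.1.1.1", "2.2.2.2"], name="ip"),
)

# Convert the string key to numbers; unparseable entries become NaN.
df1["account"] = pd.to_numeric(df1["account"], errors="coerce")

# reset_index() turns the 'ip' index into a regular column so it survives the merge.
merged = df1.merge(df2.reset_index(), on="account", how="left")
print(merged)
```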

Option 2

Simply convert df2.account to str (I prefer the first option, pd.to_numeric):

df2.account=df2.account.astype(str)
df1.merge(df2.reset_index(),on=['account'],how='left')
Out[105]: 
          email account       ip
0  555@i555.com     555  1.1.1.1
1   666@666.com     666  2.2.2.2
2   777@666.com     Nan      NaN
3   888@666.com     999      NaN
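
Option 2 can be sketched the same way, again with sample values assumed from the output above. Here the integer key in `df2` is cast to str so that both key columns hold strings.

```python
import pandas as pd

# Frames reconstructed from the output above; the values are assumptions.
df1 = pd.DataFrame({
    "email": ["555@i555.com", "666@666.com", "777@666.com", "888@666.com"],
    "account": ["555", "666", "Nan", "999"],
})
df2 = pd.DataFrame(
    {"account": [555, 666]},
    index=pd.Index(["1.1.1.1", "2.2.2.2"], name="ip"),
)

# Cast the integer key to str so it compares equal to df1's string key.
df2["account"] = df2["account"].astype(str)

merged = df1.merge(df2.reset_index(), on="account", how="left")
print(merged)
```

One caveat with this approach: string keys must match exactly, so a float key such as 555.0 would become '555.0' under `astype(str)` and would not match '555'.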