I am simply merging two DataFrames on a common column.
In this case I lose data. How can I avoid that?
Answer 0 (score: 3)
pd.merge(df1,df2,on='accountname',how='left')
or
pd.merge(df1,df2,on='accountname',how='inner')
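To make the difference between the two calls concrete, here is a minimal sketch. The frames `left` and `right` and their values are hypothetical, made up for illustration; only the column name `accountname` comes from the calls above.

```python
import pandas as pd

# Hypothetical data: account "c" exists only in the left frame.
left = pd.DataFrame({
    "accountname": ["a", "b", "c"],
    "email": ["a@x.com", "b@x.com", "c@x.com"],
})
right = pd.DataFrame({
    "accountname": ["a", "b"],
    "ip": ["1.1.1.1", "2.2.2.2"],
})

# how='left' keeps every row of the left frame and fills missing matches with NaN.
left_joined = pd.merge(left, right, on="accountname", how="left")

# how='inner' keeps only the rows whose key appears in both frames.
inner_joined = pd.merge(left, right, on="accountname", how="inner")

print(left_joined)
print(inner_joined)
```

With how='left' the unmatched row survives with a missing ip; with how='inner' it is dropped.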
Edit:
Looking at your sample data, you are merging a str column with an int column. That is why the merged values are all NaN.
df1.applymap(type)
Out[96]:
email account
0 <class 'str'> <class 'str'>
1 <class 'str'> <class 'str'>
2 <class 'str'> <class 'str'>
3 <class 'str'> <class 'str'>
df2.applymap(type)
Out[97]:
account
ip
1.1.1.1 <class 'int'>
2.2.2.2 <class 'int'>
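A quick way to catch this before merging is to compare the Python types found in the key column of each frame. The sketch below rebuilds the sample frames from the output shown in this answer (the exact original values are an assumption), and `key_types` is a hypothetical helper, not a pandas function.

```python
import pandas as pd

# Sample frames reconstructed from the output in this answer (assumed values).
df1 = pd.DataFrame({
    "email": ["555@i555.com", "666@666.com", "777@666.com", "888@666.com"],
    "account": ["555", "666", "Nan", "999"],  # stored as strings
})
df2 = pd.DataFrame(
    {"account": [555, 666]},  # stored as integers
    index=pd.Index(["1.1.1.1", "2.2.2.2"], name="ip"),
)

def key_types(frame, key):
    """Hypothetical helper: return the set of Python types in a key column."""
    return set(frame[key].map(type))

types_left = key_types(df1, "account")
types_right = key_types(df2, "account")

# If the two sets differ, the keys cannot match each other in a merge.
keys_compatible = types_left == types_right
print(types_left, types_right, keys_compatible)
```

Depending on the pandas version, merging such mismatched columns may also raise an error instead of returning NaN, so checking the types first is useful either way.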
How to fix it:
Option 1
Convert the str column to numeric with pd.to_numeric
df1.account=pd.to_numeric(df1.account,errors ='coerce')
df1.applymap(type)
Out[99]:
email account
0 <class 'str'> <class 'float'>
1 <class 'str'> <class 'float'>
2 <class 'str'> <class 'float'>
3 <class 'str'> <class 'float'>
df1.merge(df2.reset_index(),on=['account'],how='left')
Out[101]:
email account ip
0 555@i555.com 555 1.1.1.1
1 666@666.com 666 2.2.2.2
2 777@666.com NaN NaN
3 888@666.com 999 NaN
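For reference, Option 1 can be put together as one self-contained sketch. The frames are reconstructed from the output shown above, so the exact input values (in particular the non-numeric third account) are an assumption.

```python
import pandas as pd

# Sample frames reconstructed from the output in this answer (assumed values).
df1 = pd.DataFrame({
    "email": ["555@i555.com", "666@666.com", "777@666.com", "888@666.com"],
    "account": ["555", "666", "Nan", "999"],
})
df2 = pd.DataFrame(
    {"account": [555, 666]},
    index=pd.Index(["1.1.1.1", "2.2.2.2"], name="ip"),
)

# Convert the string keys to numbers; values that cannot be parsed become NaN.
df1["account"] = pd.to_numeric(df1["account"], errors="coerce")

# reset_index() turns the 'ip' index into a regular column so it survives the merge.
merged = df1.merge(df2.reset_index(), on="account", how="left")
print(merged)
```

All four rows of df1 are kept; only the accounts that exist in df2 get an ip.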
Option 2
We simply change df2.account to str
(I prefer the first option, pd.to_numeric)
df2.account=df2.account.astype(str)
df1.merge(df2.reset_index(),on=['account'],how='left')
Out[105]:
email account ip
0 555@i555.com 555 1.1.1.1
1 666@666.com 666 2.2.2.2
2 777@666.com Nan NaN
3 888@666.com 999 NaN
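To see exactly which rows found no partner, merge accepts indicator=True, which adds a _merge column. The sketch below applies it to Option 2; the frames are again reconstructed from the output shown above (assumed values).

```python
import pandas as pd

# Sample frames reconstructed from the output in this answer (assumed values).
df1 = pd.DataFrame({
    "email": ["555@i555.com", "666@666.com", "777@666.com", "888@666.com"],
    "account": ["555", "666", "Nan", "999"],
})
df2 = pd.DataFrame(
    {"account": [555, 666]},
    index=pd.Index(["1.1.1.1", "2.2.2.2"], name="ip"),
)

# Option 2: make the right-hand key a string so both sides have the same type.
df2["account"] = df2["account"].astype(str)

# indicator=True marks each row as 'both', 'left_only', or 'right_only'.
merged = df1.merge(df2.reset_index(), on="account", how="left", indicator=True)

# Rows from df1 that had no matching account in df2.
unmatched = merged.loc[merged["_merge"] == "left_only", "email"].tolist()
print(unmatched)
```

Rows marked left_only are the ones that would disappear under how='inner'.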