我有以下dfs。
df1
Date Dollar EURO GBP
12/03/2019 100 80 90
12/02/2019 101 81 89
12/01/2019 1000 79 91
df2
Product Currency Rate Date
ABC EURO 80 12/03/2019
xyz USD 105 11/30/2019
ert GBP 90 11/29/2019
基本上我想要的是在df2中放置一个新列(具有该货币和日期的汇率),如果df1和df2的汇率匹配或不匹配,则进一步添加一列。
Product Currency Rate Date df1.rate Check
ABC EURO 80 12/03/2019 80 Match
xyz USD 105 11/30/2019 N/A Not Match
ert GBP 90 11/29/2019 N/A Not Match
我尝试过。
USD = df2['Currency'] == "US $"
GBP = df2['Currency'] == "GBP"
EURO = df2['Currency'] == "EURO"
if USD:
df2['Check'] = df2['rate'] == df1['Dollar']
elif GBP:
df2['Check'] = df2['rate'] == df1['GBP']
elif EURO:
df2['Check'] = df2['rate'] == df1['EURO']
在第1行出现以下错误。
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
请提出补救措施。
答案 0 :(得分:3)
将DataFrame.melt
与带有左联接和indicator
参数的DataFrame.merge
一起使用,最后由numpy.where
设置值:
df = df1.melt('Date', var_name='Currency', value_name='Rate')
df2 = df2.merge(df, how='left', indicator='Check')
mask = df2['Check'].eq('both')
df2['Check'] = np.where(mask, 'Match','Not Match')
print (df2)
Product Currency Rate Date Check
0 ABC EURO 80 12/03/2019 Match
1 xyz USD 105 11/30/2019 Not Match
2 ert GBP 90 11/29/2019 Not Match
如果还需要在列df1.rate
中添加DataFrame.insert
:
df2.insert(len(df2.columns)-1, 'df1.rate', df2['Rate'].where(mask))
print (df2)
Product Currency Rate Date df1.rate Check
0 ABC EURO 80 12/03/2019 80.0 Match
1 xyz USD 105 11/30/2019 NaN Not Match
2 ert GBP 90 11/29/2019 NaN Not Match
详细说明:将df1
的相同格式的数据df2
改成melt
:
print (df1.melt('Date', var_name='Currency', value_name='Rate'))
Date Currency Rate
0 12/03/2019 Dollar 100
1 12/02/2019 Dollar 101
2 12/01/2019 Dollar 1000
3 12/03/2019 EURO 80
4 12/02/2019 EURO 81
5 12/01/2019 EURO 79
6 12/03/2019 GBP 90
7 12/02/2019 GBP 89
然后merge
使用左连接,指标参数创建两个带有信息的新列,如果两个或左数据帧中都匹配:
print (df2.merge(df, how='left', indicator='Check'))
Product Currency Rate Date Check
0 ABC EURO 80 12/03/2019 both
1 xyz USD 105 11/30/2019 left_only
2 ert GBP 90 11/29/2019 left_only
最后用mask
替换值:
df2['Check'] = np.where(mask, 'Match','Not Match')
print (df2)
Product Currency Rate Date Check
0 ABC EURO 80 12/03/2019 Match
1 xyz USD 105 11/30/2019 Not Match
2 ert GBP 90 11/29/2019 Not Match