Conditionally matching rows by iterating in Python

Time: 2019-12-03 07:51:55

Tags: python pandas dataframe

I have the following DataFrames.

df1

Date          Dollar  EURO  GBP
12/03/2019    100      80   90
12/02/2019    101      81   89
12/01/2019    1000     79   91

df2

Product   Currency  Rate   Date
ABC        EURO      80    12/03/2019
xyz        USD       105   11/30/2019
ert        GBP       90    11/29/2019

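Basically, I want to add a new column to df2 containing the df1 rate for that currency and date, and then a further column indicating whether the df1 and df2 rates match.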

Product   Currency  Rate   Date          df1.rate     Check
ABC        EURO      80    12/03/2019    80           Match
xyz        USD       105   11/30/2019    N/A          Not Match
ert        GBP       90    11/29/2019    N/A          Not Match 

Here is what I tried.

USD = df2['Currency'] == "US $"
GBP = df2['Currency'] == "GBP"
EURO = df2['Currency'] == "EURO"

if USD:
    df2['Check'] = df2['rate'] == df1['Dollar']
elif GBP:
    df2['Check'] = df2['rate'] == df1['GBP']
elif EURO:
    df2['Check'] = df2['rate'] == df1['EURO']

I get the following error on line 1.

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Please suggest a fix.

1 Answer:

Answer 0 (score: 3)

Use DataFrame.melt together with DataFrame.merge (left join plus the indicator parameter), and finally set the values with numpy.where:

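import numpy as np

# Reshape df1 to long format: one row per (Date, Currency) pair
df = df1.melt('Date', var_name='Currency', value_name='Rate')

# Left join on the shared columns (Currency, Rate, Date); 'Check' records where each row was found
df2 = df2.merge(df, how='left', indicator='Check')
mask = df2['Check'].eq('both')
df2['Check'] = np.where(mask, 'Match', 'Not Match')
print(df2)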
  Product Currency  Rate        Date      Check
0     ABC     EURO    80  12/03/2019      Match
1     xyz      USD   105  11/30/2019  Not Match
2     ert      GBP    90  11/29/2019  Not Match

If you also need the df1.rate column, add it with DataFrame.insert:

df2.insert(len(df2.columns)-1, 'df1.rate', df2['Rate'].where(mask))
print (df2)
  Product Currency  Rate        Date  df1.rate      Check
0     ABC     EURO    80  12/03/2019      80.0      Match
1     xyz      USD   105  11/30/2019       NaN  Not Match
2     ert      GBP    90  11/29/2019       NaN  Not Match
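
Note that df2['Rate'].where(mask) simply copies df2's own rate for the matched rows. If the df1 rate should be shown even when the two rates differ, one option is to merge on Date and Currency only and compare afterwards. The following is a minimal sketch under assumptions, using the sample data from the question: the `currency_map` dictionary (which assumes df1's `Dollar` column corresponds to the `USD` label in df2) and the helper name `add_df1_rate` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "Date": ["12/03/2019", "12/02/2019", "12/01/2019"],
    "Dollar": [100, 101, 1000],
    "EURO": [80, 81, 79],
    "GBP": [90, 89, 91],
})
df2 = pd.DataFrame({
    "Product": ["ABC", "xyz", "ert"],
    "Currency": ["EURO", "USD", "GBP"],
    "Rate": [80, 105, 90],
    "Date": ["12/03/2019", "11/30/2019", "11/29/2019"],
})

# Assumption: df1's "Dollar" column corresponds to the "USD" label in df2.
currency_map = {"Dollar": "USD"}


def add_df1_rate(df2, df1, currency_map):
    """Hypothetical helper: look up the df1 rate for each (Date, Currency)."""
    # Long format with the df1 rate kept in its own column.
    long_df1 = (
        df1.rename(columns=currency_map)
           .melt("Date", var_name="Currency", value_name="df1.rate")
    )
    # Join on the keys only, so a differing rate still gets looked up.
    out = df2.merge(long_df1, on=["Date", "Currency"], how="left")
    # NaN never equals a number, so rows without a df1 rate are "Not Match".
    out["Check"] = np.where(out["Rate"].eq(out["df1.rate"]), "Match", "Not Match")
    return out


result = add_df1_rate(df2, df1, currency_map)
print(result)
```

Without the rename, the melted currency label would be `Dollar`, so a `USD` row in df2 could never match regardless of date.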

Explanation: first, melt reshapes df1 into the same format as df2:

print (df1.melt('Date', var_name='Currency', value_name='Rate'))
         Date Currency  Rate
0  12/03/2019   Dollar   100
1  12/02/2019   Dollar   101
2  12/01/2019   Dollar  1000
3  12/03/2019     EURO    80
4  12/02/2019     EURO    81
5  12/01/2019     EURO    79
6  12/03/2019      GBP    90
7  12/02/2019      GBP    89
8  12/01/2019      GBP    91

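Then merge performs a left join, and the indicator parameter creates a new column recording whether each row was found in both DataFrames or only in the left one: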

print (df2.merge(df, how='left', indicator='Check'))
  Product Currency  Rate        Date      Check
0     ABC     EURO    80  12/03/2019       both
1     xyz      USD   105  11/30/2019  left_only
2     ert      GBP    90  11/29/2019  left_only
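
The indicator column is categorical, with the possible values left_only, right_only and both; with a left join only left_only and both can occur. A small sketch with made-up frames (`left` and `right` are illustrative names, not from the question) shows that, without an `on` argument, every shared column has to agree for a row to count as both:

```python
import pandas as pd

left = pd.DataFrame({"key": ["a", "b", "c"], "value": [1, 2, 3]})
right = pd.DataFrame({"key": ["a", "b"], "value": [1, 20]})

# No `on` is given, so merge joins on all shared columns: key AND value.
merged = left.merge(right, how="left", indicator="Check")
print(merged)

# Only ("a", 1) exists in both frames; ("b", 2) differs in value.
matched = merged["Check"].eq("both")
```

Passing indicator=True instead of a string would name the column `_merge`.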

Finally, replace the values using the mask:

df2['Check'] = np.where(mask, 'Match','Not Match')
print (df2)
  Product Currency  Rate        Date      Check
0     ABC     EURO    80  12/03/2019      Match
1     xyz      USD   105  11/30/2019  Not Match
2     ert      GBP    90  11/29/2019  Not Match
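
As for the original ValueError: `df2['Currency'] == "GBP"` produces a boolean Series with one value per row, while `if` needs a single True or False, so pandas refuses to guess. The sketch below reproduces the error and shows one row-wise alternative with numpy.select. It assumes the two frames are already aligned row by row, which is not the case for the question's data (hence the merge above); the `orders` and `rates` frames are made-up illustrations.

```python
import numpy as np
import pandas as pd

orders = pd.DataFrame({"Currency": ["EURO", "USD", "GBP"], "Rate": [80, 105, 90]})
# Made-up reference rates, one row per order, aligned by position.
rates = pd.DataFrame({"Dollar": [100, 101, 1000], "EURO": [80, 81, 79], "GBP": [90, 89, 90]})

is_usd = orders["Currency"] == "USD"   # boolean Series, one value per row

error_message = None
try:
    if is_usd:                         # ambiguous: which row should decide?
        pass
except ValueError as exc:
    error_message = str(exc)

# Pick the reference rate for each row according to that row's currency.
reference = np.select(
    [is_usd, orders["Currency"] == "GBP", orders["Currency"] == "EURO"],
    [rates["Dollar"], rates["GBP"], rates["EURO"]],
    default=np.nan,
)
orders["Check"] = np.where(orders["Rate"] == reference, "Match", "Not Match")
print(orders)
```

The masks stay as Series and are evaluated element-wise, so no Python-level `if` over a whole column is needed.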