Looping over a pandas column to match values against the index of another DataFrame

Time: 2017-07-06 20:54:27

Tags: python pandas

Exp is a DataFrame with a single column of datetime objects:

           Exp
0   1989-06-01
1   1989-07-01
2   1989-08-01
3   1989-09-01
4   1989-10-01

CL is a DataFrame whose index holds datetime objects:

                    CL
1989-06-01   68.800026
1989-06-04   68.620026
1989-06-05   68.930023
1989-06-06   68.990021
1989-06-09   69.110023

I want to add a new column R to the CL DataFrame. It should hold the date from Exp wherever that date matches the CL index.

This is what the desired output should look like:

                   CL          R
1989-06-01   68.800026   1989-06-01
1989-06-04   68.620026
1989-06-05   68.930023
1989-06-06   68.990021
1989-06-09   69.110023

This is what I tried:

for m in Exp.iloc[:,0]:
    if m == CL.index:
        CL['R'] = m
  

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Can someone help me? I keep getting this ValueError.

2 Answers:

Answer 0: (score: 2)

Edit: updated following a commenter's suggestion.

You need to do a LEFT JOIN:

import pandas as pd

Exp = pd.DataFrame(
    pd.to_datetime(['1989-06-01', '1989-07-01', '1989-08-01', '1989-09-01', '1989-10-01']),
    columns=['Exp'])

gives:

          Exp
0  1989-06-01
1  1989-07-01
2  1989-08-01
3  1989-09-01
4  1989-10-01

CL = pd.DataFrame(
    [68.800026, 68.620026, 68.930023, 68.990021, 69.110023],
    index=pd.to_datetime(['1989-06-01', '1989-06-04', '1989-06-05', '1989-06-06', '1989-06-09']),
    columns=['CL'])

gives:

                   CL
1989-06-01  68.800026
1989-06-04  68.620026
1989-06-05  68.930023
1989-06-06  68.990021
1989-06-09  69.110023

Then:

(CL
 .reset_index()
 .merge(Exp, how='left', right_on='Exp', left_on='index')
 .set_index('index')
 .rename(columns={'Exp': 'R'}))

returns what you are looking for:

                   CL          R
index                           
1989-06-01  68.800026 1989-06-01
1989-06-04  68.620026        NaT
1989-06-05  68.930023        NaT
1989-06-06  68.990021        NaT
1989-06-09  69.110023        NaT

A join is preferable here because looping over a DataFrame is not the pandas way of doing things.
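As a rough illustration of why the loop in the question fails, and of one vectorized alternative to it, here is a minimal sketch. It assumes the same Exp and CL frames as above; the names `elementwise`, `mask` and `with_r` are made up for this example, and using `isin` with `where` is just one option alongside the join approaches, not something taken from either answer.

```python
import pandas as pd

Exp = pd.DataFrame({'Exp': pd.to_datetime(
    ['1989-06-01', '1989-07-01', '1989-08-01', '1989-09-01', '1989-10-01'])})
CL = pd.DataFrame(
    {'CL': [68.800026, 68.620026, 68.930023, 68.990021, 69.110023]},
    index=pd.to_datetime(
        ['1989-06-01', '1989-06-04', '1989-06-05', '1989-06-06', '1989-06-09']))

# Comparing one timestamp with the whole index yields one boolean per row.
# `if` needs a single True/False, which is why the loop raises ValueError.
m = Exp['Exp'].iloc[0]
elementwise = (m == CL.index)

# Vectorized alternative (hypothetical names): flag the index entries that
# occur in Exp, then keep the date where the flag is True and NaT elsewhere.
mask = CL.index.isin(Exp['Exp'])
with_r = CL.assign(R=CL.index.to_series().where(mask))
```

Printing `with_r` should show the original CL column plus an R column that is filled only on the rows whose index date also appears in Exp.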

Answer 1: (score: 0)

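pd.DataFrame.join
join focuses on combining DataFrames/Series by their index. Call set_index on Exp with drop=False so that the same dates are kept both as a column and as the index. We put the dates in the index to make the join convenient.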

CL.join(Exp.set_index('Exp', drop=False)).rename(columns=dict(Exp='R'))

                   CL          R
1989-06-01  68.800026 1989-06-01
1989-06-04  68.620026        NaT
1989-06-05  68.930023        NaT
1989-06-06  68.990021        NaT
1989-06-09  69.110023        NaT
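To make the role of drop=False more concrete, the following sketch compares the two variants of set_index on a copy of the same data. The names `keyed`, `dropped` and `result` are hypothetical and used only for illustration.

```python
import pandas as pd

Exp = pd.DataFrame({'Exp': pd.to_datetime(
    ['1989-06-01', '1989-07-01', '1989-08-01', '1989-09-01', '1989-10-01'])})
CL = pd.DataFrame(
    {'CL': [68.800026, 68.620026, 68.930023, 68.990021, 69.110023]},
    index=pd.to_datetime(
        ['1989-06-01', '1989-06-04', '1989-06-05', '1989-06-06', '1989-06-09']))

# drop=False keeps 'Exp' as a column while also using it as the index,
# so there is still a column for join to bring across.
keyed = Exp.set_index('Exp', drop=False)

# With the default drop=True the column moves into the index and no
# columns remain, so a join would have nothing to add to CL.
dropped = Exp.set_index('Exp')

# join aligns on the index by default and keeps every row of CL (left join).
result = CL.join(keyed).rename(columns={'Exp': 'R'})
```

Here `dropped` ends up with no columns at all, which is why the answer passes drop=False before joining.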

Setup

Exp = pd.DataFrame(dict(
        Exp=pd.to_datetime(
            ['1989-06-01', '1989-07-01', '1989-08-01', '1989-09-01', '1989-10-01'])
    ))

CL = pd.DataFrame(dict(
        CL=[68.800026, 68.620026, 68.930023, 68.990021, 69.110023],
    ), pd.to_datetime(
        ['1989-06-01', '1989-06-04', '1989-06-05', '1989-06-06', '1989-06-09']))