How to create one DataFrame from two pandas DataFrames with matching timestamps

Asked: 2017-03-05 11:15:26

Tags: python pandas dataframe

I have two pandas DataFrames:

import pandas as pd

fileX = pd.DataFrame({'cas':pd.date_range('2017-03-02 22:52:00', periods=5, freq='T'),
                      'hodnota': [[1]] * 5})
fileX.cas = fileX.cas.dt.strftime('%Y-%m-%d %H:%M')
print (fileX)
                cas hodnota
0  2017-03-02 22:52     [1]
1  2017-03-02 22:53     [1]
2  2017-03-02 22:54     [1]
3  2017-03-02 22:55     [1]
4  2017-03-02 22:56     [1]

real = pd.DataFrame({'timeStampRoundedToMinute':pd.date_range('2017-03-02 22:52:00', 
                                                              periods=5, freq='T'),
                      'received_optical_power': [-25.0] * 5,
                      'binary_y':[1] * 5})
real = real[['timeStampRoundedToMinute','received_optical_power','binary_y']]
real.timeStampRoundedToMinute = real.timeStampRoundedToMinute.dt.strftime('%Y-%m-%d %H:%M')
print (real)
  timeStampRoundedToMinute  received_optical_power  binary_y
0         2017-03-02 22:52                   -25.0         1
1         2017-03-02 22:53                   -25.0         1
2         2017-03-02 22:54                   -25.0         1
3         2017-03-02 22:55                   -25.0         1
4         2017-03-02 22:56                   -25.0         1

I use real as the reference, so my goal is to get all the values from the fileX DataFrame for the rows where cas == timeStampRoundedToMinute.

I was able to do something like this:

output = real[real.timeStampRoundedToMinute.isin(fileX['cas'])]
print (output)
  timeStampRoundedToMinute  received_optical_power  binary_y
0         2017-03-02 22:52                   -25.0         1
1         2017-03-02 22:53                   -25.0         1
2         2017-03-02 22:54                   -25.0         1
3         2017-03-02 22:55                   -25.0         1
4         2017-03-02 22:56                   -25.0         1

So the output DataFrame is still missing one column: hodnota, which belongs to the fileX DataFrame.

1 Answer:

Answer 0: (score: 1)

It seems you need merge, which performs an inner join by default:

df = pd.merge(fileX, real, left_on='cas', right_on='timeStampRoundedToMinute')
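
For reference, here is a minimal self-contained sketch of that merge using the sample data from the question. It makes a few assumptions that are not in the original answer: freq='min' is used in place of 'T' (newer pandas versions warn about the 'T' alias), the redundant cas key column is dropped after the merge, and the left-join variant at the end is only one possible way to keep real as the reference.

```python
import pandas as pd

# Rebuild the two sample frames from the question.
# Assumption: freq='min' replaces the older 'T' alias for minutes.
fileX = pd.DataFrame({
    'cas': pd.date_range('2017-03-02 22:52:00', periods=5, freq='min'),
    'hodnota': [[1]] * 5,
})
fileX['cas'] = fileX['cas'].dt.strftime('%Y-%m-%d %H:%M')

real = pd.DataFrame({
    'timeStampRoundedToMinute': pd.date_range('2017-03-02 22:52:00',
                                              periods=5, freq='min'),
    'received_optical_power': [-25.0] * 5,
    'binary_y': [1] * 5,
})
real['timeStampRoundedToMinute'] = (
    real['timeStampRoundedToMinute'].dt.strftime('%Y-%m-%d %H:%M')
)

# Inner join (the default): keep only rows whose keys appear in both frames.
df = pd.merge(fileX, real, left_on='cas', right_on='timeStampRoundedToMinute')

# Both key columns survive the merge; drop the duplicate one (optional).
df = df.drop(columns='cas')
print(df)

# Variant (an assumption about the intent): keep every row of real,
# filling hodnota with NaN where fileX has no matching timestamp.
left = pd.merge(real, fileX, how='left',
                left_on='timeStampRoundedToMinute', right_on='cas')
print(left)
```

Unlike the isin filter in the question, which only selects rows of real, merge brings the columns of both frames together, so hodnota appears in the result.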