I have two pandas DataFrames:
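import pandas as pd

# 'min' is the minute frequency alias; the older 'T' alias is deprecated in recent pandas
fileX = pd.DataFrame({'cas': pd.date_range('2017-03-02 22:52:00', periods=5, freq='min'),
                      'hodnota': [[1]] * 5})
fileX.cas = fileX.cas.dt.strftime('%Y-%m-%d %H:%M')
print(fileX)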
                cas hodnota
0  2017-03-02 22:52     [1]
1  2017-03-02 22:53     [1]
2  2017-03-02 22:54     [1]
3  2017-03-02 22:55     [1]
4  2017-03-02 22:56     [1]
# 'min' is the minute frequency alias; the older 'T' alias is deprecated in recent pandas
real = pd.DataFrame({'timeStampRoundedToMinute': pd.date_range('2017-03-02 22:52:00',
                                                               periods=5, freq='min'),
                     'received_optical_power': [-25.0] * 5,
                     'binary_y': [1] * 5})
real = real[['timeStampRoundedToMinute', 'received_optical_power', 'binary_y']]
real.timeStampRoundedToMinute = real.timeStampRoundedToMinute.dt.strftime('%Y-%m-%d %H:%M')
print(real)
   timeStampRoundedToMinute  received_optical_power  binary_y
0          2017-03-02 22:52                   -25.0         1
1          2017-03-02 22:53                   -25.0         1
2          2017-03-02 22:54                   -25.0         1
3          2017-03-02 22:55                   -25.0         1
4          2017-03-02 22:56                   -25.0         1
I use real as the reference, so my goal is to get all values from both the real and fileX DataFrames where cas == timeStampRoundedToMinute.
I was able to do something like this:
output = real[real.timeStampRoundedToMinute.isin(fileX['cas'])]
print (output)
   timeStampRoundedToMinute  received_optical_power  binary_y
0          2017-03-02 22:52                   -25.0         1
1          2017-03-02 22:53                   -25.0         1
2          2017-03-02 22:54                   -25.0         1
3          2017-03-02 22:55                   -25.0         1
4          2017-03-02 22:56                   -25.0         1
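So the output DataFrame is missing the hodnota column, which belongs to the fileX DataFrame.
Answer 0 (score: 1)
It seems you need merge, which performs an inner join by default: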
df = pd.merge(fileX, real, left_on='cas', right_on='timeStampRoundedToMinute')
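If real should stay the reference even when fileX has no matching minute, a left join may be closer to what you want. The following is only a sketch under that assumption: the name merged, the choice of how='left', and dropping the redundant cas column are illustrative and not part of the original question.

```python
import pandas as pd

# Rebuild the two frames from the question, with timestamps already formatted as strings.
times = pd.date_range('2017-03-02 22:52:00', periods=5, freq='min').strftime('%Y-%m-%d %H:%M')
fileX = pd.DataFrame({'cas': times, 'hodnota': [[1]] * 5})
real = pd.DataFrame({'timeStampRoundedToMinute': times,
                     'received_optical_power': [-25.0] * 5,
                     'binary_y': [1] * 5})

# Assumption: real is the reference, so a left join keeps every row of real.
# hodnota would be NaN for any minute that has no match in fileX.
merged = real.merge(fileX, how='left',
                    left_on='timeStampRoundedToMinute', right_on='cas')

# For matched rows cas repeats timeStampRoundedToMinute, so drop it.
merged = merged.drop(columns='cas')
print(merged)
```

With the sample data every minute matches, so the inner and left joins return the same five rows; they differ only when one frame has timestamps the other lacks.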