Below is a simplified version of my setup:
import pandas as pd
import datetime as dt
df_data = pd.DataFrame({'DateTime' : [dt.datetime(2017, 9, 1, 0, 0, 0),dt.datetime(2017, 9, 1, 1, 0, 0),dt.datetime(2017, 9, 1, 2, 0, 0),dt.datetime(2017, 9, 1, 3, 0, 0)], 'Data' : [1,2,3,5]})
df_timeRanges = pd.DataFrame({'startTime':[dt.datetime(2017, 8, 30, 0, 0, 0), dt.datetime(2017, 9, 1, 1, 30, 0)], 'endTime':[dt.datetime(2017, 9, 1, 0, 30, 0), dt.datetime(2017, 9, 1, 2, 30, 0)]})
print(df_data)
print(df_timeRanges)
This gives:
   Data            DateTime
0     1 2017-09-01 00:00:00
1     2 2017-09-01 01:00:00
2     3 2017-09-01 02:00:00
3     5 2017-09-01 03:00:00
              endTime           startTime
0 2017-09-01 00:30:00 2017-08-30 00:00:00
1 2017-09-01 02:30:00 2017-09-01 01:30:00
I want to filter df_data using df_timeRanges, keeping the rows that remain in a single DataFrame, something like:
df_data_filt = df_data[(df_data['DateTime'] >= df_timeRanges['startTime']) & (df_data['DateTime'] <= df_timeRanges['endTime'])]
I did not expect the line above to work, and it returns this error:
ValueError: Can only compare identically-labeled Series objects
Can anyone offer some hints on this? The df_data and df_timeRanges in my real task are much larger.
Thanks in advance.
Answer 0: (score: 0)
IIUC, use
In [793]: import numpy as np  # needed for np.logical_or

In [794]: mask = np.logical_or.reduce([
     ...:     (df_data.DateTime >= x.startTime) & (df_data.DateTime <= x.endTime)
     ...:     for i, x in df_timeRanges.iterrows()])
In [795]: df_data[mask]
Out[795]:
   Data            DateTime
0     1 2017-09-01 00:00:00
2     3 2017-09-01 02:00:00
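Since the question mentions that the real frames are much larger, a loop-free variant may be worth trying. The following is only a sketch, not part of the original answer: it builds the same inclusive mask with NumPy broadcasting instead of iterrows(). The names times, starts and ends are introduced here for illustration, and it assumes a pandas version that provides Series.to_numpy().

```python
import datetime as dt

import numpy as np
import pandas as pd

# Same sample data as in the question.
df_data = pd.DataFrame({
    'DateTime': [dt.datetime(2017, 9, 1, h) for h in range(4)],
    'Data': [1, 2, 3, 5],
})
df_timeRanges = pd.DataFrame({
    'startTime': [dt.datetime(2017, 8, 30), dt.datetime(2017, 9, 1, 1, 30)],
    'endTime': [dt.datetime(2017, 9, 1, 0, 30), dt.datetime(2017, 9, 1, 2, 30)],
})

# Timestamps as a column vector, shape (n_rows, 1).
times = df_data['DateTime'].to_numpy()[:, None]
# Range bounds as flat arrays, shape (n_ranges,).
starts = df_timeRanges['startTime'].to_numpy()
ends = df_timeRanges['endTime'].to_numpy()

# Broadcasting yields an (n_rows, n_ranges) boolean matrix; a row is kept
# if it falls inside at least one range (both endpoints inclusive).
mask = ((times >= starts) & (times <= ends)).any(axis=1)

df_data_filt = df_data[mask]
print(df_data_filt)
```

Note that the intermediate matrix has n_rows x n_ranges entries, so with very large inputs it may be necessary to process df_data in chunks to keep memory under control.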