I have a Pandas DataFrame that looks like this:
>>> df
Start_Time End_Time
0 2014-10-16 15:05:17 2014-10-16 17:13:14
1 2014-10-16 14:56:37 2014-10-16 15:07:17
2 2014-10-16 14:25:16 2014-10-16 18:06:17
...
Now I have another DataFrame containing multiple timestamps:
>>> times
Time
0 2014-10-16 15:17:17
1 2014-10-16 14:53:37
2 2014-10-16 14:26:16
...
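What I want to end up with is, for each timestamp, the number of rows in df where Start_Time < Time < End_Time:
>>> times
Time Count
0 2014-10-16 15:17:17 1
1 2014-10-16 15:05:37 2
2 2014-10-16 14:26:16 1
...
I can of course do this by iterating over times and using loc to create sub_dfs:
ls_len = []
for index, row in times.iterrows():
    sub_df = df.loc[(df['Start_Time'] < row['Time']) & (df['End_Time'] > row['Time'])]
    ls_len.append(len(sub_df))
times['Count'] = ls_len
But this is very time-consuming and does not feel optimal. Is there a way to do this without iterating?
Thanks a lot in advance!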
Answer 0 (score: 1)
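# Row-wise apply replaces the explicit iterrows loop; count_val still runs once per timestamp
def count_val(x):
    # Rows of df whose interval strictly contains this timestamp
    sub_df = df.loc[(df['Start_Time'] < x['Time']) & (df['End_Time'] > x['Time'])]
    count = len(sub_df)
    return count
times['count'] = times.apply(count_val, axis=1)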
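If the per-timestamp filtering is still too slow, one possible alternative is to avoid filtering df altogether and count with two binary searches. This is only a sketch, not part of the original answer, and it assumes Start_Time < End_Time on every row. Under that assumption, the number of intervals strictly containing a timestamp equals the number of intervals that started before it minus the number that ended at or before it. The helper name count_open_intervals is hypothetical, and the sample frames are rebuilt from the three rows shown in the question.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the three rows shown in the question
df = pd.DataFrame({
    "Start_Time": pd.to_datetime(["2014-10-16 15:05:17",
                                  "2014-10-16 14:56:37",
                                  "2014-10-16 14:25:16"]),
    "End_Time": pd.to_datetime(["2014-10-16 17:13:14",
                                "2014-10-16 15:07:17",
                                "2014-10-16 18:06:17"]),
})
times = pd.DataFrame({
    "Time": pd.to_datetime(["2014-10-16 15:17:17",
                            "2014-10-16 14:53:37",
                            "2014-10-16 14:26:16"]),
})


def count_open_intervals(df, times):
    """Hypothetical helper: rows with Start_Time < Time < End_Time, per timestamp.

    Assumes Start_Time < End_Time on every row of df.
    """
    starts = np.sort(df["Start_Time"].to_numpy())
    ends = np.sort(df["End_Time"].to_numpy())
    t = times["Time"].to_numpy()
    # side="left": number of starts strictly before each timestamp
    started = np.searchsorted(starts, t, side="left")
    # side="right": number of ends at or before each timestamp
    ended = np.searchsorted(ends, t, side="right")
    return started - ended


times["Count"] = count_open_intervals(df, times)
```

Sorting costs O(n log n) once, and each lookup is then a binary search, so the whole thing does not build a sub-DataFrame per timestamp. If some rows can have Start_Time equal to End_Time, the subtraction can be off by one for timestamps that hit such a row exactly, so check that assumption against your data first.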