How do I find out whether values occur consecutively (in chronological order)?

Time: 2019-08-30 14:54:36

Tags: python python-3.x pandas datetime python-datetime

I have given two dataframes below for you to test with.

import pandas as pd

df_1 = pd.DataFrame({
'subject_id':[1,1,1,1,1,1,1,1,1,1,1],
'time_1' :['2173-04-03 10:00:00','2173-04-03 10:15:00','2173-04-03 10:30:00','2173-04-03 10:45:00','2173-04-03 11:01:00','2173-04-04 12:00:00','2173-04-05 16:00:00','2173-04-05 22:00:00','2173-04-06 04:00:00','2173-04-06 04:30:00','2173-04-06 06:30:00'],
'val' :[5,5,5,5,5,10,5,8,3,8,10]
})

df_2 = pd.DataFrame({
'subject_id':[1,1,1,1,1,1,1,1,1,1,1],
'time_1' :['2173-04-03 10:00:00','2173-04-03 10:15:00','2173-04-03 10:30:00','2173-04-03 10:45:00','2173-04-03 11:01:00','2173-04-04 12:00:00','2173-04-05 16:00:00','2173-04-05 22:00:00','2173-04-06 04:00:00','2173-04-06 04:30:00','2173-04-06 06:30:00'],
'val' :[5,6,5,6,5,10,5,8,3,8,10]
})

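I am trying to check whether the values in val occur consecutively (in chronological order). By that I mean a value that appears without interruption (for example, 5,5,5 is a sequence in chronological order, whereas 5,6,5,6 is an example where the sequence of 5s is broken). Can you help me find that?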
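To make that definition concrete, here is a minimal sketch. It assumes that "sequential" means every occurrence of a value falls in one unbroken run, and that the rows are already sorted by time. The helper names `run_ids` and `is_unbroken` are hypothetical and used only for illustration.

```python
import pandas as pd

def run_ids(values: pd.Series) -> pd.Series:
    """Label each row with the number of the consecutive run it belongs to."""
    # A new run starts wherever the value differs from the previous row.
    return values.ne(values.shift()).cumsum()

def is_unbroken(values: pd.Series, target) -> bool:
    """True if all occurrences of `target` sit in a single consecutive run."""
    runs = run_ids(values)
    return runs[values.eq(target)].nunique() <= 1

# The val columns of df_1 and df_2 from the question.
val_1 = pd.Series([5, 5, 5, 5, 5, 10, 5, 8, 3, 8, 10])
val_2 = pd.Series([5, 6, 5, 6, 5, 10, 5, 8, 3, 8, 10])

print(is_unbroken(val_1.head(5), 5))  # first five rows: 5,5,5,5,5
print(is_unbroken(val_2.head(5), 5))  # first five rows: 5,6,5,6,5
```

On the first five rows, the check holds for df_1 and fails for df_2, which matches the two examples described above.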

This is what I tried in order to get the sum and duration, but it does not work:

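# assumption: df is a working copy of df_1
df = df_1.copy()
df['time_1'] = pd.to_datetime(df['time_1'])
# time remaining until midnight, used to fill the last row of each day
s = pd.to_timedelta(24, unit='h') - (df.time_1 - df.time_1.dt.normalize())
# gap to the next timestamp, with the difference computed per calendar day
df['tdiff'] = df.groupby(df.time_1.dt.date).time_1.diff().shift(-1).fillna(s)
df['t_d'] = df['tdiff'].dt.total_seconds() / 3600
df['date'] = df['time_1'].dt.date
df.groupby(['val', 'date'], sort=False)['t_d'].agg(cumduration='sum', freq='count').reset_index()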

I expect my output for df_2 to look like this.

[image: expected output for df_2]

1 Answer:

Answer 0: (score: 0)

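You are doing far too much work. Pack these steps into a one-line command:

  1. Shift ["val"] left by one position...
  2. ...and compare that shifted series with ["val"] using <=.
  3. This gives you a series of booleans; apply all() to it.

The result of all() tells you whether the values are in descending (non-increasing) order.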

The coding is left as an exercise for the reader. :-)
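The three steps above can be sketched roughly as follows. This is only one possible reading of the answer: the function name `is_non_increasing` is hypothetical, and dropping the last row is an added assumption, needed because the left shift leaves a NaN there and a comparison with NaN is always False.

```python
import pandas as pd

def is_non_increasing(values: pd.Series) -> bool:
    """Check whether each value is <= the value before it."""
    # Step 1: shift left by one position, so each row sees its successor.
    shifted = values.shift(-1)
    # Step 2: compare the shifted series with the original using <=.
    # The last row has no successor (NaN after the shift), so it is dropped.
    flags = (shifted <= values).iloc[:-1]
    # Step 3: reduce the booleans with all().
    return bool(flags.all())

df_2 = pd.DataFrame({"val": [5, 6, 5, 6, 5, 10, 5, 8, 3, 8, 10]})

print(is_non_increasing(df_2["val"]))               # val rises from 5 to 6
print(is_non_increasing(pd.Series([10, 8, 8, 3])))  # never rises
```

As a one-liner this would be `(df_2["val"].shift(-1) <= df_2["val"]).iloc[:-1].all()`. pandas also exposes the same check as the `Series.is_monotonic_decreasing` property. Note that this tests ordering of the values, not whether a value's occurrences form an unbroken run.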