Suppose I create some random data sampled at one-hour intervals:
import pandas as pd
import numpy as np
from numpy.random import randint
np.random.seed(10)  # added for reproducibility
rng = pd.date_range('10/9/2018 00:00', periods=1000, freq='1H')
df = pd.DataFrame({'Random_Number':randint(1, 100, 1000)}, index=rng)
I can use groupby to iterate over the data one day at a time:
for idx, day in df.groupby(df.index.date):
    print(day)
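Is there a way to use the timestamps to compute the time difference, in hours, between each day's min and max? Ideally I would record the min, the max, and the time difference for every day.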
Answer 0: (score: 2)
After some discussion (thanks to @Erfan):
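(df.Random_Number
   .groupby(df.index.date)          # one group per calendar day
   .agg(['idxmin', 'idxmax'])       # timestamps of the daily min and max
   .diff(axis=1).iloc[:, 1]         # idxmax - idxmin, as a timedelta
   .div(pd.to_timedelta('1H'))      # convert the timedelta to hours
)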
Output:
2018-10-09 -4.0
2018-10-10 -1.0
2018-10-11 -4.0
2018-10-12 12.0
2018-10-13 21.0
2018-10-14 6.0
2018-10-15 -6.0
2018-10-16 -18.0
2018-10-17 -8.0
2018-10-18 9.0
2018-10-19 -10.0
2018-10-20 3.0
2018-10-21 10.0
2018-10-22 2.0
2018-10-23 9.0
2018-10-24 2.0
2018-10-25 3.0
2018-10-26 2.0
2018-10-27 -22.0
2018-10-28 6.0
2018-10-29 -8.0
2018-10-30 -1.0
2018-10-31 -11.0
2018-11-01 19.0
2018-11-02 7.0
2018-11-03 4.0
2018-11-04 18.0
2018-11-05 -1.0
2018-11-06 15.0
2018-11-07 -14.0
2018-11-08 -16.0
2018-11-09 -2.0
2018-11-10 -7.0
2018-11-11 -14.0
2018-11-12 12.0
2018-11-13 -14.0
2018-11-14 2.0
2018-11-15 2.0
2018-11-16 6.0
2018-11-17 -7.0
2018-11-18 5.0
2018-11-19 9.0
Name: idxmax, dtype: float64
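The question also asks to record the min and max next to the difference. A minimal sketch of one way to do that, assuming the same setup as in the question; the column names `min_time`, `max_time` and `diff_hours` are made up for illustration, and `pd.Timedelta(hours=1)` is used in place of the `'1H'` string only to avoid relying on a frequency alias:

```python
import numpy as np
import pandas as pd

np.random.seed(10)
rng = pd.date_range('2018-10-09 00:00', periods=1000, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({'Random_Number': np.random.randint(1, 100, 1000)}, index=rng)

# Group the single column by calendar day
grouped = df['Random_Number'].groupby(df.index.date)

# One row per day: the extreme values and the timestamps where they first occur
daily = pd.DataFrame({
    'min': grouped.min(),
    'max': grouped.max(),
    'min_time': grouped.idxmin(),   # hypothetical column name
    'max_time': grouped.idxmax(),   # hypothetical column name
})

# Signed difference in hours: positive when the max occurs after the min
daily['diff_hours'] = (daily['max_time'] - daily['min_time']) / pd.Timedelta(hours=1)

print(daily.head())
```

The sign convention matches the answer above (time of max minus time of min); wrap the result in `.abs()` if only the distance matters.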
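Answer 1: (score: 0)
Alternatively, if you want DataFrame output that keeps all the columns, consider merging the original data onto the aggregated dataset: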
# ADJUST FOR DATETIME AND DATE AS COLUMNS
df = (df.reset_index()
        .assign(date = df.index.date)
     )
# AGGREGATION + MERGE ON MIN/MAX + CALCULATION
agg_df = (df.groupby('date')['Random_Number']
            .agg(["min", "max"])
            .reset_index()
            .merge(df, left_on=['date', 'max'], right_on=['date', 'Random_Number'])
            .merge(df, left_on=['date', 'min'], right_on=['date', 'Random_Number'],
                   suffixes=['', '_min'])
            .assign(diff = lambda x: (x['index'] - x['index_min']) / pd.to_timedelta('1H'))
         )
print(agg_df.head(10))
# date min max index Random_Number index_min Random_Number_min diff
# 0 2018-10-09 1 94 2018-10-09 05:00:00 94 2018-10-09 09:00:00 1 -4.0
# 1 2018-10-10 12 95 2018-10-10 20:00:00 95 2018-10-10 21:00:00 12 -1.0
# 2 2018-10-11 5 97 2018-10-11 15:00:00 97 2018-10-11 19:00:00 5 -4.0
# 3 2018-10-12 7 98 2018-10-12 18:00:00 98 2018-10-12 06:00:00 7 12.0
# 4 2018-10-13 1 91 2018-10-13 22:00:00 91 2018-10-13 01:00:00 1 21.0
# 5 2018-10-14 1 97 2018-10-14 10:00:00 97 2018-10-14 04:00:00 1 6.0
# 6 2018-10-15 9 97 2018-10-15 06:00:00 97 2018-10-15 12:00:00 9 -6.0
# 7 2018-10-16 3 95 2018-10-16 04:00:00 95 2018-10-16 22:00:00 3 -18.0
# 8 2018-10-17 2 95 2018-10-17 13:00:00 95 2018-10-17 21:00:00 2 -8.0
# 9 2018-10-18 1 91 2018-10-18 21:00:00 91 2018-10-18 12:00:00 1 9.0
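One thing to watch with the merge approach: if a day's min or max value occurs at more than one timestamp, the merge returns one row per match, whereas idxmin/idxmax keep only the first occurrence. A small sketch on hand-made data (the `toy` frame is hypothetical, not taken from the question) shows the effect and one possible way to keep a single row per day:

```python
import pandas as pd

# Hypothetical toy data: the daily max (9) occurs twice on the same day
idx = pd.date_range('2018-10-09 00:00', periods=4, freq=pd.Timedelta(hours=1))
toy = pd.DataFrame({'Random_Number': [9, 1, 9, 5]}, index=idx)
toy = toy.reset_index().assign(date=toy.index.date)

agg = toy.groupby('date')['Random_Number'].agg(['min', 'max']).reset_index()

# Plain merge: one output row per matching timestamp, so the tie yields two rows
merged = agg.merge(toy, left_on=['date', 'max'], right_on=['date', 'Random_Number'])

# Keeping only the first occurrence of each (date, value) pair gives one row per day
first = toy.drop_duplicates(subset=['date', 'Random_Number'], keep='first')
deduped = agg.merge(first, left_on=['date', 'max'], right_on=['date', 'Random_Number'])

print(len(merged), len(deduped))  # 2 1
```

Whether the first occurrence is the right one to keep depends on the use case; `keep='last'` would pick the latest timestamp instead.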