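I have a table with 7 features (columns). The last one is a timestamp. I want to divide this time series data into equal time slots of 10 minutes each, so that I can check which instance (row) belongs to which slot.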
Answer 0 (score: 1)
import numpy as np
import pandas as pd
# Create reproducible example data
# (in the future, it's better if you do this in your question)
first_timestamp = pd.to_datetime('1/1/2011 00:00')
timestamps = pd.date_range(first_timestamp, periods=100, freq='min')
other_data = np.random.randint(0, 10, size=(100,))
df = pd.DataFrame({'timestamp': timestamps,
                   'other_data': other_data})
# Minutes elapsed since the first timestamp.
# Timedeltas have no minutes attribute, and .dt.seconds ignores whole days,
# so use total_seconds() to stay correct for data spanning more than 24 hours.
df['minutes_since_start'] = (df['timestamp'] - first_timestamp).dt.total_seconds() / 60
# Create groups: one bin per 10 minutes since the first timestamp
df['timestamp group'] = pd.cut(df['minutes_since_start'], bins=range(0, 101, 10), include_lowest=True)
# First 3 entries
df.head(3)
Output:
   other_data           timestamp  minutes_since_start timestamp group
0           8 2011-01-01 00:00:00                  0.0         [0, 10]
1           5 2011-01-01 00:01:00                  1.0         [0, 10]
2           7 2011-01-01 00:02:00                  2.0         [0, 10]
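The bins above are hard-coded for 100 minutes of data. As a rough sketch of one way to avoid that, the slot number can be computed directly with integer division. The column name `slot` and the variable `slot_minutes` are illustrative assumptions, not part of the original answer, and the example data simply mirrors the setup above. Note that these slots are left-closed ([0, 10), [10, 20), ...), whereas `pd.cut` defaults to right-closed intervals.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: 100 rows, one per minute (mirrors the answer's setup)
first_timestamp = pd.to_datetime('1/1/2011 00:00')
df = pd.DataFrame({
    'timestamp': pd.date_range(first_timestamp, periods=100, freq='min'),
    'other_data': np.random.randint(0, 10, size=100),
})

slot_minutes = 10  # assumed slot width in minutes

# Minutes elapsed since the earliest timestamp in the data
minutes = (df['timestamp'] - df['timestamp'].min()).dt.total_seconds() / 60

# Integer division gives a 0-based slot number: [0, 10) -> 0, [10, 20) -> 1, ...
df['slot'] = (minutes // slot_minutes).astype(int)

# Rows belonging to slot 7, i.e. minutes 70 through 79
slot_7 = df[df['slot'] == 7]
```

Because no bin edges are listed, this keeps working when the data covers a longer or shorter period.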
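To get the data from any timestamp group, e.g. 70-80 minutes after the start:
# pd.cut labels each row with a pd.Interval, so compare against an Interval rather than a string
df[df['timestamp group'] == pd.Interval(70, 80, closed='right')]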
Output:
    other_data           timestamp  minutes_since_start timestamp group
71           1 2011-01-01 01:11:00                 71.0        (70, 80]
72           8 2011-01-01 01:12:00                 72.0        (70, 80]
73           3 2011-01-01 01:13:00                 73.0        (70, 80]
74           0 2011-01-01 01:14:00                 74.0        (70, 80]
75           8 2011-01-01 01:15:00                 75.0        (70, 80]
76           8 2011-01-01 01:16:00                 76.0        (70, 80]
77           0 2011-01-01 01:17:00                 77.0        (70, 80]
78           6 2011-01-01 01:18:00                 78.0        (70, 80]
79           0 2011-01-01 01:19:00                 79.0        (70, 80]
80           5 2011-01-01 01:20:00                 80.0        (70, 80]
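As an alternative to `pd.cut`, each row could be labelled with the start of its 10-minute slot using `Series.dt.floor`, and then grouped on that label. This is only a sketch under assumptions: the column name `slot_start` is made up for illustration, and the slots here align to clock boundaries (00:00, 00:10, ...) rather than to the first timestamp, which happens to coincide in this example data.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: 100 rows, one per minute (mirrors the answer's setup)
df = pd.DataFrame({
    'timestamp': pd.date_range('2011-01-01 00:00', periods=100, freq='min'),
    'other_data': np.random.randint(0, 10, size=100),
})

# Label each row with the start of its 10-minute slot
# (left-closed slots: [00:00, 00:10), [00:10, 00:20), ...)
df['slot_start'] = df['timestamp'].dt.floor('10min')

# Select a single slot, e.g. the one that starts at 01:10
slot = df[df['slot_start'] == pd.Timestamp('2011-01-01 01:10')]

# Summarise every slot at once
per_slot = df.groupby('slot_start')['other_data'].agg(['count', 'mean'])
```

Here `slot` holds the rows from 01:10 through 01:19, and `per_slot` has one row per 10-minute slot with the number of instances and their mean.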