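I am working with jittered time series, and I would like to use the pandas.DatetimeIndex.snap method to snap the timestamps to a nominal frequency. Here is the code that generates the jittered data: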
import pandas as pd
import numpy as np

start_date = '2018-01-01'
rate = 10
jitter = .05
num_rows = 100
num_cols = 3
frequency = 1 / rate
indices = pd.date_range(
    start=start_date,
    periods=num_rows,
    freq=pd.DateOffset(seconds=frequency))
jitter = frequency * jitter
deltas = pd.to_timedelta(
    np.random.uniform(-jitter, jitter, num_rows), unit='s')
indices = indices + deltas
rows = np.random.rand(num_rows, num_cols)
data = pd.DataFrame(rows, indices)
This gives me:
data =
0 1 2
2018-01-01 00:00:00.001242896 0.156529 0.366638 0.619121
2018-01-01 00:00:00.101054078 0.159395 0.968022 0.914749
2018-01-01 00:00:00.192294840 0.166950 0.121155 0.085408
2018-01-01 00:00:00.292522754 0.909444 0.193624 0.416285
2018-01-01 00:00:00.400590672 0.448851 0.599011 0.071735
2018-01-01 00:00:00.495377230 0.226759 0.931490 0.908410
I would like to do something like data.snap("10Hz"), but that does not work.
The desired output is the de-jittered data frame, i.e.:
0 1 2
2018-01-01 00:00:00.000000000 0.156529 0.366638 0.619121
2018-01-01 00:00:00.100000000 0.159395 0.968022 0.914749
2018-01-01 00:00:00.200000000 0.166950 0.121155 0.085408
2018-01-01 00:00:00.300000000 0.909444 0.193624 0.416285
2018-01-01 00:00:00.400000000 0.448851 0.599011 0.071735
2018-01-01 00:00:00.500000000 0.226759 0.931490 0.908410
Any ideas?
Thanks a lot.
Answer 0 (score: 2)
I believe you need DatetimeIndex.round with a frequency of 100ms:
data.index = data.index.round('100ms')
print(data.head(10))
0 1 2
2018-01-01 00:00:00.000 0.875417 0.786886 0.299583
2018-01-01 00:00:00.100 0.671108 0.295735 0.482092
2018-01-01 00:00:00.200 0.685071 0.795047 0.420373
2018-01-01 00:00:00.300 0.487898 0.919015 0.815932
2018-01-01 00:00:00.400 0.004191 0.085291 0.919271
2018-01-01 00:00:00.500 0.529557 0.380357 0.903027
2018-01-01 00:00:00.600 0.470609 0.225200 0.504134
2018-01-01 00:00:00.700 0.685757 0.648768 0.510639
2018-01-01 00:00:00.800 0.016022 0.301982 0.432702
2018-01-01 00:00:00.900 0.681281 0.910646 0.519735
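To make the rounding step easier to reuse, here is a minimal sketch that builds the frequency string from the sampling rate and checks that the rounded index matches the nominal grid. The names period_ms, nominal, jittered and snapped are only illustrative, and the sketch assumes the rate divides 1000 evenly and that the jitter stays below half a sampling period.

```python
import numpy as np
import pandas as pd

rate = 10                    # nominal sampling rate in Hz
period_ms = 1000 // rate     # assumption: the rate divides 1000 evenly
freq = f"{period_ms}ms"      # frequency string for date_range and round

# Build a nominal grid and add +/- 5 ms of uniform jitter to it.
rng = np.random.default_rng(0)
nominal = pd.date_range("2018-01-01", periods=100, freq=freq)
jitter_s = rng.uniform(-0.005, 0.005, len(nominal))
jittered = nominal + pd.to_timedelta(jitter_s, unit="s")

# Round every timestamp to the nearest multiple of the sampling period.
snapped = jittered.round(freq)

# The jitter is well below half a period, so rounding recovers the grid.
assert snapped.equals(nominal)
assert snapped.is_unique
```

If the jitter can exceed half a period, two neighbouring samples may round to the same timestamp, so checking is_unique on the result before assigning it back to data.index is a cheap safeguard.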
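I tried to use the function DatetimeIndex.snap, but I could not get freq to have any effect; it always returns the same output (maybe the default freq='S' cannot be changed, or it is a bug):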
import numpy as np
import pandas as pd

rate = 10
jitter = .05
num_rows = 100
num_cols = 3
start_date = '2018-01-01'
np.random.seed(123)  # fixed seed so the output is reproducible
frequency = 1 / rate
indices = pd.date_range(
    start=start_date,
    periods=num_rows,
    freq=pd.DateOffset(seconds=frequency))
jitter = frequency * jitter
deltas = pd.to_timedelta(
    np.random.uniform(-jitter, jitter, num_rows), unit='s')
indices = indices + deltas
rows = np.random.rand(num_rows, num_cols)
data = pd.DataFrame(rows, indices)
print(data.head())
print (data.index.snap(freq='S')[:10])
DatetimeIndex(['2018-01-01 00:00:00.001964', '2018-01-01 00:00:00.097861',
'2018-01-01 00:00:00.197268', '2018-01-01 00:00:00.300513',
'2018-01-01 00:00:00.402194', '2018-01-01 00:00:00.499231',
'2018-01-01 00:00:00.604807', '2018-01-01 00:00:00.701848',
'2018-01-01 00:00:00.799809', '2018-01-01 00:00:00.898921'],
dtype='datetime64[ns]', freq='100L')
print (data.index.snap(freq='100S')[:10])
DatetimeIndex(['2018-01-01 00:00:00.001964', '2018-01-01 00:00:00.097861',
'2018-01-01 00:00:00.197268', '2018-01-01 00:00:00.300513',
'2018-01-01 00:00:00.402194', '2018-01-01 00:00:00.499231',
'2018-01-01 00:00:00.604807', '2018-01-01 00:00:00.701848',
'2018-01-01 00:00:00.799809', '2018-01-01 00:00:00.898921'],
dtype='datetime64[ns]', freq='100L')
print (data.index.snap(freq='100L')[:10])
DatetimeIndex(['2018-01-01 00:00:00.001964', '2018-01-01 00:00:00.097861',
'2018-01-01 00:00:00.197268', '2018-01-01 00:00:00.300513',
'2018-01-01 00:00:00.402194', '2018-01-01 00:00:00.499231',
'2018-01-01 00:00:00.604807', '2018-01-01 00:00:00.701848',
'2018-01-01 00:00:00.799809', '2018-01-01 00:00:00.898921'],
dtype='datetime64[ns]', freq='100L')
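A possible explanation for the unchanged output, offered as my reading rather than something confirmed here: snap appears to only move timestamps that are not "on offset" for the given frequency, and fixed-width (tick) frequencies such as seconds or milliseconds seem to treat every timestamp as on offset. With an anchored frequency such as month start ('MS') the method does move the dates. The small sketch below contrasts the two cases; the variable names are illustrative only.

```python
import pandas as pd

# Tick frequency (100 ms): snap appears to leave the values untouched.
jittered = pd.DatetimeIndex(
    ["2018-01-01 00:00:00.001964", "2018-01-01 00:00:00.097861"])
unchanged = jittered.snap("100ms")

# Anchored frequency (month start): each date moves to the nearest anchor.
dates = pd.DatetimeIndex(
    ["2023-01-01", "2023-01-02", "2023-02-01", "2023-02-02"])
month_starts = dates.snap("MS")

# For tick frequencies, round does the snapping instead.
rounded = jittered.round("100ms")

print(unchanged)
print(month_starts)
print(rounded)
```

So for sub-second grids like the 10 Hz case in the question, round (or floor / ceil, if a fixed direction is wanted) looks like the right tool, while snap is aimed at calendar-style anchors.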