How do I remove jitter from the timestamps of a DatetimeIndex?

Asked: 2019-03-15 11:23:35

Tags: python pandas time-series

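I am working with a jittered time series, and I would like to use the pandas.DatetimeIndex.snap method to snap the timestamps to the nominal frequency. This is the code that generates the jittered data: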

 import pandas as pd
 import numpy as np 

 start_date='2018-01-01'
 rate = 10
 jitter=.05
 num_rows=100
 num_cols = 3

 frequency = 1 / rate
 indices = pd.date_range(
            start=start_date,
            periods=num_rows,
            freq=pd.DateOffset(seconds=frequency))
 jitter = frequency * jitter
 deltas = pd.to_timedelta(
            np.random.uniform(-jitter, jitter, num_rows), unit='s')
 indices = indices + deltas
 rows = np.random.rand(num_rows, num_cols)
 data = pd.DataFrame(rows, indices)

This gives me:

 data =
                                   0         1         2
2018-01-01 00:00:00.001242896  0.156529  0.366638  0.619121
2018-01-01 00:00:00.101054078  0.159395  0.968022  0.914749
2018-01-01 00:00:00.192294840  0.166950  0.121155  0.085408
2018-01-01 00:00:00.292522754  0.909444  0.193624  0.416285
2018-01-01 00:00:00.400590672  0.448851  0.599011  0.071735
2018-01-01 00:00:00.495377230  0.226759  0.931490  0.908410

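I would like to do something like data.snap("10Hz"), but:

  • data.index.freq is None
  • I cannot find a way to specify the frequency in the snap method

The desired output is the de-jittered DataFrame, i.e.: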

                                   0         1         2
2018-01-01 00:00:00.000000000  0.156529  0.366638  0.619121
2018-01-01 00:00:00.100000000  0.159395  0.968022  0.914749
2018-01-01 00:00:00.200000000  0.166950  0.121155  0.085408
2018-01-01 00:00:00.300000000  0.909444  0.193624  0.416285
2018-01-01 00:00:00.400000000  0.448851  0.599011  0.071735
2018-01-01 00:00:00.500000000  0.226759  0.931490  0.908410

Any ideas?

Thanks a lot.

1 answer:

Answer 0 (score: 2)

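I believe you need DatetimeIndex.round with a frequency of 100 ms:

# '100ms' rounds to the nearest 100 milliseconds; the older alias '100L' is deprecated in recent pandas
data.index = data.index.round('100ms')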
print (data.head(10))

                                0         1         2
2018-01-01 00:00:00.000  0.875417  0.786886  0.299583
2018-01-01 00:00:00.100  0.671108  0.295735  0.482092
2018-01-01 00:00:00.200  0.685071  0.795047  0.420373
2018-01-01 00:00:00.300  0.487898  0.919015  0.815932
2018-01-01 00:00:00.400  0.004191  0.085291  0.919271
2018-01-01 00:00:00.500  0.529557  0.380357  0.903027
2018-01-01 00:00:00.600  0.470609  0.225200  0.504134
2018-01-01 00:00:00.700  0.685757  0.648768  0.510639
2018-01-01 00:00:00.800  0.016022  0.301982  0.432702
2018-01-01 00:00:00.900  0.681281  0.910646  0.519735
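
To make this reusable for other sample rates, here is a minimal sketch. The helper name snap_to_rate and its rate_hz parameter are hypothetical, not part of pandas. It assumes the jitter is always smaller than half the sample period, so that rounding can never move a timestamp onto a neighbouring grid point.

```python
import numpy as np
import pandas as pd


def snap_to_rate(index, rate_hz):
    """Hypothetical helper: round timestamps to the nearest multiple of 1/rate_hz seconds."""
    period_ns = round(1e9 / rate_hz)  # sample period in whole nanoseconds
    return index.round(f"{period_ns}ns")


# Nominal 10 Hz grid: one sample every 100 ms.
nominal = pd.date_range("2018-01-01", periods=100, freq=pd.Timedelta(milliseconds=100))

# Add uniform jitter of at most +/-5 ms (5% of the period), as in the question.
rng = np.random.default_rng(0)
jitter_ns = rng.integers(-5_000_000, 5_000_001, size=len(nominal))
jittered = nominal + pd.to_timedelta(jitter_ns, unit="ns")

data = pd.DataFrame(rng.random((len(nominal), 3)), index=jittered)
data.index = snap_to_rate(data.index, rate_hz=10)

# Sanity checks: no two samples collapsed onto the same timestamp,
# and consecutive timestamps are exactly one period apart.
spacing = data.index[1:] - data.index[:-1]
print(data.index.is_unique, (spacing == pd.Timedelta(milliseconds=100)).all())
```

If the jitter can exceed half a period, two samples may round to the same timestamp, so checking data.index.is_unique after rounding is a cheap safeguard.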

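I also tried the DatetimeIndex.snap function, but I could not get freq to take effect; it always returns the same output (perhaps the default freq='S' cannot be changed, or there is a bug):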

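import numpy as np
import pandas as pd

rate = 10
jitter=.05
num_rows=100
num_cols = 3
start_date = '2018-01-01'

# fix the seed so the jittered timestamps are reproducible
np.random.seed(123)
# nominal sample period in seconds
frequency = 1 / rate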
indices = pd.date_range(
        start=start_date,
        periods=num_rows,
        freq=pd.DateOffset(seconds=frequency))
jitter = frequency * jitter
deltas = pd.to_timedelta(
        np.random.uniform(-jitter, jitter, num_rows), unit='s')
indices = indices + deltas
rows = np.random.rand(num_rows, num_cols)
data = pd.DataFrame(rows, indices)
print (data.head())

print (data.index.snap(freq='S')[:10])
DatetimeIndex(['2018-01-01 00:00:00.001964', '2018-01-01 00:00:00.097861',
               '2018-01-01 00:00:00.197268', '2018-01-01 00:00:00.300513',
               '2018-01-01 00:00:00.402194', '2018-01-01 00:00:00.499231',
               '2018-01-01 00:00:00.604807', '2018-01-01 00:00:00.701848',
               '2018-01-01 00:00:00.799809', '2018-01-01 00:00:00.898921'],
              dtype='datetime64[ns]', freq='100L')

print (data.index.snap(freq='100S')[:10])
DatetimeIndex(['2018-01-01 00:00:00.001964', '2018-01-01 00:00:00.097861',
               '2018-01-01 00:00:00.197268', '2018-01-01 00:00:00.300513',
               '2018-01-01 00:00:00.402194', '2018-01-01 00:00:00.499231',
               '2018-01-01 00:00:00.604807', '2018-01-01 00:00:00.701848',
               '2018-01-01 00:00:00.799809', '2018-01-01 00:00:00.898921'],
              dtype='datetime64[ns]', freq='100L')

print (data.index.snap(freq='100L')[:10])
DatetimeIndex(['2018-01-01 00:00:00.001964', '2018-01-01 00:00:00.097861',
               '2018-01-01 00:00:00.197268', '2018-01-01 00:00:00.300513',
               '2018-01-01 00:00:00.402194', '2018-01-01 00:00:00.499231',
               '2018-01-01 00:00:00.604807', '2018-01-01 00:00:00.701848',
               '2018-01-01 00:00:00.799809', '2018-01-01 00:00:00.898921'],
              dtype='datetime64[ns]', freq='100L')
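
A possible explanation, offered as my reading of how snap behaves rather than something confirmed above: snap only moves timestamps that are not "on offset" for the given frequency. That is meaningful for anchored offsets such as month start ('MS'), whereas for fixed-width frequencies such as '100ms' every timestamp appears to count as on offset, so nothing changes. The small sketch below contrasts the two cases; the sample timestamps are taken from the output above.

```python
import pandas as pd

# Anchored offset: dates that are not a month start are snapped to the nearest one.
dates = pd.DatetimeIndex(["2023-01-01", "2023-01-02", "2023-02-01", "2023-02-02"])
snapped_month = dates.snap("MS")

# Fixed-width frequency: snap leaves the jittered timestamps untouched,
# while round moves them onto the 100 ms grid.
jittered = pd.DatetimeIndex(["2018-01-01 00:00:00.001964",
                             "2018-01-01 00:00:00.097861"])
unchanged = jittered.snap("100ms")
rounded = jittered.round("100ms")

print(snapped_month)
print(unchanged)
print(rounded)
```

So for removing sub-second jitter, round (or floor/ceil, if a one-sided rule is wanted) looks like the appropriate tool, and snap is better suited to calendar-style offsets.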