asfreq() returns an empty DataFrame

Time: 2018-06-28 15:20:03

Tags: python pandas

I have a DataFrame with a DatetimeIndex:

import pandas as pd
from pandas.tseries.offsets import *

data = pd.read_excel('P:\\Simon\\govt_bond_yields.xlsx')
data.head()

               USA  Italy     UK  EURO ZONE  GREECE  GERMANY
2018-06-25  2.8748  2.782  1.299      0.327   4.102    0.327
2018-06-22  2.8949  2.694  1.319      0.335   4.114    0.335
2018-06-21  2.8967  2.732  1.277      0.333   4.279    0.333
2018-06-20  2.9389  2.549  1.297      0.375   4.332    0.375
2018-06-19  2.8967  2.557  1.283      0.370   4.344    0.370

At the moment my index has no frequency:

data.index

DatetimeIndex(['2018-06-25', '2018-06-22', '2018-06-21', '2018-06-20',
               '2018-06-19', '2018-06-18', '2018-06-15', '2018-06-14',
               '2018-06-13', '2018-06-12',
               ...
               '2015-01-27', '2015-01-26', '2015-01-23', '2015-01-22',
               '2015-01-21', '2015-01-20', '2015-01-16', '2015-01-15',
               '2015-01-14', '2015-01-13'],
              dtype='datetime64[ns]', length=862, freq=None)

I am trying to set the frequency of the index, but after doing so I get an empty DataFrame:

data.asfreq(freq='D')

USA Italy UK EURO ZONE  GREECE  GERMANY

What am I doing wrong here?

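3 answers:

Answer 0: (score: 5)

This should work if you sort the index first; with an unsorted (here, descending) index, asfreq has a hard time working out what you want. For example: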

# Unsorted data with a datetime index:
>>> data
               USA  Italy     UK  EURO ZONE  GREECE  GERMANY
2018-06-25  2.8748  2.782  1.299      0.327   4.102    0.327
2018-06-22  2.8949  2.694  1.319      0.335   4.114    0.335
2018-06-21  2.8967  2.732  1.277      0.333   4.279    0.333
2018-06-20  2.9389  2.549  1.297      0.375   4.332    0.375
2018-06-19  2.8967  2.557  1.283      0.370   4.344    0.370

>>> data.sort_index().asfreq(freq='D')
               USA  Italy     UK  EURO ZONE  GREECE  GERMANY
2018-06-19  2.8967  2.557  1.283      0.370   4.344    0.370
2018-06-20  2.9389  2.549  1.297      0.375   4.332    0.375
2018-06-21  2.8967  2.732  1.277      0.333   4.279    0.333
2018-06-22  2.8949  2.694  1.319      0.335   4.114    0.335
2018-06-23     NaN    NaN    NaN        NaN     NaN      NaN
2018-06-24     NaN    NaN    NaN        NaN     NaN      NaN
2018-06-25  2.8748  2.782  1.299      0.327   4.102    0.327

You can inspect the index to confirm that the frequency has been set:

# Check the index:
>>> data.sort_index().asfreq(freq='D').index
DatetimeIndex(['2018-06-19', '2018-06-20', '2018-06-21', '2018-06-22',
               '2018-06-23', '2018-06-24', '2018-06-25'],
              dtype='datetime64[ns]', freq='D')
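
For anyone who wants to try this without the Excel file, here is a minimal sketch that rebuilds a small frame from the values in the question (only two of the columns, and the variable names idx and daily are just for illustration). One plausible explanation for the empty result, which is my assumption and not something the question confirms, is that older pandas versions build the new range from the first index label to the last one, so a newest-first index produces an empty range. Newer versions may behave differently, so the sketch only relies on the sorted case.

```python
import pandas as pd

# Hypothetical stand-in for the Excel data: newest date first, as in the question.
idx = pd.to_datetime(['2018-06-25', '2018-06-22', '2018-06-21',
                      '2018-06-20', '2018-06-19'])
data = pd.DataFrame({'USA': [2.8748, 2.8949, 2.8967, 2.9389, 2.8967],
                     'UK': [1.299, 1.319, 1.277, 1.297, 1.283]},
                    index=idx)

# The index is descending, so it is not monotonically increasing.
print(data.index.is_monotonic_increasing)  # False

# Sort first, then conform to a daily frequency.
daily = data.sort_index().asfreq('D')

print(len(daily))                 # 7 calendar days from 2018-06-19 to 2018-06-25
print(daily['USA'].isna().sum())  # 2 (the weekend of 2018-06-23/24)
print(daily.index.freq)
```

The two NaN rows are the weekend days that asfreq inserts to make the index regular; no values are filled in unless you pass method or fill_value.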

Answer 1: (score: 2)

IIUC, I think what you want is resample followed by asfreq:

data.resample('D').asfreq()

Output:

               USA  Italy     UK  EURO ZONE  GREECE  GERMANY
2018-06-19  2.8967  2.557  1.283      0.370   4.344    0.370
2018-06-20  2.9389  2.549  1.297      0.375   4.332    0.375
2018-06-21  2.8967  2.732  1.277      0.333   4.279    0.333
2018-06-22  2.8949  2.694  1.319      0.335   4.114    0.335
2018-06-23     NaN    NaN    NaN        NaN     NaN      NaN
2018-06-24     NaN    NaN    NaN        NaN     NaN      NaN
2018-06-25  2.8748  2.782  1.299      0.327   4.102    0.327
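
As a rough check, the sketch below compares this approach with sorting first and then calling asfreq, using a small made-up frame with values from the question (the names via_resample and via_sort are illustrative). As far as I can tell, resample orders the data by its index while grouping, which would explain why no explicit sort_index() is needed here; treat that as my reading rather than a documented guarantee.

```python
import pandas as pd

# Hypothetical stand-in for the question's data, newest date first.
idx = pd.to_datetime(['2018-06-25', '2018-06-22', '2018-06-21',
                      '2018-06-20', '2018-06-19'])
data = pd.DataFrame({'USA': [2.8748, 2.8949, 2.8967, 2.9389, 2.8967]},
                    index=idx)

# Upsample to daily frequency without sorting by hand.
via_resample = data.resample('D').asfreq()

# The explicit route: sort, then conform to daily frequency.
via_sort = data.sort_index().asfreq('D')

# Both routes should give the same rows, with NaN on the missing days.
print(via_resample.equals(via_sort))
print(via_resample)
```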

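Answer 2: (score: 0)

This is an improvement on sacul's answer:

df2 = datatable_df.sort_index().asfreq(freq='YS')

Note: running sacul's command on its own leaves the original DataFrame untouched, so at first I thought that step was wrong. I then assigned the result to a variable, and it worked. I was able to carry on with my time-series work.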
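
To illustrate the point about assignment, here is a small sketch with invented yearly data (the frame df, its value column, and the dates are all made up for the example). Both sort_index() and asfreq() return new objects, so the original frame keeps its order and its missing frequency unless you store the result.

```python
import pandas as pd

# Made-up data on year-start dates, unsorted, with 2017 missing.
df = pd.DataFrame({'value': [3.0, 1.0, 2.0]},
                  index=pd.to_datetime(['2018-01-01', '2015-01-01', '2016-01-01']))

# Calling the chain without assigning it changes nothing in df.
df.sort_index().asfreq(freq='YS')
print(df.index.freq)  # None
print(len(df))        # 3

# Keep the result by assigning it to a new name.
df2 = df.sort_index().asfreq(freq='YS')
print(len(df2))                  # 4 (2015 through 2018)
print(df2['value'].isna().sum()) # 1 (the inserted 2017-01-01 row)
```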