monthly_dividend
1994 10 NaN
11 NaN
12 NaN
12 NaN
...
2012 4 NaN
5 NaN
6 NaN
7 1.746622
8 1.607685
9 1.613936
10 1.620187
11 1.626125
12 1.632375
2013 1 1.667792
2 1.702897
3 1.738314
4 1.773731
5 1.808835
6 1.844252
Length: 225
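My code is similar to the above. This is the result of a groupby on a DataFrame, but I would like to turn it back into a regular TimeSeries. asfreq('M') no longer works on the grouped result, so I'm not sure whether there is an easy way to convert it.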
dividends
1994-10-31 0.0750
1994-11-30 0.0750
1994-12-31 0.0750
1995-12-31 0.3450
...
2012-03-31 0.145812
2012-04-30 0.145812
2012-05-31 0.145812
2012-06-30 0.146125
2012-07-31 0.146125
2012-08-31 0.151125
2012-09-30 0.151438
2012-10-31 0.151438
2012-11-30 0.151438
2012-12-31 0.151750
2013-01-31 0.180917
2013-02-28 0.180917
2013-03-31 0.181229
2013-04-30 0.181229
2013-05-31 0.181229
Freq: M, Length: 224
Answer 0 (score: 1)
Create some sample data
In [172]: df = pd.DataFrame(np.random.randn(200,1),columns=['A'],index=pd.date_range('2000',periods=200,freq='M'))
In [173]: df['month'] = df.index.month
In [174]: df['year'] = df.index.year
In [175]: df = df.reset_index(drop=True).set_index(['year','month'])
In [176]: df
Out[176]:
<class 'pandas.core.frame.DataFrame'>
MultiIndex: 200 entries, (2000, 7) to (2017, 2)
Data columns (total 1 columns):
A 200 non-null values
dtypes: float64(1)
In [177]: df.head()
Out[177]:
A
year month
2000 7 0.084256
8 2.507213
9 -0.642151
10 1.972307
11 0.926586
This creates a PeriodIndex with monthly frequency. Note that iterating over the MultiIndex yields (year, month) tuples of integers.
In [179]: pd.PeriodIndex([ pd.Period(year=year,month=month,freq='M') for year, month in df.index ])
Out[179]:
<class 'pandas.tseries.period.PeriodIndex'>
freq: M
[2000-07, ..., 2017-02]
length: 200
Convert directly to a DatetimeIndex
In [180]: new_index = pd.PeriodIndex([ pd.Period(year=year,month=month,freq='M') for year, month in df.index ]).to_timestamp()
Out[180]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2000-07-01 00:00:00, ..., 2017-02-01 00:00:00]
Length: 200, Freq: MS, Timezone: None
At this point you can assign the new index to the frame
In [182]: df.index = new_index
In [183]: df
Out[183]:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 200 entries, 2000-07-01 00:00:00 to 2017-02-01 00:00:00
Freq: MS
Data columns (total 1 columns):
A 200 non-null values
dtypes: float64(1)
In [184]: df.head()
Out[184]:
A
2000-07-01 0.084256
2000-08-01 2.507213
2000-09-01 -0.642151
2000-10-01 1.972307
2000-11-01 0.926586
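The steps above can be condensed into a short script. This is only a sketch, assuming a Series indexed by a (year, month) MultiIndex like the one in the question; the sample values and the names s, period_index and flat are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 24 monthly values indexed by (year, month).
periods = pd.period_range("2000-07", periods=24, freq="M")
index = pd.MultiIndex.from_arrays(
    [periods.year, periods.month], names=["year", "month"]
)
s = pd.Series(np.arange(24, dtype=float), index=index, name="monthly_dividend")

# Iterating a two-level MultiIndex yields (year, month) tuples;
# build one monthly Period per tuple.
period_index = pd.PeriodIndex(
    [pd.Period(year=int(y), month=int(m), freq="M") for y, m in s.index]
)

# Replace the MultiIndex with month-start timestamps.
flat = s.copy()
flat.index = period_index.to_timestamp()
```

After this, flat has a plain DatetimeIndex, so time-series methods that need a datetime-like index should be usable on it again.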
to_timestamp normally returns the first day of the month. To get the end of the month instead, pass how='e'.
In [1]: pr = pd.period_range('200001',periods=20,freq='M')
In [2]: pr
Out[2]:
<class 'pandas.tseries.period.PeriodIndex'>
freq: M
[2000-01, ..., 2001-08]
length: 20
In [3]: pr.to_timestamp()
Out[3]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2000-01-01 00:00:00, ..., 2001-08-01 00:00:00]
Length: 20, Freq: MS, Timezone: None
In [4]: pr.to_timestamp(how='e')
Out[4]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2000-01-31 00:00:00, ..., 2001-08-31 00:00:00]
Length: 20, Freq: M, Timezone: None
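One caveat when reproducing this on a recent pandas release: how='e' there returns the last instant of each period (23:59:59.999999999) rather than midnight as in the output above. A minimal sketch, assuming you want month-end dates at midnight, is to normalize the result; the names start, end and end_dates are illustrative only.

```python
import pandas as pd

pr = pd.period_range("2000-01", periods=20, freq="M")

# First day of each month (the default, how='start').
start = pr.to_timestamp()

# End of each month; on recent pandas this is the last nanosecond of the period.
end = pr.to_timestamp(how="e")

# Drop the time-of-day component to get the last calendar day at midnight.
end_dates = end.normalize()
```

On older versions where how='e' already returns midnight, normalize() simply leaves the values unchanged.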