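I'm trying to compute monthly and weekly ranges for stock data. The code below covers only the High column; it works for the monthly case but not for the weekly one. When I try to create a new column in df from the weekly result, I get all NaN. However, if I assign the result to a variable instead of a new column, I get the correct values.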
test = df['High'].resample('w',how='max')
print test
...
2015-03-01 212.24
2015-03-08 212.06
2015-03-15 208.79
2015-03-22 211.27
2015-03-29 211.11
2015-04-05 208.61
Freq: W-SUN, Name: High, Length: 70
df['WHigh'] = df['High'].resample('w',how='max')
print df['WHigh']
...
2015-03-26 NaN
2015-03-27 NaN
2015-03-30 NaN
2015-03-31 NaN
2015-04-01 NaN
2015-04-02 NaN
Name: WHigh, Length: 336
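Answer 0 (score: 1)
The problem is that the resampled index differs from the original index, so the result cannot be assigned straight back to the original DataFrame as a column.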
In [11]: df = pd.DataFrame([1, 2, 3, 4, 5, 6], pd.date_range('2015-01-01', periods=6))
In [12]: df
Out[12]:
0
2015-01-01 1
2015-01-02 2
2015-01-03 3
2015-01-04 4
2015-01-05 5
2015-01-06 6
In [13]: df.resample('W')
Out[13]:
0
2015-01-04 2.5
2015-01-11 5.5
In [14]: df['weekly'] = df.resample('W')
In [15]: df
Out[15]:
0 weekly
2015-01-01 1 NaN
2015-01-02 2 NaN
2015-01-03 3 NaN
2015-01-04 4 2.5
2015-01-05 5 NaN
2015-01-06 6 NaN
Notice that only the day whose date matches a weekly label is filled in; everything else is NaN.
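The session above relies on an older pandas in which `resample('W')` returned the weekly mean directly. On recent versions `resample` returns a deferred object and the aggregation has to be called explicitly, but the alignment behaviour is the same. A minimal sketch, assuming a column named `value` (a name chosen here for illustration; the original frame uses the default column `0`):

```python
import pandas as pd

# Six consecutive daily values, mirroring the example above.
df = pd.DataFrame(
    {"value": [1, 2, 3, 4, 5, 6]},
    index=pd.date_range("2015-01-01", periods=6),
)

# Weekly mean, labelled by the Sunday that ends each week.
weekly = df["value"].resample("W").mean()

# Column assignment aligns on index labels, so only dates that exist
# in both indexes receive a value; every other row becomes NaN.
df["weekly"] = weekly

# Dates present in both the daily and the weekly index.
shared = df.index.intersection(weekly.index)
print(df)
print(shared)
```

Here the only shared label is the Sunday that falls inside the daily range, which is why a single row is populated.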
If you want every row in a week to carry that week's mean/max, use transform:
In [21]: df.groupby(pd.TimeGrouper('W')).transform('mean')
Out[21]:
0
2015-01-01 2
2015-01-02 2
2015-01-03 2
2015-01-04 2
2015-01-05 5
2015-01-06 5
Note: there may be a bug here; the result should be float, IMO!
In [22]: df.astype('float64').groupby(pd.TimeGrouper('W')).transform('mean')
Out[22]:
0
2015-01-01 2.5
2015-01-02 2.5
2015-01-03 2.5
2015-01-04 2.5
2015-01-05 5.5
2015-01-06 5.5
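For the original question, the same idea can be applied to the High column. The following is a sketch for recent pandas versions, where `pd.TimeGrouper` is no longer available and `pd.Grouper(freq='W')` is used in its place. The small frame and the `WMean` column are stand-ins invented for illustration, not the asker's data:

```python
import pandas as pd

# Stand-in daily data; the real frame would hold the stock's High prices.
df = pd.DataFrame(
    {"High": [1, 2, 3, 4, 5, 6]},
    index=pd.date_range("2015-01-01", periods=6),
)

# Group rows by calendar week (weeks end on Sunday by default).
by_week = df.groupby(pd.Grouper(freq="W"))["High"]

# transform() returns a Series with the same index as df,
# so it can be assigned back as a column without producing NaN.
df["WHigh"] = by_week.transform("max")
df["WMean"] = by_week.transform("mean")

print(df)
```

Because `transform` keeps the original daily index, each row receives the value of the week it belongs to.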