pandas: how to reset the index of a resampled df

Time: 2020-07-27 14:24:32

Tags: pandas

Probably a newbie question. I have daily stock prices in a DataFrame, df:

print(df.head())

It prints the following:

                 High         Low        Open       Close    Volume   Adj Close       100ma       250ma
Date                                                                                                    
2015-01-02  314.750000  306.959991  312.579987  308.519989   2783200  308.519989  308.519989  308.519989
2015-01-05  308.380005  300.850006  307.010010  302.190002   2774200  302.190002  305.354996  305.354996
2015-01-06  303.000000  292.380005  302.239990  295.290009   3519000  295.290009  302.000000  302.000000
2015-01-07  301.279999  295.329987  297.500000  298.420013   2640300  298.420013  301.105003  301.105003
2015-01-08  303.140015  296.109985  300.320007  300.459991   3088400  300.459991  300.976001  300.976001

Next, I want to resample it and turn it into weekly data for a weekly chart:

df_ohcl = df.resample('W',loffset=pd.offsets.timedelta(days=-6)).apply({
'Open': 'first', 'High': 'max', 'Low': 'min','Close': 'last', 'Volume': 'sum'})

This gives me the correct weekly values:

                  Open        High        Low       Close   Volume
Date                    
2014-12-29  312.579987  314.750000  306.959991  308.519989  2783200
2015-01-05  307.010010  308.380005  292.380005  296.929993  14614300
2015-01-12  297.559998  301.500000  285.250000  290.739990  20993900
2015-01-19  292.589996  316.929993  286.390015  312.390015  22999200
2015-01-26  311.820007  359.500000  299.329987  354.529999  41666500

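I now want to move this data over to matplotlib and convert the dates to their mdates form. Since I am only going to plot the columns in matplotlib, I don't actually want the date to be the index any more, so I tried: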

df_ohlc.reset_index(inplace=True)

and got this error:

ValueError                                Traceback (most recent call last)
<ipython-input-149-6c0c324e68a8> in <module>
      5 '''
      6 
----> 7 df_ohlc.reset_index(inplace=True)
      8 
      9 df_ohcl.head()

~\anaconda3\lib\site-packages\pandas\core\frame.py in reset_index(self, level, drop, inplace, col_level, col_fill)
   4602                 # to ndarray and maybe infer different dtype
   4603                 level_values = _maybe_casted_values(lev, lab)
-> 4604                 new_obj.insert(0, name, level_values)
   4605 
   4606         new_obj.index = new_index

~\anaconda3\lib\site-packages\pandas\core\frame.py in insert(self, loc, column, value, allow_duplicates)
   3494         self._ensure_valid_index(value)
   3495         value = self._sanitize_column(column, value, broadcast=False)
-> 3496         self._data.insert(loc, column, value, allow_duplicates=allow_duplicates)
   3497 
   3498     def assign(self, **kwargs) -> "DataFrame":

~\anaconda3\lib\site-packages\pandas\core\internals\managers.py in insert(self, loc, item, value, allow_duplicates)
   1171         if not allow_duplicates and item in self.items:
   1172             # Should this be a different kind of error??
-> 1173             raise ValueError(f"cannot insert {item}, already exists")
   1174 
   1175         if not isinstance(loc, int):

ValueError: cannot insert ('level_0', ''), already exists

How can I fix this so that Date becomes just another column?

Thanks in advance for your help!

2 Answers:

Answer 0: (score: 0)

Create a column that holds the date information:

df['Date'] = df.index

Then replace the index with a range as long as the DataFrame:

df.index = range(len(df))
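
For comparison, `reset_index()` gives the same result in one call when the frame is in a clean state. The sketch below uses made-up prices (not the asker's data) and the hypothetical names `daily`, `weekly` and `flat`. It shifts the weekly labels by subtracting a `pd.Timedelta`, because `loffset` was deprecated in pandas 1.1 and removed in 2.0. One possible reading of the traceback, offered as a guess rather than a diagnosis: the question assigns the resampled result to `df_ohcl` but calls `reset_index` on `df_ohlc`. If so, `df_ohlc` may be an older object from an earlier cell, with MultiIndex columns and columns left over from previous resets. That would explain the tuple name `('level_0', '')`.

```python
import pandas as pd

# Made-up daily prices standing in for the question's df.
dates = pd.to_datetime(
    ["2015-01-02", "2015-01-05", "2015-01-06", "2015-01-07", "2015-01-08"]
)
daily = pd.DataFrame(
    {
        "Open": [10.0, 11.0, 12.0, 13.0, 14.0],
        "High": [10.5, 11.5, 12.5, 13.5, 14.5],
        "Low": [9.5, 10.5, 11.5, 12.5, 13.5],
        "Close": [10.2, 11.2, 12.2, 13.2, 14.2],
        "Volume": [100, 200, 300, 400, 500],
    },
    index=pd.DatetimeIndex(dates, name="Date"),
)

# Weekly OHLCV; 'W' labels each bin with the Sunday that ends the week.
weekly = daily.resample("W").agg(
    {"Open": "first", "High": "max", "Low": "min", "Close": "last", "Volume": "sum"}
)

# Move the labels back six days to the Monday that starts the week
# (this replaces the loffset argument used in the question).
weekly.index = weekly.index - pd.Timedelta(days=6)

# Date becomes an ordinary column and the index becomes 0..n-1.
flat = weekly.reset_index()
print(flat)
```

Calling `reset_index()` without `inplace=True` and binding the result to a new name also makes it harder to reset the same object repeatedly by accident when a notebook cell is re-run.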

Answer 1: (score: 0)

It may be convenient to keep the date column as the index when plotting with Matplotlib. Here is an example.

First, import the packages and recreate the weekly DataFrame:

from io import StringIO
import pandas as pd

data = '''Date Open High Low Close Volume
2014-12-29  312.579987  314.750000  306.959991  308.519989  2783200
2015-01-05  307.010010  308.380005  292.380005  296.929993  14614300
2015-01-12  297.559998  301.500000  285.250000  290.739990  20993900
2015-01-19  292.589996  316.929993  286.390015  312.390015  22999200
2015-01-26  311.820007  359.500000  299.329987  354.529999  41666500
'''

weekly = (pd.read_csv(StringIO(data), sep=' +', engine='python')
            .assign(Date = lambda x: pd.to_datetime(x['Date'], 
                                                    format='%Y-%m-%d', 
                                                    errors='coerce'))
            .set_index('Date'))

                  Open        High         Low       Close    Volume
Date                                                                
2014-12-29  312.579987  314.750000  306.959991  308.519989   2783200
2015-01-05  307.010010  308.380005  292.380005  296.929993  14614300
2015-01-12  297.559998  301.500000  285.250000  290.739990  20993900
2015-01-19  292.589996  316.929993  286.390015  312.390015  22999200
2015-01-26  311.820007  359.500000  299.329987  354.529999  41666500

Next, create the plot. The index (dates, one per week) becomes the x-axis.

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(12, 9))

for field in ['Open', 'High', 'Low', 'Close']:
    ax.plot(weekly[field], label=field)
    
ax.legend()
plt.show()
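
If the dates are still needed as Matplotlib date numbers (the "mdates version" mentioned in the question), a minimal sketch is to move the index into a column and pass it through `matplotlib.dates.date2num`. The frame below is a two-row stand-in with illustrative values, and the names `weekly_demo`, `flat_demo` and `DateNum` are assumptions made for this example only.

```python
import matplotlib.dates as mdates
import pandas as pd

# Two-row stand-in for the weekly frame, indexed by Date (illustrative values).
weekly_demo = pd.DataFrame(
    {"Close": [308.52, 296.93]},
    index=pd.DatetimeIndex(["2014-12-29", "2015-01-05"], name="Date"),
)

# Turn the Date index into a regular column.
flat_demo = weekly_demo.reset_index()

# Convert the datetime64 values to Matplotlib's float day numbers.
flat_demo["DateNum"] = mdates.date2num(flat_demo["Date"].to_numpy())
print(flat_demo)
```

The absolute numbers depend on Matplotlib's date epoch setting, so only the differences between them (7.0 days between these two rows) are stable across versions.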