Wrong x-axis labels when plotting a multi-index DataFrame with Matplotlib

Date: 2015-11-13 10:18:03

Tags: python pandas matplotlib

I have a time-series DataFrame, and I computed a season column from its datetime column. I then indexed the DataFrame by 'Year' and 'Season' and want to plot the result. Unfortunately, the code below raises an error when the x-axis labels are drawn:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

dates = pd.date_range('20070101',periods=1000)
df = pd.DataFrame(np.random.randn(1000), columns=list('A'))
df['date'] = dates

def get_season(row):
    if row['date'].month >= 3 and row['date'].month <= 5:
        return 'spring'
    elif row['date'].month >= 6 and row['date'].month <= 8:
        return 'summer'
    elif row['date'].month >= 9 and row['date'].month <= 11:
        return 'autumn'
    else:
        return 'winter'

df['Season'] = df.apply(get_season, axis=1)
df['Year'] = df['date'].dt.year
df.loc[df['date'].dt.month == 12, 'Year'] += 1
df = df.set_index(['Year', 'Season'], inplace=False)

df.head()

fig,ax = plt.subplots()
df.plot(x_compat=True,ax=ax)

ax.xaxis.set_tick_params(reset=True)
ax.xaxis.set_major_locator(mdates.YearLocator(1))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))

plt.show()

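I would like only the years to appear as x-axis labels, not both the year and the season.

I'm sure this is something simple that I'm doing wrong, but I can't figure out what...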
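As a side note on the season column described above, the row-wise apply can be replaced by a vectorized month-to-season lookup. This is only a sketch under the same month boundaries as get_season (December through February counted as winter, with December assigned to the following year); the name season_names is hypothetical and not part of the original code.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20070101", periods=1000)
df = pd.DataFrame({"A": np.random.randn(1000), "date": dates})

# Lookup table indexed by (month % 12) // 3:
# Dec, Jan, Feb -> 0; Mar-May -> 1; Jun-Aug -> 2; Sep-Nov -> 3
season_names = np.array(["winter", "spring", "summer", "autumn"])

month = df["date"].dt.month
df["Season"] = season_names[(month.to_numpy() % 12) // 3]

# December belongs to the winter of the following year
df["Year"] = df["date"].dt.year + (month == 12).astype(int)
```

The resulting Season and Year columns can then be used for set_index or groupby exactly as in the code above.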

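Edit

Changing the df.plot call draws the dates somewhat better, but the tick labels are still months. I would prefer years only, although this is slightly better than before.

The error traceback: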

File "C:\Users\myname\AppData\Local\Continuum\Anaconda\lib\site-packages\matplotlib\dates.py", line 225, in _from_ordinalf
dt = datetime.datetime.fromordinal(ix)

ValueError: ordinal must be >= 1
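The last frame of the traceback can be reproduced with the standard library alone. A plausible reading, stated here as an assumption rather than a confirmed diagnosis, is that the date formatter is being asked to convert x positions that are not dates (for example 0 for the first row of a non-date index) into datetimes. The helper try_fromordinal below is hypothetical and only illustrates that conversion.

```python
import datetime


def try_fromordinal(ix):
    """Return the datetime for a proleptic Gregorian ordinal, or the error text."""
    try:
        return datetime.datetime.fromordinal(ix)
    except ValueError as exc:
        return str(exc)


# An x position of 0 is not a valid ordinal, so the conversion fails
result_zero = try_fromordinal(0)

# A genuine date ordinal converts without error
valid_ordinal = datetime.date(2007, 1, 1).toordinal()
result_valid = try_fromordinal(valid_ordinal)
```

If this reading is right, the date locator and formatter are only meaningful when the values on the x-axis are actual dates.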

1 Answer:

Answer 0: (score: 1)

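Unfortunately, pandas and the matplotlib time locators/formatters never combine entirely satisfactorily. The most consistent approach is to put the datetime data in a numpy array of datetime objects and plot it directly in matplotlib. pandas provides a convenient method for this:

.to_pydatetime()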
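A minimal sketch of the approach described above, not the answerer's original code: it assumes the dates are kept as a DatetimeIndex (rather than the Year/Season multi-index), converts them with to_pydatetime(), and plots them with ax.plot so that the year locator and formatter operate on real dates. The Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

dates = pd.date_range("20070101", periods=1000)
df = pd.DataFrame({"A": np.random.randn(1000)}, index=dates)

# numpy array of datetime.datetime objects, which matplotlib handles natively
x = df.index.to_pydatetime()

fig, ax = plt.subplots()
ax.plot(x, df["A"].to_numpy())

# One major tick per year, labelled with the four-digit year only
ax.xaxis.set_major_locator(mdates.YearLocator(1))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))

fig.canvas.draw()
tick_labels = [t.get_text() for t in ax.get_xticklabels()]
```

The Year and Season values can still be kept as ordinary columns for grouping; only the x values passed to matplotlib need to be dates.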
