Final plot in a series of matplotlib subplots has increased y-tick label padding

Asked: 2017-06-14 11:33:49

Tags: python python-3.x matplotlib

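I have a Pandas dataframe with columns representing the year, the month within the year, and a binary outcome (0/1). I want to draw one bar chart per year, stacked in a single column. I used the subplots() function in matplotlib.pyplot with sharex = True and sharey = True. The charts look fine, except that the padding between the y-ticks and the y-tick labels is different on the final (bottom) chart.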

An example dataframe can be created as follows:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Generate dataframe containing dates over several years and a random binary outcome
tempDF = pd.DataFrame()
tempDF['date'] = pd.date_range(start = pd.to_datetime('2014-01-01'),end = pd.to_datetime('2017-12-31'))
tempDF['case'] = np.random.choice([0,1],size = [len(tempDF.index)],p = [0.9,0.1])

# Create a dataframe that summarises proportion of cases per calendar month
tempGroupbyCalendarMonthDF = tempDF.groupby([tempDF['date'].dt.year,tempDF['date'].dt.month]).agg({'case': sum,
                                                                                                   'date': 'count'})
tempGroupbyCalendarMonthDF.index.names = ['year','month']
tempGroupbyCalendarMonthDF = tempGroupbyCalendarMonthDF.reset_index()

# Rename columns to something more meaningful
tempGroupbyCalendarMonthDF = tempGroupbyCalendarMonthDF.rename(columns = {'case': 'numberCases',
                                                                          'date': 'monthlyTotal'})

# Calculate percentage positive cases per month
tempGroupbyCalendarMonthDF['percentCases'] = (tempGroupbyCalendarMonthDF['numberCases']/tempGroupbyCalendarMonthDF['monthlyTotal'])*100

The final dataframe looks like this:

    year  month  monthlyTotal  numberCases  percentCases
0   2014      1            31            5     16.129032
1   2014      2            28            5     17.857143
2   2014      3            31            3      9.677419
3   2014      4            30            1      3.333333
4   2014      5            31            4     12.903226
..   ...    ...           ...          ...           ...
43  2017      8            31            2      6.451613
44  2017      9            30            2      6.666667
45  2017     10            31            3      9.677419
46  2017     11            30            2      6.666667
47  2017     12            31            1      3.225806
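Because the outcome is binary, the percentage of positive cases per month is just the monthly mean of the case column multiplied by 100. The following is a minimal sketch of the same summary built that way, not the code used above. The fixed random seed and the names df, grouped and summary are assumptions introduced for illustration.

```python
import numpy as np
import pandas as pd

# Assumption: a fixed seed so the sketch gives repeatable numbers
rng = np.random.RandomState(0)

# One row per day over four years, with a random binary outcome
df = pd.DataFrame({'date': pd.date_range('2014-01-01', '2017-12-31')})
df['case'] = rng.choice([0, 1], size=len(df), p=[0.9, 0.1])

# Group the binary outcome by calendar year and month
grouped = df.groupby([df['date'].dt.year.rename('year'),
                      df['date'].dt.month.rename('month')])['case']

# For a 0/1 column, mean * 100 equals (sum / count) * 100
summary = pd.DataFrame({'numberCases': grouped.sum(),
                        'monthlyTotal': grouped.count(),
                        'percentCases': grouped.mean() * 100}).reset_index()

print(summary.head())
```

The resulting columns match those of tempGroupbyCalendarMonthDF, so either frame could feed the plotting code.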

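The plot is then produced as shown below. The subplots() function returns an array of axes. The code steps through each axis and plots the values. The x-axis ticks and labels are shown only on the final (bottom) plot. Finally, to get a common y-axis label, an additional subplot is added that covers all the bar charts, with all of its axes and labels hidden except for the y-axis label.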

# Calculate minimum and maximum years in dataset
minYear = tempDF['date'].min().year
maxYear = tempDF['date'].max().year

# Set a few parameters
barWidth = 0.80
labelPositionX = 0.872
labelPositionY = 0.60

numberSubplots = maxYear - minYear + 1

fig, axArr = plt.subplots(numberSubplots,figsize = [8,10],sharex = True,sharey = True)

# Keep track of which year to plot, starting with first year in dataset.
currYear = minYear

# Step through each subplot
for ax in axArr:

    # Plot the data
    rects = ax.bar(tempGroupbyCalendarMonthDF.loc[tempGroupbyCalendarMonthDF['year'] == currYear,'month'],
                   tempGroupbyCalendarMonthDF.loc[tempGroupbyCalendarMonthDF['year'] == currYear,'percentCases'],
                   width = barWidth)

    # Format the axes
    ax.set_xlim([0.8,13])
    ax.set_ylim([0,40])

    ax.grid(True)

    ax.tick_params(axis = 'both',
                   left = 'on',
                   bottom = 'off',
                   top = 'off',
                   right = 'off',
                   direction = 'out',
                   length = 4,
                   width = 2,
                   labelsize = 14)

    # Turn on the x-axis ticks and labels for final plot only
    if currYear == maxYear:
        ax.tick_params(bottom = 'on')
        xtickPositions = [1,2,3,4,5,6,7,8,9,10,11,12]
        ax.set_xticks([x + barWidth/2 for x in xtickPositions])
        ax.set_xticklabels(['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec'])

    ytickPositions = [10,20,30]
    ax.set_yticks(ytickPositions)

    # Add label (in this case, the year) to each subplot
    # (The transform = ax.transAxes makes positions relative to the current axes.)
    ax.text(labelPositionX, labelPositionY,currYear,
            horizontalalignment = 'center',
            verticalalignment = 'center',
            transform = ax.transAxes,
            family = 'sans-serif',
            fontweight = 'bold',
            fontsize = 18,
            color = 'gray')

    # Onto the next year...
    currYear = currYear + 1

# Fine-tune overall figure
# ========================
# Make subplots close to each other.
fig.subplots_adjust(hspace=0)

# To display a common y-axis label, create a large axis that covers all the subplots
fig.add_subplot(111, frameon=False)

# Hide ticks and tick labels of the big axes
plt.tick_params(labelcolor='none', top='off', bottom='off', left='off', right='off')

# Add y label that spans several subplots
plt.ylabel("Percentage cases (%)", fontsize = 16, labelpad = 20)

plt.show()

Example plot produced by the code, showing the problem with y-tick labels on the final plot
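The common y-axis label described above can also be placed without the extra overlay subplot. The following is a hedged sketch of that variation using fig.text(). It is offered as an alternative layout, not as a confirmed fix for the padding difference. The position 0.04, 0.5 and the name commonLabel are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

fig, axArr = plt.subplots(4, figsize=[8, 10], sharex=True, sharey=True)

for ax in axArr:
    ax.set_ylim([0, 40])
    ax.set_yticks([10, 20, 30])

# Make subplots close to each other, as in the original code
fig.subplots_adjust(hspace=0)

# Place one rotated label in figure coordinates instead of adding a big axes
# (0.04, 0.5 is an illustrative position: near the left edge, vertically centred)
commonLabel = fig.text(0.04, 0.5, "Percentage cases (%)",
                       va='center', ha='center',
                       rotation='vertical', fontsize=16)
```

With this approach no additional axes sits on top of the bar charts, so only the axes returned by subplots() contribute ticks and tick labels.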

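The resulting figure is almost exactly what I want, but the y-axis tick labels on the bottom chart sit further from the axis than they do on all the other charts. If the number of plots produced is changed (by using a wider range of dates), the same pattern appears, i.e. only the final plot looks different.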

I'm almost certainly not seeing the wood for the trees, but can anyone spot what I've done wrong?
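To put a number on the difference described above, the gap between each subplot's first y-tick label and its left edge can be measured after drawing. The following is a diagnostic sketch on a stripped-down version of the figure, not an explanation of the cause. The helper measure_label_gaps is hypothetical and introduced here only for illustration, and the result may depend on the Matplotlib version in use.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt


def measure_label_gaps(fig, axes):
    """Return, per axes, the pixel gap between the first y-tick label and the axes' left edge.

    Hypothetical helper for diagnosis only; not part of the original code.
    """
    fig.canvas.draw()  # lay out ticks and labels before measuring
    renderer = fig.canvas.get_renderer()
    gaps = []
    for ax in axes:
        tick = ax.yaxis.get_major_ticks()[0]
        labelBox = tick.label1.get_window_extent(renderer)
        # ax.bbox is the axes rectangle in display (pixel) coordinates
        gaps.append(ax.bbox.x0 - labelBox.x1)
    return gaps


# Stripped-down figure: same tick settings as the question, no data
fig, axArr = plt.subplots(4, figsize=[8, 10], sharex=True, sharey=True)
for ax in axArr:
    ax.set_ylim([0, 40])
    ax.set_yticks([10, 20, 30])
    ax.tick_params(axis='both', direction='out', length=4, width=2, labelsize=14)

# Second tick_params call on the bottom axes only, mirroring the question
axArr[-1].tick_params(bottom=True)

gaps = measure_label_gaps(fig, axArr)
for i, gap in enumerate(gaps):
    print("subplot %d: label gap = %.2f px" % (i, gap))
```

If the last value differs from the others, adding the remaining steps of the original code back one at a time should show which call introduces the change.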

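Edit: The code above was originally run on Matplotlib 1.4.3 (see ImportanceOfBeingErnest's comment). However, after updating to Matplotlib 2.0.2, the code fails to run (KeyError: 0). The reason appears to be that the default in Matplotlib 2.x is for bars to be center-aligned. To get the code above to run, either adjust the x-axis range and tick positions so that the bars do not extend past the y-axis, or set align = 'edge' in the plotting function, i.e.: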

rects = ax.bar(tempGroupbyCalendarMonthDF.loc[tempGroupbyCalendarMonthDF['year'] == currYear,'month'],
               tempGroupbyCalendarMonthDF.loc[tempGroupbyCalendarMonthDF['year'] == currYear,'percentCases'],
               width = barWidth,
               align = 'edge')
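The other option mentioned in the edit, keeping center-aligned bars and adjusting the x-axis range and tick positions instead, might look like the sketch below. The values list is placeholder data, and the limits 0.5 to 12.5 are an assumption chosen so that bars of width 0.80 centred on months 1 to 12 stay inside the axes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt

barWidth = 0.80
months = list(range(1, 13))
# Placeholder percentages for illustration only, not real data
values = [16, 18, 10, 3, 13, 7, 10, 6, 7, 10, 7, 3]

fig, ax = plt.subplots()

# Bars are centred on the month numbers
rects = ax.bar(months, values, width=barWidth, align='center')

# Leave half a unit either side so no bar crosses the y-axis
ax.set_xlim([0.5, 12.5])

# With centred bars the ticks go at the month numbers themselves,
# not at x + barWidth/2 as in the edge-aligned version
ax.set_xticks(months)
ax.set_xticklabels(['Jan','Feb','Mar','Apr','May','Jun',
                    'Jul','Aug','Sep','Oct','Nov','Dec'])
```

Compared with align = 'edge', this keeps each tick under the middle of its bar without offsetting the tick positions by barWidth/2.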

0 answers:

No answers