Trouble plotting a pandas Series

Date: 2015-11-10 15:11:00

Tags: python pandas matplotlib

I have a pandas Series that I want to plot by month, like this: [image]

Here is the code:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Import text file which holds ImageStack's path
df = pd.read_csv('file.txt', header=None)
df.columns = ['paths']

# Columns 2=Year, 3=Month, 4=Day
df = df.ix[:, 1:4].astype(int)

df.columns = ['year', 'month', 'day']

count = df.groupby(['year', 'month']).count()
count.columns = ['count']

months = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')

nmonths = len(months)
fig, ax = plt.subplots()
ind = np.arange(nmonths)
width = 0.45

nyears = len(count.index.levels[0])
p = []
for year in count.index.levels[0]:
    df_empty = pd.DataFrame({'months': ind})
    monthly_sum = df_empty.join(count.ix[year, :].groupby(level=0)
                                .sum())['count'].values
    counts_until = df_empty.join(count.ix[:year, :].groupby(level=1)
                                 .sum())['count'].values
    p.append(ax.bar(ind, monthly_sum, width,
                    bottom=counts_until, alpha=0.8))

# Set x axis ticks and labels
ax.set_xticks(ind + width/2)
ax.set_xticklabels([months[i] for i in ind])

# Locate legend outside axes plot area
box = ax.get_position()
ax.set_position([box.x0, box.y0, box.width * 0.8, box.height])
ax.legend([pl[0] for pl in p], count.index.levels[0], loc='center left',
          bbox_to_anchor=(1, 0.5))

plt.show()

Data

      year  month  day
0     2007      5    6
1     2007      5    6
2     2007      5    8
3     2007      5    8
4     2007      5   12
5     2007      5   15
6     2007      5   16
7     2007      5   19
8     2007      5   21
9     2007      5   21
10    2007      5   21
11    2007      5   24

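But it doesn't give me a correct stacked bar chart! I think the problem may come from counts_until. Can anyone find a solution to this problem?

1 answer:

Answer 0: (score: 3)

If your data is in a suitable format, you can use the plot(kind='bar') method directly. With a dummy example: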

In [81]: s = pd.Series(np.random.randint(2, size=2000), pd.date_range('2010-01-01', periods=2000))

In [82]: counts = s.groupby([s.index.year, s.index.month]).sum()

In [83]: counts
Out[83]:
2010  1     10
      2     22
      3     19
      4     16
      5     16
      6     16
      7     19
      8     14
...
2015  1     16
      2     16
      3     17
      4     15
      5     15
      6     11
dtype: int32
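
The same grouping can be applied to a frame shaped like the one in the question, with year, month and day columns. The following is a minimal sketch: the sample rows are made up for illustration, and it uses size() to count rows per group, which is an assumption about what the question wants to count.

```python
import pandas as pd

# Hypothetical sample rows in the same year/month/day layout as the question's data
df = pd.DataFrame({
    'year':  [2007, 2007, 2007, 2008, 2008, 2009],
    'month': [5, 5, 6, 5, 6, 6],
    'day':   [6, 8, 12, 15, 16, 19],
})

# One value per (year, month) pair; size() counts the rows in each group
counts = df.groupby(['year', 'month']).size()
print(counts)
```

The result is a Series with a (year, month) MultiIndex, the same shape as counts in the dummy example.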

If you make sure the years end up as separate columns (for example with unstack), you can pass stacked=True to indicate that the columns should be stacked:

In [86]: counts.unstack(level=0).plot(kind='bar', stacked=True)
Out[86]: <matplotlib.axes._subplots.AxesSubplot at 0x10bfb4a8>

This gives:

[image]
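
Put together as a standalone script, the approach might look like the sketch below. It reuses the dummy data from the answer. The fixed random seed, the Agg backend, the month-name tick labels, the legend placement and the output file name stacked_counts.png are all assumptions added for illustration, not part of the original answer.

```python
import calendar

import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Dummy data as in the answer: 2000 daily values of 0 or 1 starting 2010-01-01
rng = np.random.RandomState(0)  # fixed seed (assumption) for repeatable output
s = pd.Series(rng.randint(2, size=2000),
              index=pd.date_range('2010-01-01', periods=2000))

# Sum per (year, month)
counts = s.groupby([s.index.year, s.index.month]).sum()

# Rows are months 1..12, columns are years; months with no data become 0
table = counts.unstack(level=0).reindex(range(1, 13)).fillna(0)

# Each column (year) becomes one layer of the stacked bars
ax = table.plot(kind='bar', stacked=True)
ax.set_xticklabels([calendar.month_abbr[int(m)] for m in table.index], rotation=0)
ax.legend(title='year', loc='center left', bbox_to_anchor=(1, 0.5))

# Hypothetical output file name; use plt.show() with an interactive backend instead
plt.savefig('stacked_counts.png', bbox_inches='tight')
```

Because pandas computes the bottom of each layer itself, there is no need to track a running total like counts_until by hand.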