Pandas and matplotlib stacked bar chart with major and minor x-ticks grouped together

Time: 2019-02-13 12:43:05

Tags: python pandas matplotlib

I have the following data:

id, approach, outcome
a1, approach1, outcome1
a1, approach1, outcome2
a1, approach1, outcome2
a1, approach1, outcome2
a1, approach1, outcome2
a1, approach2, outcome1
a1, approach2, outcome1
a1, approach2, outcome1
a1, approach2, outcome1
a1, approach2, outcome1
a1, approach3, outcome1
a1, approach3, outcome1
a1, approach3, outcome1
a1, approach3, outcome1
a1, approach3, outcome1
a2, approach1, outcome2
a2, approach1, outcome1
a2, approach1, outcome1
a2, approach1, outcome2
a2, approach1, outcome1
a2, approach2, outcome1
a2, approach2, outcome1
a2, approach2, outcome2
a2, approach2, outcome1
a2, approach2, outcome2
a2, approach3, outcome2
a2, approach3, outcome2
a2, approach3, outcome1
a2, approach3, outcome2
a2, approach3, outcome1

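I found the following chart from another user, which is exactly what I am trying to achieve: [image]

However, instead of fruits I have ids, and instead of years I have approaches.

This is what I have done so far: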

import pandas
import matplotlib.pyplot as plt

df = pandas.read_csv("test.txt", sep=r',\s+', engine = "python")
fig, ax = plt.subplots(1, 1, figsize=(5.5, 4))

data = df[df.approach == "approach1"].groupby(["id", "outcome"], sort=False)["outcome"].count().unstack(level=1)
data.plot.bar(width=0.5, position=0.6, color=["g", "r"], stacked=True, ax=ax)

data = df[df.approach == "approach2"].groupby(["id", "outcome"], sort=False)["outcome"].count().unstack(level=1)
data.plot.bar(width=0.5, position=-0.6, color=["g", "r"], stacked=True, ax=ax)

# "Activate" minor ticks
ax.minorticks_on()

rects_locs = []
p = 0
for patch in ax.patches:
    rects_locs.append(patch.get_x() + patch.get_width())
    # p += 0.01

# Set minor ticks there
ax.set_xticks(rects_locs, minor = True)

# Labels for the rectangles
new_ticks = ["Approach1"] * 10 + ["Approach2"] * 10

# Set the labels
from matplotlib import ticker
ax.xaxis.set_minor_formatter(ticker.FixedFormatter(new_ticks))  #add the custom ticks

# Move the category label further from x-axis
ax.tick_params(axis='x', which='major', pad=15)

# Remove minor ticks where not necessary
ax.tick_params(axis='x', which='both', top=False)
ax.tick_params(axis='y', which='both', left=False, right=False)
plt.xticks(rotation=0)

But the output is not very good: [image]

So basically I want the id as the major x-tick (so there should be 2 such x values), and then for each id there should be 3 grouped stacked bars (approach1, approach2, approach3).

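1 answer:

Answer 0 (score: 1)

Well, I'm not proud of this. But it works. Hopefully someone more knowledgeable will provide a better solution.

I first set up your data: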

import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
import numpy as np
import pandas as pd

data = np.array([
'id', 'approach', 'outcome',
'a1', 'approach1', 'outcome1',
'a1', 'approach1', 'outcome2',
'a1', 'approach1', 'outcome2',
'a1', 'approach1', 'outcome2',
'a1', 'approach1', 'outcome2',
'a1', 'approach2', 'outcome1',
'a1', 'approach2', 'outcome1',
'a1', 'approach2', 'outcome1',
'a1', 'approach2', 'outcome1',
'a1', 'approach2', 'outcome1',
'a1', 'approach3', 'outcome1',
'a1', 'approach3', 'outcome1',
'a1', 'approach3', 'outcome1',
'a1', 'approach3', 'outcome1',
'a1', 'approach3', 'outcome1',
'a2', 'approach1', 'outcome2',
'a2', 'approach1', 'outcome1',
'a2', 'approach1', 'outcome1',
'a2', 'approach1', 'outcome2',
'a2', 'approach1', 'outcome1',
'a2', 'approach2', 'outcome1',
'a2', 'approach2', 'outcome1',
'a2', 'approach2', 'outcome2',
'a2', 'approach2', 'outcome1',
'a2', 'approach2', 'outcome2',
'a2', 'approach3', 'outcome2',
'a2', 'approach3', 'outcome2',
'a2', 'approach3', 'outcome1',
'a2', 'approach3', 'outcome2',
'a2', 'approach3', 'outcome1'])

data = data.reshape(data.size // 3, 3)

df = pd.DataFrame(data[1:], columns=data[0])

Next, I count all "outcome1" and "outcome2" occurrences for each approach and id. (I'm sure this can be done directly in pandas, but I'm a bit of a pandas novice):

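# Nested counts: counts[id][approach][outcome] -> number of matching rows.
# (Named `counts` and `id_` to avoid shadowing the built-ins `dict` and `id`.)
counts = {}

for id_ in 'a1', 'a2':
    counts[id_] = {}
    for approach in 'approach1', 'approach2', 'approach3':
        counts[id_][approach] = {}
        for outcome in 'outcome1', 'outcome2':
            counts[id_][approach][outcome] = ((df['id'] == id_)
                                              & (df['approach'] == approach)
                                              & (df['outcome'] == outcome)).sum()

plot_data = pd.DataFrame(counts)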
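
The same counts can probably be obtained without explicit loops. The following is a minimal sketch, assuming a DataFrame with the same `id`, `approach` and `outcome` columns; the rows are a small stand-in, not the full data from the question, and the name `counts` is only illustrative.

```python
import pandas as pd

# Small stand-in for the question's data (same column names, fewer rows).
df = pd.DataFrame({
    "id":       ["a1", "a1", "a1", "a2", "a2", "a2"],
    "approach": ["approach1", "approach1", "approach2",
                 "approach1", "approach2", "approach2"],
    "outcome":  ["outcome1", "outcome2", "outcome1",
                 "outcome1", "outcome1", "outcome2"],
})

# Count rows per (id, approach, outcome), then pivot the outcome level
# into columns; combinations that never occur are filled with 0.
counts = (
    df.groupby(["id", "approach", "outcome"])
      .size()
      .unstack("outcome", fill_value=0)
)
print(counts)
```

The result is indexed by (`id`, `approach`) with one column per outcome, so a single count can be read with, for example, `counts.loc[("a1", "approach1"), "outcome2"]`.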

Now all that's left is the plotting.

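fig, ax = plt.subplots(1, 1)

i = 0
for id_ in 'a1', 'a2':
    for approach in 'approach1', 'approach2', 'approach3':
        # Stack outcome2 (red) on top of outcome1 (green) at x position i.
        ax.bar(i, plot_data[id_][approach]["outcome1"], color='g')
        ax.bar(i, plot_data[id_][approach]["outcome2"],
               bottom=plot_data[id_][approach]["outcome1"], color='r')
        i += 1
    i += 1  # leave an empty slot between the two ids

# Pin the ticks to the bar positions so the labels do not depend on
# matplotlib's automatically chosen tick locations.
ax.set_xticks([0, 1, 2, 4, 5, 6])
ax.set_xticklabels(['approach1', 'approach2', 'approach3',
                    'approach1', 'approach2', 'approach3'], rotation=45)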

custom_lines = [Line2D([0], [0], color='g', lw=4),
                Line2D([0], [0], color='r', lw=4)]

ax.legend(custom_lines, ['Outcome 1', 'Outcome 2'])

[image: the resulting plot]
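
The plot above labels each bar with its approach but does not show the ids. One possible way to get both levels is sketched below: minor ticks carry the approach labels and major ticks carry the id labels. This is an illustration rather than a definitive solution. It assumes the counts are already in a DataFrame indexed by (`id`, `approach`), here typed in by hand from the question's data, and it assumes matplotlib 3.1 or later for `remove_overlapping_locs`. The `pad=60` value is a guess that may need tuning for other fonts or label lengths.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Counts per (id, approach), taken from the question's data.
counts = pd.DataFrame(
    {"outcome1": [1, 5, 5, 3, 3, 2], "outcome2": [4, 0, 0, 2, 2, 3]},
    index=pd.MultiIndex.from_product(
        [["a1", "a2"], ["approach1", "approach2", "approach3"]],
        names=["id", "approach"],
    ),
)

# One x position per bar, with an empty slot between ids,
# plus the centre of each id's group for the major tick.
positions, centers = [], []
x = 0
for _, group in counts.groupby(level="id", sort=False):
    xs = list(range(x, x + len(group)))
    positions.extend(xs)
    centers.append(sum(xs) / len(xs))
    x += len(group) + 1

fig, ax = plt.subplots()
ax.bar(positions, counts["outcome1"], color="g", label="outcome1")
ax.bar(positions, counts["outcome2"], bottom=counts["outcome1"],
       color="r", label="outcome2")

# Keep minor ticks even where they coincide with a major tick.
ax.xaxis.remove_overlapping_locs = False

# Minor ticks: one per bar, labelled with the approach.
ax.set_xticks(positions, minor=True)
ax.set_xticklabels(list(counts.index.get_level_values("approach")),
                   minor=True, rotation=45)

# Major ticks: one per id, centred under its group, pushed below the minor labels.
ax.set_xticks(centers)
ax.set_xticklabels(list(counts.index.get_level_values("id").unique()))
ax.tick_params(axis="x", which="major", pad=60, length=0)

ax.legend()
fig.tight_layout()
```

Calling `fig.savefig("plot.png")` afterwards writes the figure to a file; the file name is arbitrary.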