I have the following data:
id, approach, outcome
a1, approach1, outcome1
a1, approach1, outcome2
a1, approach1, outcome2
a1, approach1, outcome2
a1, approach1, outcome2
a1, approach2, outcome1
a1, approach2, outcome1
a1, approach2, outcome1
a1, approach2, outcome1
a1, approach2, outcome1
a1, approach3, outcome1
a1, approach3, outcome1
a1, approach3, outcome1
a1, approach3, outcome1
a1, approach3, outcome1
a2, approach1, outcome2
a2, approach1, outcome1
a2, approach1, outcome1
a2, approach1, outcome2
a2, approach1, outcome1
a2, approach2, outcome1
a2, approach2, outcome1
a2, approach2, outcome2
a2, approach2, outcome1
a2, approach2, outcome2
a2, approach3, outcome2
a2, approach3, outcome2
a2, approach3, outcome1
a2, approach3, outcome2
a2, approach3, outcome1
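I want a grouped, stacked bar chart like the common fruit-by-year examples, except that I have ids instead of fruits and approaches instead of years.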
This is what I have so far:
import pandas
import matplotlib.pyplot as plt

df = pandas.read_csv("test.txt", sep=r',\s+', engine="python")
fig, ax = plt.subplots(1, 1, figsize=(5.5, 4))
data = df[df.approach == "approach1"].groupby(["id", "outcome"], sort=False)["outcome"].count().unstack(level=1)
data.plot.bar(width=0.5, position=0.6, color=["g", "r"], stacked=True, ax=ax)
data = df[df.approach == "approach2"].groupby(["id", "outcome"], sort=False)["outcome"].count().unstack(level=1)
data.plot.bar(width=0.5, position=-0.6, color=["g", "r"], stacked=True, ax=ax)
# "Activate" minor ticks
ax.minorticks_on()
rects_locs = []
p = 0
for patch in ax.patches:
    rects_locs.append(patch.get_x() + patch.get_width())
    # p += 0.01
# Set minor ticks there
ax.set_xticks(rects_locs, minor = True)
# Labels for the rectangles
new_ticks = ["Approach1"] * 10 + ["Approach2"] * 10
# Set the labels
from matplotlib import ticker
ax.xaxis.set_minor_formatter(ticker.FixedFormatter(new_ticks)) #add the custom ticks
# Move the category label further from x-axis
ax.tick_params(axis='x', which='major', pad=15)
# Remove minor ticks where not necessary
ax.tick_params(axis='x', which='both', top=False)
ax.tick_params(axis='y', which='both', left=False, right=False)
plt.xticks(rotation=0)
So basically I want id as the major x tick (so there should be 2 such x values), and for each id there should be 3 grouped stacked bars (approach1, approach2, approach3).
Answer 0 (score: 1)
Well, I'm not proud of this, but it works. Hopefully someone more knowledgeable will offer a better solution.
I start by setting up your data:
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
import numpy as np
import pandas as pd
data = np.array([
'id', 'approach', 'outcome',
'a1', 'approach1', 'outcome1',
'a1', 'approach1', 'outcome2',
'a1', 'approach1', 'outcome2',
'a1', 'approach1', 'outcome2',
'a1', 'approach1', 'outcome2',
'a1', 'approach2', 'outcome1',
'a1', 'approach2', 'outcome1',
'a1', 'approach2', 'outcome1',
'a1', 'approach2', 'outcome1',
'a1', 'approach2', 'outcome1',
'a1', 'approach3', 'outcome1',
'a1', 'approach3', 'outcome1',
'a1', 'approach3', 'outcome1',
'a1', 'approach3', 'outcome1',
'a1', 'approach3', 'outcome1',
'a2', 'approach1', 'outcome2',
'a2', 'approach1', 'outcome1',
'a2', 'approach1', 'outcome1',
'a2', 'approach1', 'outcome2',
'a2', 'approach1', 'outcome1',
'a2', 'approach2', 'outcome1',
'a2', 'approach2', 'outcome1',
'a2', 'approach2', 'outcome2',
'a2', 'approach2', 'outcome1',
'a2', 'approach2', 'outcome2',
'a2', 'approach3', 'outcome2',
'a2', 'approach3', 'outcome2',
'a2', 'approach3', 'outcome1',
'a2', 'approach3', 'outcome2',
'a2', 'approach3', 'outcome1'])
data = data.reshape(data.size // 3, 3)
df = pd.DataFrame(data[1:], columns=data[0])
Next, I count all the "outcome1" and "outcome2" rows for each approach and id. (I'm sure this can be done directly in pandas, but I'm a bit of a pandas newbie):
counts = {}  # nested as counts[id][approach][outcome]
for id_ in 'a1', 'a2':
    counts[id_] = {}
    for approach in 'approach1', 'approach2', 'approach3':
        counts[id_][approach] = {}
        for outcome in 'outcome1', 'outcome2':
            # Boolean mask of the matching rows; its sum is the row count
            counts[id_][approach][outcome] = ((df['id'] == id_)
                                              & (df['approach'] == approach)
                                              & (df['outcome'] == outcome)).sum()
plot_data = pd.DataFrame(counts)
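The counting step can probably be done in a single pandas expression. Here is a minimal sketch, assuming the same three column names. The small `df` below is a hypothetical stand-in rather than the real data, and the name `counts` is mine. Note that the result is shaped differently from `plot_data`: it has one row per (id, approach) pair and one column per outcome.

```python
import pandas as pd

# Hypothetical stand-in for the real data (same column names, fewer rows)
df = pd.DataFrame({
    "id":       ["a1", "a1", "a1", "a2", "a2", "a2"],
    "approach": ["approach1", "approach1", "approach2",
                 "approach1", "approach2", "approach2"],
    "outcome":  ["outcome1", "outcome2", "outcome1",
                 "outcome2", "outcome1", "outcome2"],
})

# Count rows per (id, approach, outcome), then pivot outcome into columns.
# Combinations that never occur are filled with 0 instead of NaN.
counts = (df.groupby(["id", "approach", "outcome"])
            .size()
            .unstack("outcome", fill_value=0))
print(counts)
```

A single cell is then read with `counts.loc[("a1", "approach1"), "outcome2"]`.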
Now all that's left is the plotting.
fig, ax = plt.subplots(1, 1)
positions = []  # x position of every bar, so the ticks can be set explicitly
i = 0
for id_ in 'a1', 'a2':
    for approach in 'approach1', 'approach2', 'approach3':
        # Draw outcome1 first, then stack outcome2 on top of it
        ax.bar(i, plot_data[id_][approach]["outcome1"], color='g')
        ax.bar(i, plot_data[id_][approach]["outcome2"],
               bottom=plot_data[id_][approach]["outcome1"], color='r')
        positions.append(i)
        i += 1
    i += 1  # leave an empty slot between the two id groups
# Pin the ticks to the bars so the labels cannot drift with the auto locator
ax.set_xticks(positions)
ax.set_xticklabels(['approach1', 'approach2', 'approach3'] * 2, rotation=45)
custom_lines = [Line2D([0], [0], color='g', lw=4),
                Line2D([0], [0], color='r', lw=4)]
ax.legend(custom_lines, ['Outcome 1', 'Outcome 2'])
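The plot above has no id labels under each group, which the question asked for. Here is one possible sketch that adds them. It rests on some assumptions of mine: the rows are rebuilt from per-group tallies of the question's data (so row order differs from the original file), the ids are drawn with `ax.text` rather than as real major ticks, and the `-0.3` offset and `bottom=0.3` margin are guesses that may need tuning.

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.patches import Patch

# (id, approach, outcome1 rows, outcome2 rows), tallied from the question's data
tallies = [
    ("a1", "approach1", 1, 4), ("a1", "approach2", 5, 0), ("a1", "approach3", 5, 0),
    ("a2", "approach1", 3, 2), ("a2", "approach2", 3, 2), ("a2", "approach3", 2, 3),
]
rows = []
for id_, approach, n1, n2 in tallies:
    rows += [(id_, approach, "outcome1")] * n1 + [(id_, approach, "outcome2")] * n2
df = pd.DataFrame(rows, columns=["id", "approach", "outcome"])

# One row per (id, approach), one column per outcome
counts = (df.groupby(["id", "approach", "outcome"]).size()
            .unstack("outcome", fill_value=0))

fig, ax = plt.subplots()
bar_pos, bar_labels, group_pos = [], [], []
x = 0
for id_, group in counts.groupby(level="id"):
    start = x
    for (_, approach), row in group.iterrows():
        ax.bar(x, row["outcome1"], color="g")
        ax.bar(x, row["outcome2"], bottom=row["outcome1"], color="r")
        bar_pos.append(x)
        bar_labels.append(approach)
        x += 1
    centre = (start + x - 1) / 2  # middle of this id's bars
    group_pos.append(centre)
    # x in data coordinates, y in axes coordinates (below the tick labels)
    ax.text(centre, -0.3, id_, ha="center", va="top",
            transform=ax.get_xaxis_transform())
    x += 1  # empty slot between id groups

ax.set_xticks(bar_pos)
ax.set_xticklabels(bar_labels, rotation=45)
ax.legend(handles=[Patch(color="g", label="outcome1"),
                   Patch(color="r", label="outcome2")])
fig.subplots_adjust(bottom=0.3)  # make room for the id labels
```

Call `plt.show()` or `fig.savefig(...)` afterwards to see the result.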