Question

I have a dataset like this (the number of columns and rows can vary, which is why I need to define a plotting function).

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
plot_df = pd.DataFrame({
  'decl': [0.000000, 0.000000, 0.000000, 0.000667, 0.000833, 0.000833, 0.000000],
  'dk': [0.003333, 0.000000, 0.000000, 0.001333, 0.001667, 0.000000, 0.000000],
  'yes': [0.769167, 0.843333, 0.762000, 0.666000, 0.721667, 0.721667, 0.775833],
  'no': [0.227500, 0.156667, 0.238000, 0.332000, 0.275833, 0.277500, 0.224167]})

For this data, I would like to create a plot similar to the one this code produces for a fixed number of columns:

# configure plot
N = len(plot_df) # number of groups
num_y_cats = len(plot_df.columns) # number of y-categories (responses)
ind = np.arange(N) # x locations for the groups
width = 0.35 # width of bars

p1 = plt.bar(ind, plot_df.iloc[:,0], width)
p2 = plt.bar(ind, plot_df.iloc[:,1], width)
p3 = plt.bar(ind, plot_df.iloc[:,2], width)
p4 = plt.bar(ind, plot_df.iloc[:,3], width)

plt.ylabel('[%]')
plt.title('Responses by country')

x_ticks_names = tuple([item for item in plot_df.index])

plt.xticks(ind, x_ticks_names)
plt.yticks(np.arange(0, 1.1, 0.1)) # ticks from, to, steps
plt.legend((p1[0], p2[0], p3[0], p4[0]), ('decl', 'dk', 'yes', 'no'))
plt.show()

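This gives me the following plot, which raises two problems that I cannot overcome and am asking for help with:

The bars do not sum to 1.0, although they should, because I created the original df with normalization (plot_df['sum'] = plot_df['decl'] + plot_df['dk'] + plot_df['yes'] + plot_df['no']).

The other problem is that I want to define a function that creates the same plot for dfs with a variable number of rows and columns, but I am stuck at the part that creates the different plots. So far I have: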

def bar_plot(plot_df):
    ''' input: data frame where rows are groups; columns are plot components to be stacked '''

    # configure plot
    N = len(plot_df) # number of groups
    num_y_cats = len(plot_df.columns) # number of y-categories (responses)
    ind = np.arange(N) # x locations for the groups
    width = 0.35 # width of bars

    for i in range(num_y_cats): # for every response in the number of responses, e.g. 'Yes', 'No' etc.
        p = plt.bar(ind, plot_df.iloc[:,i], width) # plot containing the response

    plt.ylabel('[%]')
    plt.title('Responses by group')

    x_ticks_names = tuple([item for item in plot_df.index]) # create a tuple containing all [country] names

    plt.xticks(ind, x_ticks_names)
    plt.yticks(np.arange(0, 1.1, 0.1)) # ticks from, to, steps
    plt.show()

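However, the problem here is that the loop does not add the different layers correctly, and I cannot figure out how to do that. Could someone give me a pointer?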

Answer 1

Problem number 1 (if I understand it correctly) is that the height of the bars is not 1 (i.e. the sum of all fractions). Your code

p1 = plt.bar(ind, plot_df.iloc[:,0], width)
p2 = plt.bar(ind, plot_df.iloc[:,1], width)
...

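creates four bar plots, all starting at 0 (on the y-axis). What we want is for p2 to start where p1 ends, p3 to start where p2 ends, and so on. To do this, we can specify the bottom argument of plt.bar (which defaults to 0). So,

p1 = plt.bar(ind, plot_df.iloc[:,0], width)
p2 = plt.bar(ind, plot_df.iloc[:,1], width, bottom=plot_df.iloc[:,0])
...

For p3, we want bottom to start at the sum of plot_df.iloc[:,0] and plot_df.iloc[:,1]. We can do this explicitly, or we can use np.sum. The latter has the advantage that we can sum over an arbitrary number of columns (which is what you want in your function).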
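As a minimal sketch of that idea (not part of the original answer), the bottom offsets for every column can be computed in one step with a cumulative sum across columns. The helper name `stack_bottoms` is hypothetical, and the small two-row frame is made-up illustration data.

```python
import numpy as np
import pandas as pd

def stack_bottoms(df):
    """Return a frame whose column i holds the sum of columns 0..i-1 of df."""
    # The running total across columns, minus each column's own value,
    # leaves the sum of all columns to its left.
    return df.cumsum(axis=1) - df

# Hypothetical demo data: each row is normalised to sum to 1.
demo = pd.DataFrame({'a': [0.1, 0.2], 'b': [0.3, 0.3], 'c': [0.6, 0.5]})
bottoms = stack_bottoms(demo)  # where each stacked segment starts
tops = bottoms + demo          # where each stacked segment ends
```

Column i of `bottoms` should match `np.sum(demo.iloc[:, :i], axis=1)`, and the last column of `tops` equals the row sums, so bars built from normalised rows reach 1.0.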

As for your function... I gave it a try. You will probably have to polish it yourself.

np.sum(plot_df.iloc[:,:i], axis=1)

Answer 2

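The function provided by @mortysporty (all credit goes there accordingly) only needs a few lines adjusted at the beginning, so that the bars can be referenced later, to accomplish the desired task: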

import matplotlib.pyplot as plt
import numpy as np

def newest_bar_plot(plot_df):
    N = len(plot_df) # number of groups
    ind = np.arange(N) # x locations for the groups
    width = 0.35 # width of bars

    p_s = []
    p_s.append(plt.bar(ind, plot_df.iloc[:,0], width))
    for i in range(1,len(plot_df.columns)):
        p_s.append(plt.bar(ind, plot_df.iloc[:,i], width,
                           bottom=np.sum(plot_df.iloc[:,:i], axis=1)))

    plt.ylabel('[%]')
    plt.title('Responses by country')

    x_ticks_names = tuple([item for item in plot_df.index])

    plt.xticks(ind, x_ticks_names)
    plt.yticks(np.arange(0, 1.1, 0.1)) # ticks from, to, steps
    plt.legend(p_s, plot_df.columns)
    plt.show()
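One possible way to check the stacking without opening a window is sketched below. It is a variant, not the answer's function: it draws on an explicit Axes object, keeps a running `bottom` array instead of re-summing the columns on each pass, and returns the Axes so the bars can be inspected. The name `stacked_bar_axes`, the Agg backend choice, and the sample data are all assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no GUI needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

def stacked_bar_axes(plot_df, width=0.35):
    """Draw one stacked bar per row of plot_df and return the Axes."""
    fig, ax = plt.subplots()
    ind = np.arange(len(plot_df))       # x locations for the groups
    bottom = np.zeros(len(plot_df))     # running height of each stack
    for col in plot_df.columns:
        heights = plot_df[col].to_numpy()
        ax.bar(ind, heights, width, bottom=bottom, label=str(col))
        bottom = bottom + heights       # next segment starts on top of this one
    ax.set_xticks(ind)
    ax.set_xticklabels([str(name) for name in plot_df.index])
    ax.set_ylabel('[%]')
    ax.legend()
    return ax

# Hypothetical sample data with rows normalised to 1.
sample = pd.DataFrame({'yes': [0.7, 0.6], 'no': [0.3, 0.4]}, index=['A', 'B'])
ax = stacked_bar_axes(sample)
```

Each call to `ax.bar` adds one container to `ax.containers`, so the top edge of the last container's rectangles shows whether each bar reaches the expected total.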

Stacked bar plot in a loop does not add the different components of the bars
