Question

基本上，我希望所有条形重叠，但我不希望它们堆叠也不是并排。我希望它们重叠，但如果我尝试与pyplot重叠条形图，它不会自动组织它，以便较小的条形图在前面而较大的条形图在后面。一些酒吧完全隐藏。我不想使用alpha属性，因为颜色太多，合并时很容易混淆。这是我的代码：

import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("flow_actions.csv", index_col="weekday")

def pandas_chart(df, **kwargs):
    df.plot.barh(**kwargs)
    plt.grid(axis="x")
    plt.legend()
    plt.show()

def pyplot_chart(df, **kwargs):
    for col in df:
        plt.barh(y=df.index.values, 
                 width=df[col].values,
                 label=col,
                 height=0.8)
    plt.legend()
    plt.grid(axis="x")
    plt.show()

这是我正在使用的数据集：

+---------+--------------+--------------+----------+---------+--------+
| weekday | E-mail(auto) | E-mail(semi) | LinkedIn | Ligação | Social |
+---------+--------------+--------------+----------+---------+--------+
| Mon     | 0.15         | 0.02         | 0.04     | 0.08    | 0      |
| Tue     | 0.1          | 0.03         | 0.03     | 0.05    | 0.01   |
| Wed     | 0.12         | 0.02         | 0.05     | 0.07    | 0.02   |
| Thu     | 0.13         | 0.02         | 0.04     | 0.06    | 0.01   |
| Fri     | 0.15         | 0.04         | 0.04     | 0.05    | 0.02   |
| Sat     | 0.15         | 0.01         | 0.03     | 0.08    | 0      |
| Sun     | 0.16         | 0.01         | 0.02     | 0.06    | 0.01   |
+---------+--------------+--------------+----------+---------+--------+

以下是一些（不需要的）输出：

>>> pandas_chart(df)

输出：

>>> pandas_chart(df, stacked=True)

输出：

>>> pyplot_chart(df)

输出：

问题是，我想要图像＃3和＃2之间的东西，但我不希望像＃2那样堆叠的值，也不希望它们像3中的其他条形图一样隐藏。这样的事情是可能的，还是做的我必须坚持使用＃1（你的类别越多，看起来越丑）？

Answer 1

我知道你想要像＃3这样的东西。如果某些值在同一行中相似，则可能会导致问题。但除此之外，您可以创建自己的列排序，以防止较大的值覆盖较小的值。

import matplotlib.pyplot as plt
import pandas as pd
from matplotlib import cm
from itertools import cycle

df = pd.read_csv("test.csv", index_col = "weekday")

def pyplot_chart(df):

    #create dictionary for colors by cycling through a predefined colour list
    color_cycle = cycle([ 'k', 'b', 'r', 'y', 'c', 'm', 'g'])
    col_dic = {col: next(color_cycle) for col in df}
    #alternatively, extract colours along a defined colormap
    #see color maps reference https://matplotlib.org/examples/color/colormaps_reference.html
    #col_dic = {col: cm.tab20c(1 - i / len(df.columns)) for i, col in enumerate(df)}

    #cycle through each row of the dataframe
    for yvalue, row in df.iterrows():
        #sort the values within the row, plot largest values first
        for index, value in row.sort_values(ascending = False).iteritems():
            plt.barh(y=yvalue,
                     width=value,
                     color=col_dic[index],
                     height=0.8)

    #plot invisible columns for labels
    for col in df.columns:
        plt.barh(y=df.index,
                 width=0,
                 color=col_dic[col],
                 label = col,
                 height=0)

    plt.legend()
    plt.grid(axis="x")
    plt.show()

pyplot_chart(df)

输出：

正如您所看到的，在周二，两个值为0.3，您无法区分LinkedIn是否存在。您可以尝试通过修改width参数来克服此问题，即较小的值也具有较小的宽度以显示它们后面的类似值。

如何使用matplotlib / pandas绘制非堆叠和非并排的水平条形图？

1 个答案: