我有3列数据,如下所示
clm1 clm2 clm3
|["shared","connect"] 13297 |aaaa|
|["stopped","failed] 25002 |aaaa|
|["success","obtained"] 11189 |aaaa|
|["shared","connect"] 16770 |bbbb|
|["stopped","failed] 81777 |bbbb|
|["success","obtained"] 9555 |bbbb|
我想在python中使用下面的条形图,我能够编写简单的图,但是无法获得可以对clm3和图进行分组的逻辑
答案 0 :(得分:1)
这里的主要问题是matplotlib认为您所有的分类数据“ A”代表相同的类别,因此将其在“ A”的相同位置绘制。我们必须发明一个额外的类别来区分所有那些“ A”值。例如,我们可以使用cumcount()
进行此操作,该数字将所有值“ A”从0编号为n。一个例子是:
from matplotlib import pyplot as plt
import pandas as pd
#create toy dataframe
#this part you should have included in your question
#as a Minimal, Complete, and Verifiable example
np.random.seed(1234)
df = pd.DataFrame({"cat": ["A", "B", "C", "C", "B", "C", "A"], "val": np.random.randint(1, 100, 7)})
#add column for multiple cat values and rearrange dataframe
df["cols"] = df.groupby("cat").cumcount()
df1 = df.pivot(index = "cat", columns = "cols", values = "val")
print(df1)
#plot this table
df1.plot.bar(color = "blue", edgecolor = "white")
plt.legend().set_visible(False)
plt.xticks(rotation = 0)
plt.show()
示例数据框:
cols 0 1 2
cat
A 48.0 16.0 NaN
B 84.0 77.0 NaN
C 39.0 54.0 25.0
样本图:
编辑: 我只是注意到,在您的情况下它甚至更容易,因为尽管您的问题中从未提到过,但您可能希望将其作为“ clm1”类别。因此,您可以简化过程:
from matplotlib import pyplot as plt
import pandas as pd
#create toy dataframe
np.random.seed(1234)
df = pd.DataFrame({"clm1": ["X", "Y", "Z", "X", "Y", "Z"], "clm2": np.random.randint(1, 100, 6), "clm3": ["A", "A", "A", "B", "B", "B"]})
#rearrange dataframe and plot
df.pivot(index = "clm3", columns = "clm1", values = "clm2").plot.bar(edgecolor = "white")
plt.xticks(rotation = 0)
plt.show()
示例输出: