How to plot 2D categorical data on a grid

Time: 2019-07-30 09:10:15

Tags: python matplotlib plot categorical-data

My data looks like this:

Name X    Y
A    HIGH MID
B    LOW  LOW
C    MID  LOW
D    HIGH MID

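How can I plot this data in a 2D plot with a 3x3 grid, adding a random variation so that each data point (including its name) is spaced far enough from the others?

So it should look something like this: [image]

I tried the following, but I don't know how to plot the values in between the grid lines rather than exactly on them, so that they don't overlap.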

import pandas as pd
import matplotlib.pyplot as plt

### Mock Data ###
data = """A0,LOW,LOW
A,MID,MID
B,LOW,MID
C,MID,HIGH
D,LOW,MID
E,HIGH,HIGH"""

df = pd.DataFrame([x.split(',') for x in data.split('\n')])
df.columns = ['name','X','Y']

### Plotting ###
fig,axs = plt.subplots()
axs.scatter(df.X,df.Y,label=df.name)
axs.set_xlabel('X')
axs.set_ylabel('Y')
for i,p in enumerate(df.name):
    axs.annotate(p, (df.X[i],df.Y[i]))
axs.grid()
axs.set_axisbelow(True)
fig.tight_layout()
plt.show()

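Result: [image]

1 Answer:

Answer 0 (score: 1)

You can control the positions directly and change the tick labels on the axes. The plot has some open issues because a few cases have not been considered, for example: "if several points fall on the same position, which label should be shown?"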
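To make the shared-position question concrete, here is a minimal sketch that lists which names end up in the same cell. It assumes the mock data from the question; `cell_names` is a name introduced only for this illustration, and joining the names with ";" is just one possible convention.

```python
import pandas as pd

# Same mock data as in the question.
df = pd.DataFrame(
    [["A0", "LOW", "LOW"], ["A", "MID", "MID"], ["B", "LOW", "MID"],
     ["C", "MID", "HIGH"], ["D", "LOW", "MID"], ["E", "HIGH", "HIGH"]],
    columns=["name", "X", "Y"],
)

# Collect the names that fall into each (X, Y) cell.
# sort=False keeps the cells in first-seen order.
cell_names = df.groupby(["X", "Y"], sort=False)["name"].agg(";".join)
print(cell_names)
```

Here B and D share the (LOW, MID) cell, so any label drawn at the cell centre has to represent both of them.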

In any case, here is one possible solution:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

### Mock Data ###
data = """A0,LOW,LOW
A,MID,MID
B,LOW,MID
C,MID,HIGH
D,LOW,MID
E,HIGH,HIGH"""

df = pd.DataFrame([x.split(',') for x in data.split('\n')])
df.columns = ['name','X','Y']

pos = [0, 1, 2]
lbls = ["LOW", "MID", "HIGH"]
trans = {lbls[i]:pos[i] for i in range(len(pos))}

mat = np.zeros((3, 3), dtype="U10") # Fixed-width strings: each cell's joined label is truncated to 10 characters
xxs = []
yys = []
offset = 0.05

for i in range(df.shape[0]):
    xc, yc = trans[df.X[i]], trans[df.Y[i]]
    if mat[xc, yc]=="":
        mat[xc, yc] = df.name[i]
    else:
        mat[xc, yc] = mat[xc, yc] + ";" + df.name[i]
    xxs.append(xc)
    yys.append(yc)
fig,axs = plt.subplots()
axs.scatter(xxs, yys)
# Label each occupied cell once; mat already holds the ";"-joined names.
for xc, yc in set(zip(xxs, yys)):
    axs.text(xc+offset, yc+offset, mat[xc, yc])
axs.set_xticks(pos)
axs.set_xticklabels(lbls)
axs.set_yticks(pos)
axs.set_yticklabels(lbls)
# Draw the cell borders halfway between the tick positions.
for p in pos:
    axs.axhline(p-0.5, color="black")
    axs.axvline(p-0.5, color="black")
axs.set_xlim(-0.5, 2.5)
axs.set_ylim(-0.5, 2.5)
plt.show()

This results in the following image:

customized categorical scatter plot
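The solution above places every point at its cell centre and merges the labels. The question also asked for a random variation inside each cell; the following is a rough sketch of that variant, not part of the original answer. The names `levels`, `to_pos`, `jitter`, and the seed are assumptions chosen for illustration. Random offsets reduce overlap but do not guarantee that two points in the same cell stay apart.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Same mock data as in the question.
df = pd.DataFrame(
    [["A0", "LOW", "LOW"], ["A", "MID", "MID"], ["B", "LOW", "MID"],
     ["C", "MID", "HIGH"], ["D", "LOW", "MID"], ["E", "HIGH", "HIGH"]],
    columns=["name", "X", "Y"],
)

levels = ["LOW", "MID", "HIGH"]
to_pos = {lvl: i for i, lvl in enumerate(levels)}  # category -> cell centre

rng = np.random.default_rng(0)  # fixed seed so the layout is reproducible
jitter = 0.3                    # max offset from the centre; keep it below 0.5

# Cell centre plus a uniform random offset, so points stay inside their cell.
xs = df["X"].map(to_pos).to_numpy() + rng.uniform(-jitter, jitter, len(df))
ys = df["Y"].map(to_pos).to_numpy() + rng.uniform(-jitter, jitter, len(df))

fig, ax = plt.subplots()
ax.scatter(xs, ys)
for name, x, y in zip(df["name"], xs, ys):
    # Shift each label a few points away from its marker.
    ax.annotate(name, (x, y), xytext=(4, 4), textcoords="offset points")

# Show the category names at the cell centres.
ax.set_xticks(range(len(levels)))
ax.set_xticklabels(levels)
ax.set_yticks(range(len(levels)))
ax.set_yticklabels(levels)

# Cell borders sit at the half-integer positions.
for edge in np.arange(-0.5, len(levels), 1.0):
    ax.axhline(edge, color="black")
    ax.axvline(edge, color="black")
ax.set_xlim(-0.5, len(levels) - 0.5)
ax.set_ylim(-0.5, len(levels) - 0.5)
ax.set_xlabel("X")
ax.set_ylabel("Y")
```

Call `plt.show()` afterwards to display the figure. If two labels still collide, changing the seed or lowering `jitter` is the simplest adjustment.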