I want to draw a scatter plot of the following DataFrame, with topic on the x-axis and content on the y-axis.
In[18]: test=pd.read_excel('test.xlsx')
In[19]: test
Out[19]:
  topic content
0    A1       a
1    A1       b
2    A2       b
3    A2       c
4    A2       e
5    A3       a
6    A3       c
7    A3       d
8    A4       b
9    A4       c
Here is my current plot:
How can I put the y-axis in a different order? For example ['b','c','a','d','e'], with 'b' at the bottom?
Answer 0: (score: 0)
If the order of the x-axis does not matter, you can use a pandas Categorical together with sort_values():
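import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame([['A1','a'], ['A1','b'], ['A2','b'], ['A2','c'], ['A2','e'], ['A3','a'], ['A3','c'], ['A3','d'], ['A4','b'], ['A4','c']], columns=['topic','content'])
order = ['b', 'c', 'a', 'd', 'e']
# Make content categorical so that sorting follows `order` rather than the alphabet
df['content'] = pd.Categorical(df['content'], categories=order)
# Sort the rows so the values reach matplotlib in the desired order ('b' first)
df.sort_values(by=['content'], inplace=True)
plt.scatter(df['topic'], df['content'])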
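The caveat about the x-axis comes from how matplotlib handles string data: each new label gets the next position in order of first appearance. The sketch below skips the plotting and only inspects the sorted frame to show which order each axis would end up in. The names y_order and x_order are illustrative, and kind='stable' is an assumption added here so that ties keep their original row order.

```python
import pandas as pd

df = pd.DataFrame(
    [['A1', 'a'], ['A1', 'b'], ['A2', 'b'], ['A2', 'c'], ['A2', 'e'],
     ['A3', 'a'], ['A3', 'c'], ['A3', 'd'], ['A4', 'b'], ['A4', 'c']],
    columns=['topic', 'content'])
order = ['b', 'c', 'a', 'd', 'e']

# Sort by the custom category order; a stable sort keeps ties in row order
df['content'] = pd.Categorical(df['content'], categories=order)
df = df.sort_values(by='content', kind='stable')

# Order in which labels first appear, i.e. the order matplotlib would use
y_order = list(df['content'].astype(str).unique())
x_order = list(df['topic'].unique())

print(y_order)
print(x_order)
```

The y labels come out in the requested order, while the topics are reshuffled because the rows for 'b' (A1, A2, A4) now come first.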
Edit
Another solution is to replace each value of content with an integer, df['content'] = [order.index(x) for x in df['content']], and then set the yticks:
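import matplotlib.pyplot as plt
import pandas as pd

order = ['b', 'c', 'a', 'd', 'e']
df = pd.DataFrame([['A1','a'], ['A1','b'], ['A2','b'], ['A2','c'], ['A2','e'], ['A3','a'], ['A3','c'], ['A3','d'], ['A4','b'], ['A4','c']], columns=['topic','content'])
# Replace each label with its position in `order` (b -> 0, c -> 1, ...)
df['content'] = [order.index(x) for x in df['content']]
# Label the integer positions with the original values
plt.yticks(range(len(order)), order)
plt.scatter(df['topic'], df['content'])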
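If this is needed for more than one plot, the integer-mapping approach could be wrapped in a small function. The following is only a sketch: scatter_with_y_order is a hypothetical helper name, the Agg backend is chosen so it runs without a display, and the check for labels missing from order is an added assumption. Because the rows are not sorted, the x-axis keeps its original order.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def scatter_with_y_order(df, x, y, order):
    """Hypothetical helper: scatter df[x] against df[y] with y labels in `order`."""
    # Map each label to its position in `order` (first entry ends up at the bottom)
    positions = {label: i for i, label in enumerate(order)}
    y_pos = df[y].map(positions)
    if y_pos.isna().any():
        raise ValueError("some y values are missing from `order`")
    fig, ax = plt.subplots()
    ax.scatter(df[x], y_pos)
    # Show the original labels at the integer positions
    ax.set_yticks(list(range(len(order))))
    ax.set_yticklabels(order)
    return ax, y_pos


df = pd.DataFrame(
    [['A1', 'a'], ['A1', 'b'], ['A2', 'b'], ['A2', 'c'], ['A2', 'e'],
     ['A3', 'a'], ['A3', 'c'], ['A3', 'd'], ['A4', 'b'], ['A4', 'c']],
    columns=['topic', 'content'])
order = ['b', 'c', 'a', 'd', 'e']
ax, y_pos = scatter_with_y_order(df, 'topic', 'content', order)
```

Unlike the version above, this leaves df['content'] untouched, so the original labels stay available for later use.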