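I'm looking for help creating a plot similar to the one in this link, using only Python libraries.
Catagorical Bubble Chart using ggplot2 in R: see the top-voted answer.
Here I have borrowed the data from the link: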
import pandas as pd

df = pd.DataFrame({'Var1':['Does.Not.apply',
'Not.specified',
'Active.Learning..general.',
'Problem.based.Learning',
'Project.Method',
'Case.based.Learning',
'Peer.Learning',
'Other',
'Does.Not.apply',
'Not.specified',
'Does.Not.apply',
'Active.Learning..general.',
'Does.Not.apply',
'Problem.based.Learning',
'Does.Not.apply',
'Project.Method',
'Does.Not.apply',
'Case.based.Learning',
'Does.Not.apply',
'Peer.Learning',
'Does.Not.apply',
'Other'],
'Var2':['Does.Not.apply',
'Does.Not.apply',
'Does.Not.apply',
'Does.Not.apply',
'Does.Not.apply',
'Does.Not.apply',
'Does.Not.apply',
'Does.Not.apply',
'Not.specified',
'Not.specified',
'Active.Learning..general.',
'Active.Learning..general.',
'Problem.based.Learning',
'Problem.based.Learning',
'Project.Method',
'Project.Method',
'Case.based.Learning',
'Case.based.Learning',
'Peer.Learning',
'Peer.Learning',
'Other',
'Other'],
'Count' : [53,15,1,2,4,22,6,1,15,15,1,1,2,2,4,4,22,22,6,6,1,1]})
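Answer 0 (score: 0)
Plotnine is a Python implementation of a grammar of graphics, based on R's ggplot2.
The code is almost identical to the code in the linked R answer.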
import math
import pandas as pd
from plotnine import *

# Build df as in the question: df = pd.DataFrame(<dataframe data here>)

# Scale the point size with the square root of Count so that circle area tracks Count
df['dotsize'] = df.apply(lambda row: math.sqrt(float(row.Count) / math.pi) * 7.5, axis=1)

(ggplot(df, aes('Var1', 'Var2'))
 + geom_point(aes(size='dotsize'), fill='white')
 + geom_text(aes(label='Count'), size=8)
 + scale_size_identity()
 + theme(panel_grid_major=element_line(linetype='dashed', color='black'),
         axis_text_x=element_text(angle=90, hjust=1, vjust=0))
).save('mygraph.png')
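The `dotsize` formula above can be looked at in isolation. The sketch below assumes that, as in ggplot2, the point size acts as a linear dimension of the circle, so taking the square root of the count makes the circle's area grow roughly in proportion to the count. The helper name `dot_size` and its `scale` parameter are made up for illustration; 7.5 is simply the factor used in the answer.

```python
import math

def dot_size(count, scale=7.5):
    """Return a point size whose circle area grows linearly with count.

    `scale` is an arbitrary visual factor (hypothetical parameter name);
    7.5 is the value used in the plotnine code above.
    """
    return math.sqrt(float(count) / math.pi) * scale

counts = [1, 4, 16]
sizes = [dot_size(c) for c in counts]

# Quadrupling the count doubles the size, so the area quadruples.
print(sizes)
```

If the bubbles look too large or too small for your figure, changing `scale` is probably the simplest knob to turn.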
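Answer 1 (score: 0)
You can certainly create this kind of figure with plain matplotlib. It is just a categorical scatter plot with variable marker sizes. Using the toy dataset: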
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Create a marker-size column from the counts to make the differences easier to see.
# You probably want to adjust this function depending on the min, max, and range of the values.
df["markersize"] = np.square(df.Count) + 10

fig = plt.figure()
# Plot the categorical scatter plot
plt.scatter(df.Var1, df.Var2, s=df.markersize, edgecolors="red", c="white", zorder=2)
# Plot the grid behind the markers
plt.grid(ls="--", zorder=1)
# Take care of long labels
fig.autofmt_xdate()
plt.tight_layout()
plt.show()
Output:
For how the marker size of a scatter plot is defined, you might want to read this answer.
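As a side note on marker sizes: matplotlib's `scatter` treats `s` as the marker area in points squared, so `np.square(df.Count) + 10` makes the area grow with the square of the count and exaggerates the differences. If you would rather have the area proportional to the count, something like the following sketch might do. The function name `marker_areas` and the `max_area` default are assumptions chosen for illustration, not part of matplotlib.

```python
def marker_areas(counts, max_area=2000.0):
    """Scale counts linearly so the largest count maps to max_area (points**2).

    `max_area` is a hypothetical tuning value; pick whatever looks right.
    """
    largest = max(counts)
    return [max_area * c / largest for c in counts]

counts = [53, 15, 1, 2, 4, 22, 6, 1]
areas = marker_areas(counts)

# Each area keeps the same ratio to the largest one as its count does.
print([round(a, 1) for a in areas])
```

The resulting list can be passed as `s=` to `plt.scatter` in place of the `markersize` column.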
Answer 2 (score: 0)
Another way to approach this is to plot an annotation with the value and a circle at each categorical point:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Create a padding column from the counts for circles that are neither too small nor too large
df["padd"] = 2.5 * (df.Count - df.Count.min()) / (df.Count.max() - df.Count.min()) + 0.5

fig = plt.figure()
# Prepare the axes for the plot; you can also order your categories at this step
s = plt.scatter(sorted(df.Var1.unique()), sorted(df.Var2.unique(), reverse=True), s=0)
# Remove the helper scatter again (the original called s.remove without parentheses, which does nothing)
s.remove()

# Plot the data row-wise as text with a circle whose radius depends on Count
for row in df.itertuples():
    bbox_props = dict(boxstyle="circle,pad={}".format(row.padd), fc="w", ec="r", lw=2)
    plt.annotate(str(row.Count), xy=(row.Var1, row.Var2), bbox=bbox_props,
                 ha="center", va="center", zorder=2, clip_on=True)

# Plot the grid behind the markers
plt.grid(ls="--", zorder=1)
# Take care of long labels
fig.autofmt_xdate()
plt.tight_layout()
plt.show()
Sample output:
Hats off to DavidG, who showed me in this answer how to prevent annotations from being printed outside the figure.
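The `padd` column in the code above is a min-max rescaling of the counts into the range 0.5 to 3.0. A standalone sketch of that mapping is shown below; the function name `circle_padding`, its parameters, and the guard for the case where all counts are equal are additions for illustration and not part of the original answer.

```python
def circle_padding(count, lowest, highest, min_pad=0.5, max_pad=3.0):
    """Map count linearly from [lowest, highest] to [min_pad, max_pad]."""
    if highest == lowest:
        # Assumption: fall back to the smallest circle when all counts are equal,
        # since the original formula would divide by zero in that case.
        return min_pad
    fraction = (count - lowest) / (highest - lowest)
    return min_pad + (max_pad - min_pad) * fraction

counts = [53, 15, 1, 2, 4, 22, 6, 1]
pads = [circle_padding(c, min(counts), max(counts)) for c in counts]

# The smallest count gets min_pad and the largest gets max_pad.
print(pads)
```

Adjusting `min_pad` and `max_pad` changes how small and how large the circles can get without touching the data.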