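I have a pandas DataFrame in which every value is either an English letter or the empty string "". My goal is to (a) plot these letters as a scatter plot, using the column index as the X axis and the row index as the Y axis, and (b) control the spacing along the X axis so that the letters are not far apart.
I have been able to draw glyphs (circles, for example) at the desired coordinates, but not the letters themselves as they appear in the DataFrame. Also, because the X axis runs 0, 1, 2, 3, ..., the circles end up spread apart. If a categorical X axis gives a simpler or better solution, it can be categorical instead of integer.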
import pandas as pd
from bokeh.plotting import figure
from bokeh.io import output_file, show
from bokeh.models import ColumnDataSource, Range1d  # FactorRange

output_file("plot_text.html", title="plot_text")

# creating the DataFrame
d = {0: ["A", "A", "A", "D", "D", "C", "E"],
     1: ["B", "", "B", "C", "D", "E", "E"],
     2: ["", "", "F", "F", "G", "", "H"],
     3: ["", "", "", "", "", "H", "H"]}
df = pd.DataFrame(d, index=range(800, 100, -100))

list1_x = []
list1_y = []
for i in range(len(df.columns)):
    for j in range(len(df.index)):
        if df.iloc[j, i] == "":  # excluding the "" appearance
            continue
        else:
            list1_x.append(df.columns[i])
            list1_y.append(df.index[j])

source = ColumnDataSource(data=dict(x=list1_x, y=list1_y))
fig = figure(plot_height=500, plot_width=1100,
             tools="pan,xwheel_zoom,reset,save,crosshair,box_zoom",
             active_drag='pan',
             active_scroll='xwheel_zoom',
             x_range=Range1d(-5, 100, bounds="auto"),
             y_range=Range1d(-100, 1200, bounds="auto")
             )
fig.circle(x='x', y='y', color="blue", size=10, source=source)
show(fig)
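As mentioned, the code above draws a single kind of glyph (circles in this case), which is not what I want, and I have no idea how to control the scaling along the X axis.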
Answer 0 (score: 0)
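You need to build a list from the data you want to draw, with the same length as the other columns in the ColumnDataSource, and add it to the source. If you only want the characters, you can leave out the circles; otherwise both the circle and the letter are drawn. Also consider a hover tool, which is a good way to add interactivity to a chart. I used LabelSet to map the letters onto the data coordinates; as mentioned above, you can also use its x_offset and y_offset to adjust where each label sits relative to its point.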
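The same-length requirement described above is easiest to satisfy when x, y and the letters are collected in a single pass, so that an empty cell is skipped in all three lists at once. The following is only a minimal sketch of that idea in plain Python; the helper name `letter_points` and the key name `text` are hypothetical and not part of pandas or Bokeh.

```python
def letter_points(columns, index):
    """Return aligned x, y and text lists, skipping empty cells.

    `columns` maps a column label to its list of cell values and
    `index` holds the row labels, in the same order as the cells.
    """
    xs, ys, letters = [], [], []
    for col, values in columns.items():
        for row, value in zip(index, values):
            if value == "":
                continue  # drop the cell from all three lists together
            xs.append(col)
            ys.append(row)
            letters.append(value)
    return {"x": xs, "y": ys, "text": letters}


# Same data as in the question.
d = {0: ["A", "A", "A", "D", "D", "C", "E"],
     1: ["B", "", "B", "C", "D", "E", "E"],
     2: ["", "", "F", "F", "G", "", "H"],
     3: ["", "", "", "", "", "H", "H"]}
points = letter_points(d, range(800, 100, -100))
```

Under these assumptions, `points` can be passed as the `data` argument of `ColumnDataSource`, and the label column is then referred to by the name `text`.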
import itertools

import pandas as pd
from bokeh.plotting import figure
from bokeh.io import output_file, show, output_notebook
from bokeh.models import ColumnDataSource, Range1d, LabelSet  # FactorRange

output_notebook()
output_file("plot_text.html", title="plot_text")

# creating the DataFrame
d = {0: ["A", "A", "A", "D", "D", "C", "E"],
     1: ["B", "", "B", "C", "D", "E", "E"],
     2: ["", "", "F", "F", "G", "", "H"],
     3: ["", "", "", "", "", "H", "H"]}

# Flatten the dict column by column into a single list, so the letters come
# in the same order as the x/y lists in the source (ColumnDataSource).
d1 = list(d.values())
d_letters = list(itertools.chain(*d1))
# Drop the empty strings. The test has to be on each element x, not on the
# whole list, otherwise nothing is filtered out.
d_ = [x for x in d_letters if x != ""]
# len(d_) is 19, the same as the x and y lists
df = pd.DataFrame(d, index=range(800, 100, -100))

list1_x = []
list1_y = []
for i in range(len(df.columns)):
    for j in range(len(df.index)):
        if df.iloc[j, i] == "":  # excluding the "" appearance
            continue
        else:
            list1_x.append(df.columns[i])
            list1_y.append(df.index[j])

source = ColumnDataSource(data=dict(x=list1_x, y=list1_y, x1=d_))
fig = figure(plot_height=500, plot_width=1100,
             tools="pan,xwheel_zoom,reset,save,crosshair,box_zoom",
             active_drag='pan',
             active_scroll='xwheel_zoom',
             x_range=Range1d(-5, 100, bounds="auto"),
             y_range=Range1d(-100, 1200, bounds="auto")
             )
fig.circle(x='x', y='y', color="blue", size=10, source=source)

# create labels
labels = LabelSet(x='x', y='y', text='x1', level='glyph',
                  x_offset=5, y_offset=5, source=source, render_mode='canvas')
fig.add_layout(labels)
show(fig)
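For the two remaining points (drawing only the letters, and keeping them close together along X), one possible variation is sketched below. It swaps in a different technique from the LabelSet used above: a Bokeh text glyph draws the letters directly, and the column labels are multiplied by a spacing factor before plotting. The names `scaled_x` and `make_letter_figure`, the default spacing of 0.5 and the range values are all assumptions chosen for illustration.

```python
def scaled_x(columns, spacing=0.5):
    """Map numeric column labels 0, 1, 2, ... to x positions that are
    `spacing` data units apart."""
    return [col * spacing for col in columns]


def make_letter_figure(xs, ys, letters, spacing=0.5):
    """Build a figure that shows each letter at (scaled x, y)."""
    # Bokeh is imported here so that scaled_x stays usable without it.
    from bokeh.models import ColumnDataSource, Range1d
    from bokeh.plotting import figure

    source = ColumnDataSource(
        data=dict(x=scaled_x(xs, spacing), y=ys, text=letters))
    # Illustrative ranges; adjust them to the data being plotted.
    fig = figure(x_range=Range1d(-1, 10), y_range=Range1d(-100, 1200))
    # A text glyph renders the letters themselves, so no circle is drawn.
    fig.text(x="x", y="y", text="text", source=source,
             text_align="center", text_baseline="middle")
    return fig
```

With an unchanged `x_range`, a smaller `spacing` pulls neighbouring columns closer together on screen. A call such as `show(make_letter_figure(list1_x, list1_y, d_))` would display the result, assuming the three lists have equal length.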