Question

I'd like to use the Datashader module in Python to do something similar to pyplot.scatter, specifying an individual RGB/hex value for each (x, y) point:

#what i'd like to do, but using Datashader:
import numpy as np
#make sample arrays
n = int(1e+8)
point_array = np.random.normal(0, 1, [n, 2])
color_array = np.random.randint(0, 256, [n, 3])/255  # RGB; I can convert to/from hex if needed

# the part I need: make an image similar to plt.scatter, using Datashader instead
import matplotlib.pyplot as plt
fig = plt.figure()
plot = fig.add_subplot(111)

plot.scatter(point_array[:, 0], point_array[:, 1], c=color_array)
fig.canvas.draw()  # render after plotting so the buffer contains the points

# read the rendered RGBA buffer (Agg-based backends) and drop the alpha channel;
# the result has shape (height, width, 3)
img = np.asarray(fig.canvas.buffer_rgba())[..., :3]

So img is an RGB numpy array (or a PIL image, or anything else that can be saved as an image from Python).

Things I've tried

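I've looked into datashader.Canvas.points and how it handles 3-dimensional pandas arrays, and I think I could use color_key with only-red, only-green, and only-blue values and "linear interpolation" between the labels, but I didn't really manage to make that work (I got stuck on the pandas side of things, since I use numpy for pretty much everything).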

Answer 1

I think your code above can be simplified to:

import numpy as np, pandas as pd, matplotlib.pyplot as plt
%matplotlib inline

np.random.seed(0)
n = int(1e+4)
p = np.random.normal(0, 1, [n, 2])
c = np.random.randint(0, 256, [n, 3])/255.0

plt.scatter(p[:,0], p[:,1], c=c);

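It would be nice if Datashader offered a convenient way to work with RGB values (feel free to open an issue requesting that!), but for now you can compute the mean R, G, B values of the points that fall in each pixel: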

import datashader as ds, datashader.transfer_functions as tf

df  = pd.DataFrame.from_dict(dict(x=p[:,0], y=p[:,1], r=c[:,0], g=c[:,1], b=c[:,2]))
cvs = ds.Canvas(plot_width=70, plot_height=40)
a   = cvs.points(df,'x','y', ds.summary(r=ds.mean('r'),g=ds.mean('g'),b=ds.mean('b')))

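The result will be an Xarray Dataset containing r, g, b channels, each on a scale of 0 to 1.0. You can then combine these channels into an image however you like, e.g. using HoloViews: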

import holoviews as hv
hv.extension('bokeh')

hv.RGB(np.dstack([a.r.values, a.g.values, a.b.values])).options(width=450, invert_yaxis=True)
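Since the question asks for a plain RGB numpy array, the three channels can also be stacked without HoloViews. The following is only a minimal sketch: channels_to_rgb is a hypothetical helper name (not part of Datashader), and it assumes that pixels containing no points come back as NaN, which should be checked against the actual aggregate.

```python
import numpy as np

def channels_to_rgb(r, g, b, background=1.0):
    """Stack three 2D float channels (0..1, NaN = empty pixel) into a uint8 RGB image.

    Hypothetical helper for illustration; not a Datashader function.
    """
    rgb = np.dstack([r, g, b]).astype(float)
    # Fill pixels that received no points with the background level (1.0 = white).
    rgb = np.where(np.isnan(rgb), background, rgb)
    # Clamp to 0..1, scale to 0..255, and convert to bytes.
    return (np.clip(rgb, 0.0, 1.0) * 255).round().astype(np.uint8)

# Small stand-in arrays playing the role of a.r.values, a.g.values, a.b.values
r = np.array([[0.0, np.nan], [1.0, 0.5]])
g = np.array([[0.0, np.nan], [0.0, 0.5]])
b = np.array([[1.0, np.nan], [0.0, 0.5]])
img = channels_to_rgb(r, g, b)
```

With the aggregate from above, this would be called as channels_to_rgb(a.r.values, a.g.values, a.b.values). The rows may need flipping with np.flipud before saving, since the HoloViews call above uses invert_yaxis=True for the same data. If Pillow is installed, the resulting array can be written out with PIL.Image.fromarray(img).save(...).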

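Note that Datashader currently supports only infinitely small points, not the disks/filled circles in your Matplotlib example, which is why I'm using such a small resolution (to make the points visible for comparison). Extending Datashader to render shapes with a non-zero extent would be useful, but it's not on the current roadmap.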

Datashader: plotting with manual RGB colors
