Question

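I have two numpy arrays, x and y, each with 7000 elements. I want to make a scatter plot of them, giving each point a different color according to these conditions: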

-BLACK if x[i]<10.

-RED if x[i]>=10 and y[i]<=-0.5

-BLUE if x[i]>=10 and y[i]>-0.5

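I tried creating a list, the same length as the data, that holds the color I want to assign to each point, and then plotting the data with a loop, but it takes too long to run. Here is my code: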

import numpy as np
import matplotlib.pyplot as plt

#color list with same length as the data
col=[]
for i in range(0,len(x)):
    if x[i]<10:
        col.append('k') 
    elif x[i]>=10 and y[i]<=-0.5:
        col.append('r') 
    else:
        col.append('b') 

#scatter plot
for i in range(len(x)):
    plt.scatter(x[i],y[i],c=col[i],s=5, linewidth=0)

#add horizontal line and invert y-axis
plt.gca().invert_yaxis()
plt.axhline(y=-0.5,linewidth=2,c='k')

Before that, I tried building the same color list in the same way, but plotting the data without a loop:

#scatter plot
plt.scatter(x,y,c=col,s=5, linewidth=0)

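Although this plots the data much faster than the for loop, some of the scatter points come out in the wrong color. Why does plotting the data without a loop give some points an incorrect color?

I also tried defining three sets of data, one per color, and adding them to the plot separately. But that is not what I am looking for.

Is there a way to specify, in the scatter plot parameters, the list of colors I want to use for each point, so that I don't have to use the for loop?

PS: this is the plot I get when I don't use the for loop (the wrong one):

And this is the one I get when I use the for loop (correct):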

Answer 1

This can be done using numpy.where. Since I don't have your exact x and y values, I will have to use some fake data:

import numpy as np
import matplotlib.pyplot as plt

#generate some fake data
x = np.random.random(10000)*10
y = np.random.random(10000)*10

col = np.where(x<1,'k',np.where(y<5,'b','r'))

plt.scatter(x, y, c=col, s=5, linewidth=0)
plt.show()

This produces the following plot:


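The important line is col = np.where(x<1,'k',np.where(y<5,'b','r')). It generates an array the same size as x and y and fills it with 'k', 'b' or 'r', depending on the condition written before each value. So where x is less than 1 the element is 'k'; otherwise, where y is less than 5, it is 'b'; and where neither condition is met it is 'r'. This way you don't have to use a loop to plot the graph.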
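To make the element-wise behaviour easier to see, here is a minimal sketch of the same nested np.where on three hand-picked points. The tiny x and y arrays are made up for illustration only:

```python
import numpy as np

# Three illustrative points (made-up values, not the asker's data)
x = np.array([0.5, 3.0, 7.0])
y = np.array([9.0, 2.0, 8.0])

# Outer condition is checked first; the inner np.where only decides
# the color for elements where x < 1 is False.
col = np.where(x < 1, 'k', np.where(y < 5, 'b', 'r'))

print(col)  # ['k' 'b' 'r']
```

The first point has x below 1, so it is 'k' regardless of y; the other two fall through to the y test.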

For your specific data, you will have to change the values in the np.where conditions.
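As a rough sketch of what that change could look like with the thresholds from the question (black if x < 10, red if x >= 10 and y <= -0.5, blue otherwise): the helper name point_colors, its parameters x_cut and y_cut, and the random stand-in data are all assumptions made for illustration, not part of the original code.

```python
import numpy as np

def point_colors(x, y, x_cut=10, y_cut=-0.5):
    """Return one color code per point (hypothetical helper).

    'k' where x < x_cut,
    'r' where x >= x_cut and y <= y_cut,
    'b' where x >= x_cut and y > y_cut.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    # The inner np.where only matters where x >= x_cut.
    return np.where(x < x_cut, 'k', np.where(y <= y_cut, 'r', 'b'))

# Stand-in data with 7000 points, since the real x and y are not available
rng = np.random.default_rng(0)
x = rng.uniform(0, 20, 7000)
y = rng.uniform(-2, 2, 7000)

col = point_colors(x, y)
```

The resulting col array can then be passed to a single call such as plt.scatter(x, y, c=col, s=5, linewidth=0), in the same way as in the fake-data example.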
