Question

My code plots a 2D histogram of my dataset. I plot it like this:

histogram = plt.hist2d(fehsc, ofesc, bins=nbins, range=[[-1,.5],[0.225,0.4]])

I only want to look at the data above a certain line, so I added the following, and it works fine:

counts = histogram[0]
xpos = histogram[1]
ypos = histogram[2]
image = histogram[3]
newcounts = counts #we're going to iterate over this

for i in range (nbins):
    xin = xpos[i]
    yin = ypos
    yline = m*xin + b
    reset = np.where(yin < yline) #anything less than yline we want to be 0
    #index = index[0:len(index)-1]  
    countout = counts[i]
    countout[reset] = 0
    newcounts[i] = countout

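However, I now need to fit a regression line to that cut region. That isn't possible with plt.hist2d (AFAIK), so I'm using plt.scatter instead. The problem is that I don't know how to make the cut: I can't index into the scatter plot.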

This is what I have now:

plt.xlim(-1,.5)
plt.ylim(.225, .4)

scatter = plt.scatter(fehsc,ofesc, marker = ".")

I only want to keep the data above a certain line:

xarr = np.arange(-1,0.5, 0.015)
yarr = m*xarr + b
plt.plot(xarr, yarr, color='r')

I tried running a loop over some variations of these variables, but I don't really understand how to make that work.

Answer 1

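You can define a mask for your data before plotting and then plot only the data points that actually satisfy your condition. In the example below, all data points above a certain line are plotted in green and all data points below the line are plotted in black.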

from matplotlib import pyplot as plt
import numpy as np

#the scatterplot data
xvals = np.random.rand(100)
yvals = np.random.rand(100)

#the line
b  = 0.1
m = 1
x = np.linspace(0,1,num=100)
y = m*x+b

mask = yvals > m*xvals+b

plt.scatter(xvals[mask],yvals[mask],color='g')
plt.scatter(xvals[~mask],yvals[~mask],color='k')
plt.plot(x,y,'r')
plt.show()

The result looks like this:

Hope this helps.
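Since the question also asks for a regression line through the points that survive the cut, the same mask can be reused for the fit. The following is only a minimal sketch: it assumes that "regression line" means an ordinary least-squares straight line (computed here with np.polyfit), and the helper names fit_above_line and plot_fit are made up for illustration.

```python
import numpy as np

def fit_above_line(x, y, m, b):
    """Keep only the points above y = m*x + b and fit a straight line to them.

    Returns (slope, intercept, mask); mask is True for the points that were kept.
    Hypothetical helper, not part of matplotlib or numpy.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mask = y > m * x + b                                 # same condition as the mask above
    slope, intercept = np.polyfit(x[mask], y[mask], 1)   # degree-1 least-squares fit
    return slope, intercept, mask

def plot_fit(x, y, m, b):
    """Scatter the kept points and draw the cut line and the fitted line."""
    from matplotlib import pyplot as plt   # imported here so the fit itself only needs numpy
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept, mask = fit_above_line(x, y, m, b)
    xs = np.linspace(x.min(), x.max(), 100)
    plt.scatter(x[mask], y[mask], marker=".")
    plt.plot(xs, m * xs + b, "r")                # the cut line
    plt.plot(xs, slope * xs + intercept, "b")    # regression through the kept points
    plt.show()
```

With the arrays from the question this would be called as plot_fit(fehsc, ofesc, m, b). Because the cut is applied to the raw arrays, there is no need to index into the object returned by plt.scatter.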

Edit:

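If you want to produce a 2D histogram in which the part below the line is set to zero, you can first generate the histogram with numpy (as an array) and then set the values in that array to zero wherever the bins fall below the line. After that, you can plot the matrix with plt.pcolormesh: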
from matplotlib import pyplot as plt
import numpy as np

#the scatterplot data
xvals = np.random.rand(1000)
yvals = np.random.rand(1000)
histogram,xbins,ybins = np.histogram2d(xvals,yvals,bins=50)
#histogram2d indexes the counts as [x,y]; transpose to [y,x] so that
#they match the meshgrid mask and the layout pcolormesh expects
histogram = histogram.T

#computing the bin centers from the bin edges:
xcenters = 0.5*(xbins[:-1]+xbins[1:])
ycenters = 0.5*(ybins[:-1]+ybins[1:])

#the line
b  = 0.1
m = 1
x = np.linspace(0,1,num=100)
y = m*x+b

#hiding the part of the histogram below the line
xmesh,ymesh = np.meshgrid(xcenters,ycenters)
mask = m*xmesh+b > ymesh
histogram[mask] = 0

#making the plot (the bin edges give each cell its exact extent)
mat = plt.pcolormesh(xbins,ybins,histogram)
line = plt.plot(x,y,'r')
plt.xlim([0,1])
plt.ylim([0,1])
plt.show()

The result looks like this:
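As an alternative to zeroing bins after the fact, the cut can be applied to the raw points before they are histogrammed. This is only a sketch under that assumption, and the helper name histogram_above_line is invented for illustration. The result differs slightly from the bin-center approach: a bin that straddles the line keeps the points that lie above it, rather than being kept or zeroed as a whole.

```python
import numpy as np

def histogram_above_line(x, y, m, b, bins, extent=None):
    """2D histogram of only those points that lie above y = m*x + b.

    Hypothetical helper. `extent` is passed on as the `range` argument of
    np.histogram2d, e.g. [[xmin, xmax], [ymin, ymax]].
    Returns (counts, xedges, yedges) with counts indexed as [x, y].
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = y > m * x + b          # per-point cut, applied before binning
    return np.histogram2d(x[keep], y[keep], bins=bins, range=extent)
```

The same boolean array can also be passed straight to the plotting call from the question, for example plt.hist2d(fehsc[keep], ofesc[keep], bins=nbins, range=[[-1,.5],[0.225,0.4]]), assuming fehsc and ofesc are numpy arrays. If the counts returned here are drawn with plt.pcolormesh, they need to be transposed first.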

Removing data below a line in a scatter plot (Python)
