Question

Given the following line:

plt.scatter(X[:, 0], X[:, 1], s=50);

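What does X[:, 0], X[:, 1] mean? In all the examples I have looked at, I only see X, y.

I also don't understand the purpose of X, y =.

Below is the output of X, which contains both the x and y values. But y on its own has a different output, and I don't know where to use it or why.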

array([[ 1.85219907,  1.10411295],
       [-1.27582283,  7.76448722],
       [ 1.0060939 ,  4.43642592],
       [-1.20998253,  7.83203579],
       [ 1.92461484,  1.06347673],
       [ 2.28565919,  0.79166208],
       [-1.57379043,  2.69773813],
       [ 1.04917913,  4.31668562],
       [-1.07436851,  7.93489945],
       [-1.15872975,  7.97295642]

Full script below:

#import the required libraries
# - matplotlib is a charting library
# - Seaborn builds on top of Matplotlib and introduces additional plot types. It also makes your traditional Matplotlib plots look a bit prettier.
# - Numpy is numerical Python
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
#Generate sample data, with distinct clusters for testing
#n_samples = the number of datapoints, split equally across the clusters
#centers = The number of centers to generate (number of clusters) - a center is the arithmetic mean of all the points belonging to the cluster.
#cluster_std = the standard deviation of the clusters - a quantity expressing by how much the members of a group differ from the mean value for the group (how tight is the cluster going to be)
#random_state = seeds the random number generator. If you omit random_state, different points are generated every time you run the code. With a fixed value (random_state=0 or any other integer) the same points are generated on every run.
X, y = make_blobs(n_samples=300, centers=4,
                       cluster_std=0.50, random_state=0)
#The statement below lets us visualise matplotlib charts in IPython; the next two comment lines are the output it printed
#Using matplotlib backend: MacOSX
#Populating the interactive namespace from numpy and matplotlib
%pylab
#plot the chart
#s = the size of the points.
plt.scatter(X[:, 0], X[:, 1], s=50);

Answer 1

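make_blobs generates "isotropic Gaussian blobs". X is a numpy array with two columns containing the (x, y) coordinates of the points, which are drawn from Gaussian distributions, whereas y contains the class label of each point.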

In[1]:  X.shape
Out[1]: (300, 2)
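
To see what the two return values look like side by side, here is a minimal sketch that repeats the call from the question and inspects both arrays. It assumes a scikit-learn version in which make_blobs can be imported from sklearn.datasets.

```python
import numpy as np
from sklearn.datasets import make_blobs

# Same call as in the question
X, y = make_blobs(n_samples=300, centers=4,
                  cluster_std=0.50, random_state=0)

print(X.shape)       # (300, 2): one row per point, one column per coordinate
print(y.shape)       # (300,): one integer label per point
print(np.unique(y))  # [0 1 2 3]: the four cluster labels
```

So X holds the positions and y says which of the four blobs each position was drawn from.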

X[:, 0] is the numpy way of selecting every row entry for column 0, i.e. a single column from the numpy array.
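
The indexing can be tried on a tiny array. The sketch below uses a made-up 3x2 array named pts (not data from the question) so the selected values are easy to follow.

```python
import numpy as np

# A small stand-in for X: 3 points, 2 coordinates each (illustrative values)
pts = np.array([[1.0, 10.0],
                [2.0, 20.0],
                [3.0, 30.0]])

first_col = pts[:, 0]   # all rows, column 0
second_col = pts[:, 1]  # all rows, column 1
first_row = pts[0, :]   # row 0, all columns

print(first_col)   # [1. 2. 3.]
print(second_col)  # [10. 20. 30.]
print(first_row)   # [ 1. 10.]
```

The colon means "everything along this axis", so the part before the comma selects rows and the part after it selects columns.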

The clusters are easier to see if you plot the coordinates. Your code appears to be missing

plt.show()

which displays the plot.

[Figure: make_blobs plot]

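If you plot one of these columns against y, you can see more clearly that the points are classified according to their coordinates, although that plot is not particularly useful on its own.

[Figure: X[:, 0] plotted against y]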
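
One common use for y, offered here as a suggestion rather than something from the original script, is to colour each point by its label. The helper name scatter_by_label and the choice of the "viridis" colormap are illustrative assumptions.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

# Same data as in the question
X, y = make_blobs(n_samples=300, centers=4,
                  cluster_std=0.50, random_state=0)

def scatter_by_label(X, y):
    """Scatter the two columns of X, colouring each point by its label in y."""
    fig, ax = plt.subplots()
    # c=y maps each integer label to a colour taken from the colormap
    points = ax.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap="viridis")
    return fig, points

fig, points = scatter_by_label(X, y)
```

In a plain script, call plt.show() afterwards to display the figure. Each of the four blobs should then appear in its own colour.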

Answer 2

X is a 2D numpy array. X[:,0] accesses everything in the first column, and X[:,1] accesses everything in the second column.

For your plt.scatter statement, both the "x" and the "y" of the chart come from X.

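X, y = simply means that the output of make_blobs() has two elements, which are assigned to X and y respectively. Because of the names given to those variables, the association with the "x" and "y" of the scatter plot is a little confusing. The plot's "x" and "y" can be any variables or, as in this case, separately indexed columns of a single 2D numpy array.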
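
The unpacking can be shown without scikit-learn. In the sketch below, make_points is a hypothetical stand-in for make_blobs that returns a (coordinates, labels) pair. The values are made up.

```python
import numpy as np

def make_points():
    """Hypothetical stand-in for make_blobs: returns (coordinates, labels)."""
    coords = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])
    labels = np.array([0, 1, 0])
    return coords, labels

# Tuple unpacking: the first returned element goes to X, the second to y
X, y = make_points()

# The names are arbitrary; this does the same thing with different names
result = make_points()
points, classes = result[0], result[1]

# The scatter plot's horizontal and vertical values both come from X
horizontal = X[:, 0]
vertical = X[:, 1]
print(horizontal.tolist(), vertical.tolist())  # [0.0, 2.0, 4.0] [1.0, 3.0, 5.0]
```

Here y is never passed to the plot at all, which is why the variable name y and the plot's vertical axis are unrelated.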

X[:, 0] in a matplotlib scatter plot
