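I am following the book Building Machine Learning Systems with Python. After loading the dataset from sklearn, I need to extract the indices of all the features that belong to setosa, but I am unable to extract them. Perhaps it is because I am not using NumPy arrays. Can someone help me extract the index numbers? Code: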
from matplotlib import pyplot as plt
from sklearn.datasets import load_iris
import numpy as np
# We load the data with load_iris from sklearn
data = load_iris()
features = data['data']
feature_names = data['feature_names']
target = data['target']
for t, marker, c in zip(range(3), ">ox", "rgb"):
    # We plot each class on its own to get different colored markers
    plt.scatter(features[target == t, 0], features[target == t, 1],
                marker=marker, c=c)
plength = features[:, 2]
# use numpy operations to get setosa features
is_setosa = (labels == 'setosa')
# This is the important step:
max_setosa = plength[is_setosa].max()
min_non_setosa = plength[~is_setosa].min()
print('Maximum of setosa: {0}.'.format(max_setosa))
print('Minimum of others: {0}.'.format(min_non_setosa))
Answer 0 (score: 0)
Define labels before the failing line:
target_names = data['target_names']
labels = target_names[target]
Now these lines work:
is_setosa = (labels == 'setosa')
setosa_petal_length = plength[is_setosa]
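To see why this works, here is a minimal sketch on toy arrays. The values below are made up for illustration and are not the real iris data; only the names target_names, target, labels, is_setosa and plength come from the code above. Indexing the array of names with the array of integer codes gives one name per sample, and comparing that with a string gives a boolean mask.

```python
import numpy as np

# Toy stand-ins for the iris arrays (made-up values, not the real data)
target_names = np.array(['setosa', 'versicolor', 'virginica'])
target = np.array([0, 0, 1, 2, 1, 0])
plength = np.array([1.4, 1.3, 4.7, 6.0, 4.5, 1.5])

# Integer-array indexing: each code in target picks the matching name
labels = target_names[target]

# Boolean mask with one entry per sample
is_setosa = (labels == 'setosa')

# The mask selects setosa rows; ~ inverts it to select everything else
max_setosa = plength[is_setosa].max()
min_non_setosa = plength[~is_setosa].min()
print('Maximum of setosa: {0}.'.format(max_setosa))
print('Minimum of others: {0}.'.format(min_non_setosa))
```

With the real data, the same lines should work unchanged once labels is defined from data['target_names'] and data['target'].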
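Addendum. The data bunch from sklearn (data = load_iris()) contains a target array of the integers 0-2, which line up with the rows of features and indicate the flower species. With it you can extract all features belonging to setosa (target equal to 0), for example: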
petal_length = features[:, 2]
setosa_petal_length = petal_length[target == 0]
If you would rather use data['target_names'], the two lines of code at the top of this answer solve your problem. By the way, all the arrays in data are NumPy arrays.
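If you need the actual index numbers rather than a boolean mask, one option is np.flatnonzero. This is only a sketch on toy arrays: the target and features values are made up, and the names setosa_idx and setosa_features are hypothetical. It assumes, as described above, that setosa is encoded as 0.

```python
import numpy as np

# Toy stand-ins for data['target'] and data['data'] (made-up values)
target = np.array([0, 0, 1, 2, 1, 0])
features = np.arange(24).reshape(6, 4)

# Positions where the condition holds, as a 1-D integer array
setosa_idx = np.flatnonzero(target == 0)
print(setosa_idx)

# Indexing with the positions selects the same rows as the boolean mask
setosa_features = features[setosa_idx]
print(setosa_features.shape)
```

np.where(target == 0)[0] should give the same positions; the boolean mask is usually enough when you only need the values.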