Iris classification in Python using NumPy operations

Time: 2014-11-29 05:54:42

Tags: python machine-learning scikit-learn

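I am following the book Building Machine Learning Systems with Python. After loading the dataset with load_iris from sklearn, I need to extract the indices of all the features that belong to setosa, but I cannot extract them. Maybe it is because I am not using NumPy arrays. Can someone help me extract the index numbers? Code: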

from matplotlib import pyplot as plt
from sklearn.datasets import load_iris

import numpy as np
# We load the data with load_iris from sklearn

data = load_iris()

features = data['data']

feature_names = data['feature_names']

target = data['target']
for t, marker, c in zip(range(3), ">ox", "rgb"):
    # We plot each class on its own to get different colored markers
    plt.scatter(features[target == t, 0], features[target == t, 1],
                marker=marker, c=c)

plength = features[:, 2]

# use numpy operations to get setosa features

is_setosa = (labels == 'setosa')

# This is the important step:

max_setosa = plength[is_setosa].max()

min_non_setosa = plength[~is_setosa].min()

print('Maximum of setosa: {0}.'.format(max_setosa))

print('Minimum of others: {0}.'.format(min_non_setosa))

1 answer:

Answer 0 (score: 0)

Define labels before the line that fails.

target_names = data['target_names']
labels = target_names[target]

Now these lines work:

is_setosa = (labels == 'setosa')
setosa_petal_length = plength[is_setosa]
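
Since the question asks for index numbers rather than the values themselves, here is a minimal sketch of how the mask could be turned into indices. The small arrays are toy stand-ins for data['target_names'] and data['target'], not the real iris data, and the name setosa_indices is only illustrative.

```python
import numpy as np

# Toy stand-ins for data['target_names'] and data['target'] (assumed values).
target_names = np.array(['setosa', 'versicolor', 'virginica'])
target = np.array([0, 0, 1, 2, 0, 1])

# Indexing with an integer array maps each class id to its name.
labels = target_names[target]

# Boolean mask: True where the sample is setosa.
is_setosa = (labels == 'setosa')

# Positions of the True entries, i.e. the row indices of the setosa samples.
setosa_indices = np.flatnonzero(is_setosa)
print(setosa_indices)
```

The same indices could also be obtained with np.where(is_setosa)[0]; flatnonzero is just a shorter spelling for the one-dimensional case.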

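Additional note. The data bunch from sklearn (data = load_iris()) contains a target array of the integers 0-2, which line up with the rows of features and denote the flower species. You can use it to extract all features that belong to setosa (target equal to 0), for example: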

petal_length = features[:, 2]
setosa_petal_length = petal_length[target == 0]
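
The same mask can select whole rows and reproduce the max/min comparison from the question without going through labels. The following is only a sketch: the features array holds a few hand-typed rows in the iris column layout, and it assumes setosa is class 0, as stated above.

```python
import numpy as np

# Hand-typed rows in the iris layout (sepal length, sepal width,
# petal length, petal width); an assumption, not the full dataset.
features = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [7.0, 3.2, 4.7, 1.4],
    [4.9, 3.0, 1.4, 0.2],
    [6.3, 3.3, 6.0, 2.5],
])
target = np.array([0, 1, 0, 2])

# A boolean mask on the first axis keeps every column of the matching rows.
setosa_features = features[target == 0]

# Petal length is column 2; compare setosa against everything else.
petal_length = features[:, 2]
max_setosa = petal_length[target == 0].max()
min_non_setosa = petal_length[target != 0].min()

print('Maximum of setosa: {0}.'.format(max_setosa))
print('Minimum of others: {0}.'.format(min_non_setosa))
```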

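If you would rather work with data['target_names'], the two lines of code at the top of this answer solve your problem. By the way, all the arrays in data are NumPy arrays.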
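
On the worry about not using NumPy arrays: the comparison with 'setosa' only produces a per-element mask when the left-hand side is a NumPy array. A small sketch with made-up names (names, list_result, array_result) shows the difference.

```python
import numpy as np

names = ['setosa', 'versicolor', 'setosa']

# A plain list compared with a string gives one bool for the whole list.
list_result = (names == 'setosa')

# A NumPy array compared with a string is evaluated element by element.
array_result = (np.array(names) == 'setosa')

print(list_result)
print(array_result)
```

Because load_iris already returns NumPy arrays, the mask in the question is fine once labels is defined.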