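I picked up a book and am trying to teach myself machine learning. I am visualizing the data to see whether it is suitable for use in machine learning.
My code so far: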
import pandas as pd
import numpy
import mglearn
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
iri = load_iris()
xTrain, xTest, yTrain, yTest = train_test_split(iri['data'], iri['target'], random_state=0)
print(xTrain.shape)
iriFrame = pd.DataFrame(xTrain, columns=iri.feature_names)
pd.plotting.scatter_matrix(iri, c=yTrain, figsize=(15, 15), marker='o', hist_kwds={'bins':20}, s=60, alpha=.8, cmap=mglearn.cm3)
#print('Keys: \n{}'.format(iri.keys()))
#print(iri['data'])
#print(iri['feature_names'])
The error I get:
runfile('/home/jack/Desktop/PythonProjects/code/flowers.py', wdir='/home/jack/Desktop/PythonProjects/code')
(112, 4)
Traceback (most recent call last):
File "<ipython-input-19-b6a377fa4d9d>", line 1, in <module>
runfile('/home/jack/Desktop/PythonProjects/code/flowers.py', wdir='/home/jack/Desktop/PythonProjects/code')
File "/usr/lib/python3/dist-packages/spyder/utils/site/sitecustomize.py", line 705, in runfile
execfile(filename, namespace)
File "/usr/lib/python3/dist-packages/spyder/utils/site/sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "/home/jack/Desktop/PythonProjects/code/flowers.py", line 13, in <module>
pd.plotting.scatter_matrix(iri, c=yTrain, figsize=(15, 15), marker='o', hist_kwds={'bins':20}, s=60, alpha=.8, cmap=mglearn.cm3)
File "/usr/local/lib/python3.6/dist-packages/pandas/plotting/_misc.py", line 56, in scatter_matrix
df = frame._get_numeric_data()
File "/usr/local/lib/python3.6/dist-packages/sklearn/utils/__init__.py", line 104, in __getattr__
raise AttributeError(key)
AttributeError: _get_numeric_data
This looks like an error related to how my packages are installed, but I'm not sure. Can anyone offer some advice on what is going on?
Answer 0 (score: 1)
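pd.plotting.scatter_matrix() expects a DataFrame as its first argument, but iri is the Bunch object returned by load_iris(), which is why the lookup of _get_numeric_data fails. Use iriFrame instead of iri: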
pd.plotting.scatter_matrix(iriFrame, c=yTrain, figsize=(15, 15), marker='o', hist_kwds={'bins':20}, s=60, alpha=.8, cmap=mglearn.cm3)
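To see the difference for yourself, here is a minimal sketch that checks the type of iri and then plots from iriFrame. It makes two assumptions that are not in the original script: the Agg backend is selected so the code runs without a display, and the cmap=mglearn.cm3 argument is left out so mglearn is not required (the default colormap is used instead).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed

import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iri = load_iris()
xTrain, xTest, yTrain, yTest = train_test_split(
    iri['data'], iri['target'], random_state=0
)

# load_iris() returns a dict-like Bunch, not a DataFrame
is_frame = isinstance(iri, pd.DataFrame)
print(type(iri).__name__, is_frame)

# Wrap the training features in a DataFrame; this is what scatter_matrix expects
iriFrame = pd.DataFrame(xTrain, columns=iri.feature_names)

# Same call as in the answer, minus cmap=mglearn.cm3 (see the assumption above)
axes = pd.plotting.scatter_matrix(
    iriFrame, c=yTrain, figsize=(15, 15), marker='o',
    hist_kwds={'bins': 20}, s=60, alpha=.8
)
print(axes.shape)  # one subplot per pair of feature columns
```

If you have mglearn installed, you can add cmap=mglearn.cm3 back to get the book's color scheme, and call matplotlib.pyplot.show() with an interactive backend to display the figure.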