The code sample below works fine and generates a bunch of plots.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import datasets
iris = datasets.load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
# sklearn provides the iris species as integer values since this is required for classification
# here we're just adding a column with the species names to the dataframe for visualisation
df['species'] = np.array([iris.target_names[i] for i in iris.target])
sns.pairplot(df, hue='species')
The following three lines correctly handle all of the data in the data frame.
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['species'] = np.array([iris.target_names[i] for i in iris.target])
sns.pairplot(df, hue='species')
list(df)
['sepal length (cm)',
'sepal width (cm)',
'petal length (cm)',
'petal width (cm)']
Now I am loading my own data into a data frame and trying to do the same thing, like this.
df = pd.read_csv('C:\\path_to_data\\test.csv')
df1 = df[df['officearea']!=0]
df1.shape
list(df1)
df1 = pd.DataFrame(df1.data, columns=df1.feature_names)
list(df1)
At this point, with my data, I get the following error.
AttributeError: 'DataFrame' object has no attribute 'data'
On my dataset, when I run the following line:
list(df1)
I see:
['index',
'zone',
'lot',
etc., etc., etc.,
'address',
'sensor',
'map']
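
For comparison, here is a minimal sketch of what probably differs between the two cases. `iris` is a scikit-learn dataset object that exposes `.data` and `.feature_names`, whereas the result of `pd.read_csv` is already a DataFrame and has neither attribute. The inline CSV below is a hypothetical stand-in for test.csv: its column names echo the ones listed above, its values are made up, and using `zone` as the hue column is only an assumption.

```python
import io
import pandas as pd

# Hypothetical stand-in for test.csv; the values are invented for illustration.
csv_text = """zone,lot,officearea
A,1,0
B,2,1500
C,3,320
"""
df = pd.read_csv(io.StringIO(csv_text))

# Filtering a DataFrame returns another DataFrame, so nothing needs rebuilding.
df1 = df[df['officearea'] != 0]

# A plain DataFrame has no .data attribute unless a column happens to be named 'data'.
has_data_attr = hasattr(df1, 'data')

# list(df1) yields the column names, the same as list(df1.columns).
columns = list(df1)

print(has_data_attr)
print(columns)

# With seaborn installed, the filtered frame could be passed straight to pairplot:
# sns.pairplot(df1, hue='zone')  # 'zone' as the hue column is an assumption
```

If this holds for the real file, the `pd.DataFrame(df1.data, columns=df1.feature_names)` step is likely unnecessary, since `df1` already carries its own column names.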