Question

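I have a dataset that I extracted and then applied some transformations to. I dropped some irrelevant columns (i.e. some attributes) and converted some of the data to a log scale.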

import numpy as np
import pandas as pd

filename = './Data/forestfires.csv'
df = pd.read_csv(filename)
raw_data = df.to_numpy()  # df.get_values() was removed in pandas 1.0
cols = [0, 1, 8, 9, 10, 11, 12]
tmp = raw_data[:, cols]  # temporary matrix holding only the selected columns
area = np.array(tmp[:, -1], dtype=int)  # last selected column, truncated to integers
area_log = np.log10(area + 1)  # +1 keeps zero areas finite on the log scale
tmp[:, -1] = area_log
X = tmp
attributeNames = np.asarray(df.columns[cols])

This is what I have done so far.

I tried to draw a scatter plot:

import matplotlib.pyplot as plt

plt.title('Forest fire - X coordinate and area scatter plot')
plt.scatter(x=X[:, 0],
            y=X[:, -1],
            s=50, alpha=0.5)
plt.xlabel(attributeNames[0])
plt.ylabel(attributeNames[-1])
plt.show()

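The problem is that this way I only get a single scatter plot, which lets me compare column 0 with column -1. I would like to do this for all the columns contained in X at once.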
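To make the goal concrete, here is a minimal sketch of one possible approach: loop over the columns and draw each one against the last column in a grid of subplots. The helper name `scatter_all_vs_last`, the `ncols` parameter, and the synthetic demo data are assumptions for illustration only, not part of the original code. The cast to `float` is there because `X` built from a mixed-type DataFrame has object dtype.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def scatter_all_vs_last(X, names, ncols=3):
    """Scatter every column of X except the last against the last column."""
    X = np.asarray(X, dtype=float)  # object-dtype arrays are converted to float
    n_plots = X.shape[1] - 1
    nrows = int(np.ceil(n_plots / ncols))
    fig, axes = plt.subplots(nrows, ncols,
                             figsize=(4 * ncols, 3 * nrows), squeeze=False)
    for i, ax in enumerate(axes.flat):
        if i >= n_plots:
            ax.set_visible(False)  # hide unused cells of the grid
            continue
        ax.scatter(X[:, i], X[:, -1], s=50, alpha=0.5)
        ax.set_xlabel(names[i])
        ax.set_ylabel(names[-1])
    fig.tight_layout()
    return fig, axes


# Synthetic stand-in data (hypothetical values, not the real forest fire data)
rng = np.random.default_rng(0)
X_demo = rng.random((50, 7))
names_demo = ["X", "Y", "temp", "RH", "wind", "rain", "area"]
fig, axes = scatter_all_vs_last(X_demo, names_demo)
```

With the real data this would presumably be called as `scatter_all_vs_last(X, attributeNames)`, followed by `plt.show()` on an interactive backend (in which case the `matplotlib.use("Agg")` line should be dropped).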

The histogram has the same problem:

X_area = X[:, -1]
plt.hist(X_area, color='darkred', edgecolor='black')
plt.ylabel('Frequency')
plt.xlabel('Burnt area (hectares)')

Here I would like to make a histogram for every attribute, with the frequency on the y-axis and the attribute on the x-axis.
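As a sketch of what is being asked for, the same subplot-grid idea could be applied to histograms, one panel per attribute. The helper name `hist_all`, its `ncols` and `bins` parameters, and the synthetic demo data are assumptions made for illustration. The colors are taken from the single-histogram code above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def hist_all(X, names, ncols=3, bins=10):
    """Draw one histogram per column of X, labelled with the attribute name."""
    X = np.asarray(X, dtype=float)  # object-dtype arrays are converted to float
    n_plots = X.shape[1]
    nrows = int(np.ceil(n_plots / ncols))
    fig, axes = plt.subplots(nrows, ncols,
                             figsize=(4 * ncols, 3 * nrows), squeeze=False)
    for i, ax in enumerate(axes.flat):
        if i >= n_plots:
            ax.set_visible(False)  # hide unused cells of the grid
            continue
        ax.hist(X[:, i], bins=bins, color='darkred', edgecolor='black')
        ax.set_xlabel(names[i])
        ax.set_ylabel('Frequency')
    fig.tight_layout()
    return fig, axes


# Synthetic stand-in data (hypothetical values, not the real forest fire data)
rng = np.random.default_rng(1)
X_hist_demo = rng.random((50, 7))
names_hist_demo = ["X", "Y", "temp", "RH", "wind", "rain", "area"]
hist_fig, hist_axes = hist_all(X_hist_demo, names_hist_demo)
```

With the real data this would presumably be called as `hist_all(X, attributeNames)`. Seven attributes in a three-column grid leave two empty cells, which the sketch hides.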

You can view the dataset at the following link: https://www.kaggle.com/elikplim/forest-fires-data-set

Thank you very much for your help!

Generating multiple plots at once (histograms and scatter plots), not from "df"

0 answers: