Here is the code I wrote to display histograms of my data:
from scipy.stats import norm
rdd = sc.parallelize([(0,1), (0,1), (0,2), (1,2), (1,10), (1,20), (3,18), (3,18), (3,18)])
dataframe = sqlContext.createDataFrame(rdd, ["p1", "p2"])
for col in dataframe.columns:
    # density=True replaces normed=True, which was removed in Matplotlib 3.1
    dataframe.toPandas()[col].plot(kind='hist', density=True)
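This displays the histograms of all columns overlaid on a single plot.
How can I generate a new histogram for each column inside the for col ... loop, instead of having every column drawn on top of the others in the same plot?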
Answer 0 (score: 1)
You need to give it a new figure (or at least a new axes) on each iteration:
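import matplotlib.pyplot as plt
# Convert once, instead of collecting the Spark DataFrame on every iteration
pdf = dataframe.toPandas()
for col in pdf.columns:
    # A fresh figure and axes per column, so the histograms no longer overlap
    fig, ax = plt.subplots(1, 1)
    # density=True replaces normed=True, which was removed in Matplotlib 3.1
    pdf[col].plot(kind='hist', density=True, ax=ax)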
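If you would rather keep everything in one figure, the "at least a new axes" variant could look roughly like the sketch below. It is only an illustration under a few assumptions: the pandas DataFrame `pdf` is built directly from the question's rows as a stand-in for `dataframe.toPandas()` so the snippet runs without Spark, the `Agg` backend is chosen only so it also runs without a display, and the figure size and titles are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for dataframe.toPandas(); same rows as in the question
pdf = pd.DataFrame(
    [(0, 1), (0, 1), (0, 2), (1, 2), (1, 10), (1, 20), (3, 18), (3, 18), (3, 18)],
    columns=["p1", "p2"],
)

# One figure with one axes per column, laid out side by side.
# squeeze=False keeps `axes` two-dimensional even with a single column.
n_cols = len(pdf.columns)
fig, axes = plt.subplots(1, n_cols, figsize=(5 * n_cols, 4), squeeze=False)

for ax, col in zip(axes[0], pdf.columns):
    # Each column is drawn on its own axes, so nothing is overlaid
    pdf[col].plot(kind="hist", density=True, ax=ax, title=col)

fig.tight_layout()
```

Call `plt.show()` (interactive backend) or `fig.savefig(...)` afterwards to see the result. Separate figures, as in the loop above, are handy when each plot is saved to its own file; a shared figure makes it easier to compare the columns at a glance.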