How to update a pyplot histogram

Time: 2015-07-19 22:28:17

Tags: python matplotlib

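I have a data set of 100,000,000 samples that I want to plot as a histogram with pyplot. Reading a file this large exhausts my memory (the cursor stops moving, ...), so I am looking for a way to "help" pyplot.hist. I thought splitting the file into several smaller files might help, but I don't know how to combine the resulting histograms afterwards.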

1 answer:

Answer 0: (score: 2)

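You can combine the output of pyplot.hist or, as @titusjan suggested, numpy.histogram, as long as you keep the bins fixed on every call. With identical bin edges, the per-chunk counts can simply be added bin by bin. For example: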

import matplotlib.pyplot as plt
import numpy as np

# Generate some fake data
data=np.random.rand(1000)

# The fixed bins (change depending on your data)
bins=np.arange(0,1.1,0.1)

sub_hist = []
# Split into 100 sub histograms of 10 samples each
for i in np.arange(0,1000,10):
    sub_hist_temp, bins_out = np.histogram(data[i:i+10],bins=bins)
    sub_hist.append(sub_hist_temp)

# Sum the histograms
hist_sum = np.array(sub_hist).sum(axis=0)

# Plot the new summed data, using plt.bar
fig=plt.figure()
ax1=fig.add_subplot(211)
# align='edge' puts each bar's left edge on its bin edge, matching ax2.hist
ax1.bar(bins[:-1],hist_sum,width=0.1,align='edge') # Change width depending on your bins

# Plot the histogram of all data to check
ax2=fig.add_subplot(212)
hist_all, bins_out, patches = ax2.hist(data,bins=bins)

fig.savefig('histsplit.png')
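
The example above slices an array that is already in memory. For a file too large to load at once, the same idea can be applied while streaming the file. The following is only a minimal sketch, assuming the data is a plain text file with one number per line; read_chunks, chunked_histogram, the file name samples.txt and the chunk size are hypothetical names and values chosen for illustration, not part of matplotlib or numpy.

```python
import os
import tempfile

import numpy as np


def read_chunks(path, chunk_size):
    """Yield arrays of at most chunk_size values from a text file
    that holds one number per line (assumed file layout)."""
    buf = []
    with open(path) as fh:
        for line in fh:
            buf.append(float(line))
            if len(buf) == chunk_size:
                yield np.array(buf)
                buf = []
    if buf:  # leftover values that did not fill a whole chunk
        yield np.array(buf)


def chunked_histogram(chunks, bins):
    """Sum np.histogram counts over all chunks, using the same bins each time."""
    total = np.zeros(len(bins) - 1, dtype=np.int64)
    for chunk in chunks:
        counts, _ = np.histogram(chunk, bins=bins)
        total += counts
    return total


# Small demo: write fake data to a temporary file, then stream it back
rng = np.random.default_rng(0)
data = rng.random(1000)
bins = np.linspace(0.0, 1.0, 11)  # fixed bin edges, chosen up front

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "samples.txt")
    np.savetxt(path, data)
    hist_sum = chunked_histogram(read_chunks(path, chunk_size=64), bins)

# Histogram of all data at once, to check the chunked result
hist_all, _ = np.histogram(data, bins=bins)
print(np.array_equal(hist_sum, hist_all))
```

Only one chunk is held in memory at a time, so the chunk size trades memory use against the number of np.histogram calls. The bin edges have to be chosen before the first chunk is read, which means you need to know (or scan for) the range of the data in advance. The summed counts can then be plotted with ax.bar as shown above.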
