Suppose I have two lists:
x1 = [1,2,3,4,5,6,7,8,1,10]
x2 = [2,4,2,1,1,1,1,1,2,1]
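Here, each index i of the lists is a point in time, and x2[i] is the number of times (the frequency) that x1[i] was observed at time i. Also note that x1[0] = 1 and x1[8] = 1, so the value 1 has a total frequency of 4 (= x2[0] + x2[8]).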
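How do I convert this into a histogram efficiently? The simple approach is to expand the data into a third list with a loop, but that is probably inefficient (it creates a third object and loops), and it will hurt because my data is huge.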
Answer 0 (score: 3)
The best way to do this is to use the weights kwarg of np.histogram (see its documentation), which will also handle arbitrary bin sizes and non-integer values in x1:
vals, bins = np.histogram(x1, bins=10, weights=x2)
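With bins=10, NumPy splits the range from min(x1) to max(x1) into ten equal bins, so the edges do not line up with the integer values. Here is a minimal sketch that passes explicit edges instead; the half-integer edges are my assumption about the binning you want, not something from the question.

```python
import numpy as np

x1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 1, 10])
x2 = np.array([2, 4, 2, 1, 1, 1, 1, 1, 2, 1])

# Assumed binning: one unit-wide bin centred on each integer 1..10.
edges = np.arange(0.5, 11.5, 1.0)  # 0.5, 1.5, ..., 10.5

# Each x1[i] contributes x2[i] to its bin instead of 1.
counts, _ = np.histogram(x1, bins=edges, weights=x2)
print(counts)
```

The value 1 ends up with a total of 4 (2 from index 0 plus 2 from index 8), and the empty bin for 9 stays at 0.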
If you only need to accumulate by integer value, you can build the histogram in a single pass:
new_array = np.zeros(len(x2))  # or use a list, but I like numpy and you have it
for ind, w in zip(x1, x2):
    # -1 because your events seem to start at 1, not 0
    new_array[ind - 1] += w
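If the values really are non-negative integers, the same accumulation can probably be done without a Python-level loop via np.bincount and its weights argument. This is only a sketch under that assumption; the name totals is mine.

```python
import numpy as np

x1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 1, 10])
x2 = np.array([2, 4, 2, 1, 1, 1, 1, 1, 2, 1])

# bincount sums the weights of equal non-negative integers;
# index k of the result holds the total weight for value k.
totals = np.bincount(x1, weights=x2)

# Slot 0 is unused here because the smallest value is 1.
print(totals[1:])
```

Note that the result has max(x1) + 1 entries, so this only makes sense when the largest value is not enormous compared with the number of observations.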
If you really want to do this with lists, you can use a list comprehension:
[_x for val, w in zip(x1, x2) for _x in [val]*w]
which returns
[1, 1, 2, 2, 2, 2, 3, 3, 4, 5, 6, 7, 8, 1, 1, 10]
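The expanded list has sum(x2) elements, which is exactly the cost the question wants to avoid. If you want to stay in pure Python without expanding anything, one option (my suggestion, not part of the approach above) is to accumulate the frequencies per value with collections.Counter:

```python
from collections import Counter

x1 = [1, 2, 3, 4, 5, 6, 7, 8, 1, 10]
x2 = [2, 4, 2, 1, 1, 1, 1, 1, 2, 1]

# Sum the observed frequency for each distinct value.
totals = Counter()
for value, weight in zip(x1, x2):
    totals[value] += weight

print(sorted(totals.items()))
```

This keeps one entry per distinct value, so memory grows with the number of distinct values rather than with the total count.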
As a side note, it is worth understanding how to compute a histogram efficiently by hand:
import numpy as np

num_new_bins = 5
new_min = 0
new_max = 10
re_binned = np.zeros(num_new_bins)
for v, w in zip(x1, x2):
    # figure out which new bin the value should go into
    ind = int(num_new_bins * (v - new_min) / (new_max - new_min))
    # make sure the value really falls into the new range
    if v < new_min or ind >= num_new_bins:
        # underflow or overflow: skip values outside [new_min, new_max)
        continue
    # add the weight to the proper bin
    re_binned[ind] += w
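For large arrays the same re-binning idea can be sketched without the loop. This assumes half-open bins [new_min, new_max), so a value equal to new_max is dropped, whereas np.histogram counts it in the last bin. The comparison against np.histogram is only there to make that difference visible.

```python
import numpy as np

x1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 1, 10])
x2 = np.array([2, 4, 2, 1, 1, 1, 1, 1, 2, 1])

num_new_bins, new_min, new_max = 5, 0, 10

# Bin index for every value at once.
ind = np.floor(num_new_bins * (x1 - new_min) / (new_max - new_min)).astype(int)

# Keep only values that fall inside [new_min, new_max).
keep = (ind >= 0) & (ind < num_new_bins)
re_binned = np.bincount(ind[keep], weights=x2[keep], minlength=num_new_bins)

# np.histogram closes the last bin on the right, so the value 10 is counted there.
reference, _ = np.histogram(
    x1, bins=num_new_bins, range=(new_min, new_max), weights=x2
)

print(re_binned)
print(reference)
```

The two results agree except in the last bin, where np.histogram also includes the single observation of 10.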
Answer 1 (score: 1)
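It looks like there is a problem with your binning. The count for 2 should be 4, shouldn't it? Here is some code. It does create one extra array, but it only makes a single pass over the data and sizes the array dynamically. Hope it helps.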
import numpy as np
import matplotlib.pyplot as plt
x1 = [1,2,3,4,5,6,7,8,1,10]
x2 = [2,4,2,1,1,1,1,1,2,1]
#your method
x3 = []
for i in range(10):
    for j in range(x2[i]):
        x3.append(i)
plt.subplot(1,2,1)
hist, bins = np.histogram(x1,bins = 10)
width = 0.7*(bins[1]-bins[0])
center = (bins[:-1]+bins[1:])/2
plt.bar(center, hist, align = 'center', width = width)
plt.title("Posted Method")
#plt.show()
#New Method
new_array=np.zeros(len(x1))
for count,p in enumerate(x1):
    new_array[p-1]+=x2[count]
plt.subplot(1,2,2)
hist, bins = np.histogram(x1,bins = 10)
width = 0.7*(bins[1]-bins[0])
center = (bins[:-1]+bins[1:])/2
plt.bar(center, new_array, align = 'center', width = width)
plt.title("New Method")
plt.show()
The output is a figure with two side-by-side bar charts, titled "Posted Method" and "New Method".
Answer 2 (score: -1)
One approach is to use x3 = np.repeat(x1, x2) and build the histogram from x3.
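A small sketch of what that looks like, with unit-wide bin edges that I am assuming for illustration. Keep in mind that x3 has sum(x2) elements, so for huge data this expansion can be expensive; the weighted call is shown alongside only to check that both give the same counts.

```python
import numpy as np

x1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 1, 10])
x2 = np.array([2, 4, 2, 1, 1, 1, 1, 1, 2, 1])

# Repeat each value x1[i] exactly x2[i] times.
x3 = np.repeat(x1, x2)

# Assumed binning: one unit-wide bin per integer 1..10.
edges = np.arange(0.5, 11.5, 1.0)

expanded, _ = np.histogram(x3, bins=edges)
weighted, _ = np.histogram(x1, bins=edges, weights=x2)

print(x3)
print(expanded)
```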