Say I have two data sets and plot a stacked histogram of them with some weights. Is there a way to get the total bin count for the data elements larger than a certain number (that is, to the right of a given x-coordinate)? To illustrate the question, I have done the following:
import matplotlib.pyplot as plt
import numpy as np
data1 = np.random.normal(0, 0.6, 1000)
data2 = np.random.normal(0, 1.4, 1000)
weight1 = np.array([0.5] * len(data1))
weight2 = np.array([0.9] * len(data2))
hist = plt.hist((data1, data2), weights=(weight1, weight2), stacked=True, range=(-5, 5))
plt.show()
Now, how can I find the bin counts above, say, x = -2?
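So far, to get the answer, I am doing the following:
n1, _, _ = plt.hist((data1, data2), weights=(weight1, weight2), stacked=False, range=(-2, 10000))
bin_counts = sum(sum(n1))
print(bin_counts)
Here I set the upper end of the range to a very large number so that I get all the bin counts for x = -2 and above.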
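Is there a more efficient way than this?
Also, how can I get bin_counts for a variable x, where x goes from the minimum x-coordinate to the maximum x-coordinate in some steps?
Any help would be greatly appreciated!
Thanks a lot!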
Answer 0 (score: 0)
You can do the following:
# In your case n is a sequence of arrays, one per data set, because you have 2 histograms.
# Use stacked=False here: with stacked=True the returned arrays are cumulative across data sets.
n, bins, _ = plt.hist(...)
# For each data set, keep the counts of the bins whose left edge is at or above x
n_over_x = [[val for val, left_edge in zip(selected_cnt, bins) if left_edge >= x] for selected_cnt in n]
# Sum up the list of lists
result = sum(sum(part_list) for part_list in n_over_x)
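A self-contained sketch of the same idea, assuming np.histogram is used in place of plt.hist so that no figure is drawn. The helper name count_above, its parameters, and the fixed random seed are illustrative choices, not part of the question. Because the bins are weighted, the result can be cross-checked by summing the weights of the samples directly.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
data1 = rng.normal(0, 0.6, 1000)
data2 = rng.normal(0, 1.4, 1000)
weight1 = np.full(len(data1), 0.5)
weight2 = np.full(len(data2), 0.9)


def count_above(x, datasets, weights, bins=10, value_range=(-5, 5)):
    """Weighted total of all bins whose left edge is at or above x (hypothetical helper)."""
    total = 0.0
    for data, w in zip(datasets, weights):
        counts, edges = np.histogram(data, bins=bins, range=value_range, weights=w)
        # edges has one more entry than counts; edges[:-1] are the left edges
        total += counts[edges[:-1] >= x].sum()
    return total


binned = count_above(-2, (data1, data2), (weight1, weight2))

# Cross-check without binning: sum the weights of the in-range samples >= -2
direct = sum(
    w[(d >= -2) & (d <= 5)].sum()
    for d, w in zip((data1, data2), (weight1, weight2))
)
print(binned, direct)
```

The two numbers agree only when x falls exactly on a bin edge, as -2 does for 10 bins over (-5, 5); for an x inside a bin, the binned total leaves out that whole bin while the direct sum does not.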
Answer 1 (score: 0)
This is what I came up with:
def my_range(start, end, step):
    # Yield start, start + step, ... up to and including end
    while start <= end:
        yield start
        start += step

bin_min = -5
bin_max = 10
bin_step = 1
b_counts = []  # normalized events (normalized according to the weights) above each threshold
value = []     # the thresholds themselves
for x in my_range(bin_min, bin_max, bin_step):
    # Histogram only the part of the data at or above the current threshold x
    n1, _, _ = plt.hist((data1, data2), weights=(weight1, weight2), stacked=False, range=(x, 10000))
    b_counts.append(np.sum(n1))
    value.append(x)
    print(b_counts[-1], value[-1])
I do believe this gives me the events (from the histogram) in the range (value, 10000), where value is the variable threshold.
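The loop above rebuilds the histogram once per threshold. As a hedged alternative, the sketch below computes a single histogram and takes a reversed cumulative sum, which gives the total at or above every bin edge in one pass. It assumes the thresholds coincide with the bin edges and that nothing of interest lies above 10 (the bin_max used above); the seed and variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
data = np.concatenate([rng.normal(0, 0.6, 1000), rng.normal(0, 1.4, 1000)])
weights = np.concatenate([np.full(1000, 0.5), np.full(1000, 0.9)])

# Only the combined total is wanted, so both data sets go into one histogram
counts, edges = np.histogram(data, bins=15, range=(-5, 10), weights=weights)

# tail[i] is the weighted count in bins i, i+1, ..., i.e. at or above edges[i]
tail = counts[::-1].cumsum()[::-1]

for threshold, total in zip(edges[:-1], tail):
    print(threshold, total)
```

This keeps the work to one pass over the data instead of one per threshold, and it avoids drawing a new plot on every iteration.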