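I use the expression below to compute the area under the entire histogram. However, I can't find any resource that explains how to compute the area under the histogram after a given value or within a specific interval. Any ideas? Here x is my data and values holds the probabilities of occurrence.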
area = sum(np.diff(bins)*values)
Answer 0 (score: 0)
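I believe np.diff(bins) is a one-dimensional NumPy array, so you can slice it as np.diff(bins)[start:end] to cover an interval, or as np.diff(bins)[start:] to cover everything from start onward. Slice values with the same indices so that the two arrays still line up.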
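As a minimal sketch of the slicing idea above: the data, the seed, and the bin indices `start` and `end` are made up for illustration, and `np.histogram` stands in for whatever call produced `values` and `bins` in the question.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed, illustrative data only
x = rng.normal(40, 20, 1000)

# density=True scales the bar heights so the total area is 1
values, bins = np.histogram(x, bins=20, density=True)
widths = np.diff(bins)                    # one width per bin, same length as values

start, end = 5, 12                        # hypothetical bin indices
area_interval = np.sum(widths[start:end] * values[start:end])  # bins start..end-1
area_after = np.sum(widths[start:] * values[start:])           # bins start..last
total_area = np.sum(widths * values)

print(area_interval, area_after, total_area)
```

Note that `start` and `end` are bin indices, not x values, so the result covers whole bins from `bins[start]` up to `bins[end]`.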
Answer 1 (score: 0)
area = sum(np.diff(bins)[0]*values[start:])
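np.diff(bins) gives you the width of each bin along the x-axis. When the bins are equally spaced, all of these widths are the same, so you can simply take the first element.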
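Taking only the first width relies on the bins being equally wide. The sketch below uses deliberately uneven, made-up bin edges to show how the two approaches diverge in that case; the edges, the seed, and `start` are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)            # fixed seed, illustrative data only
x = rng.normal(40, 20, 1000)

# Hypothetical uneven bin edges, chosen only to make the effect visible
edges = np.array([-60.0, 0.0, 20.0, 30.0, 40.0, 50.0, 60.0, 80.0, 140.0])
values, bins = np.histogram(x, bins=edges, density=True)
widths = np.diff(bins)

start = 3                                 # hypothetical bin index
area_per_bin_widths = np.sum(widths[start:] * values[start:])  # each bin's own width
area_first_width = np.sum(widths[0] * values[start:])          # assumes equal widths

print(area_per_bin_widths, area_first_width)
```

With equal-width bins both expressions give the same number, so the shortcut is fine for the default `bins=20` style of call.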
Answer 2 (score: 0)
Using A.Rauan's approach, here is a visualization of the histogram together with the area after specific values:
import numpy as np
import matplotlib.pyplot as plt


def find_bin_idx_of_value(bins, value):
    """Finds the bin which the value corresponds to."""
    array = np.asarray(value)
    idx = np.digitize(array, bins)
    return idx - 1


def area_after_val(counts, bins, val):
    """Calculates the area of the hist after a certain value"""
    left_bin_edge_index = find_bin_idx_of_value(bins, val)
    bin_width = np.diff(bins)[0]
    area = sum(bin_width * counts[left_bin_edge_index:])
    return area


def add_area_line_to_plot(axes, counts, bins, val):
    """Adds a vertical line and labels it with the value and area after that line"""
    area = area_after_val(counts, bins, val)
    axes.axvline(val, color='r', label=f"val={val:.2f}, Area={area:.2f}")


def main():
    num_data_points, loc, scale = 1000, 40, 20
    data = np.random.normal(loc, scale, num_data_points)
    fig, ax = plt.subplots()
    counts, bins, _ = ax.hist(data, bins=20, alpha=0.3, density=True, label="Data")
    add_area_line_to_plot(ax, counts, bins, val=min(data))
    add_area_line_to_plot(ax, counts, bins, val=np.mean(data))
    add_area_line_to_plot(ax, counts, bins, val=np.mean(data)*2)
    add_area_line_to_plot(ax, counts, bins, val=np.mean(data)*3)
    ax.legend()
    plt.show()


if __name__ == "__main__":
    main()
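The function above counts the whole bin that contains val. If the area should start exactly at the value, or cover an arbitrary interval, one possible variant is to weight each bar by how much of its bin overlaps the interval. This is only a sketch: `area_between` is a hypothetical helper, not part of the answer above, and the small `counts` and `bins` arrays are hand-made for illustration.

```python
import numpy as np


def area_between(counts, bins, lo, hi):
    """Area under the histogram bars between x=lo and x=hi (hypothetical helper)."""
    counts = np.asarray(counts, dtype=float)
    bins = np.asarray(bins, dtype=float)
    left, right = bins[:-1], bins[1:]
    # Length of each bin that lies inside [lo, hi]; zero for bins outside it
    overlap = np.clip(np.minimum(right, hi) - np.maximum(left, lo), 0.0, None)
    return float(np.sum(overlap * counts))


# Hand-made density histogram: three bins of width 2, total area 1
counts = np.array([0.1, 0.2, 0.2])
bins = np.array([0.0, 2.0, 4.0, 6.0])

area_1_to_5 = area_between(counts, bins, 1.0, 5.0)      # half of bin 0, all of bin 1, half of bin 2
area_after_3 = area_between(counts, bins, 3.0, np.inf)  # everything to the right of x=3
print(area_1_to_5, area_after_3)
```

Passing `np.inf` as the upper limit gives the "area after a value" case, and this version also works when the bin widths are not all equal because it uses each bin's own edges.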