Question

I plot my data as a histogram with the following code:

angles = data[columns[3]]

num_bins = 23
avg_samples_per_bin = 200

# len(data['steering'])/num_bins
hist, bins = np.histogram(data['steering'], num_bins)
width = 0.7 * (bins[1] - bins[0])
center = (bins[:-1] + bins[1:]) * 0.5
plt.bar(center, hist, align='center', width=width)
plt.plot((np.min(angles), np.max(angles)), (avg_samples_per_bin, avg_samples_per_bin), 'k-')

This produces the following plot:

I am looking for a function that removes all the data above the line. In other words, no bin should contain more than 200 samples.

Is there a neat way to do this?

Answer 1

You can mask the arrays, keeping only the values below a given threshold. For example:

import numpy as np
import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(1,2)
ax1.set_title("Some data")
ax2.set_title("Masked data < 80")

np.random.seed(10)
data = np.random.randn(1000)

num_bins = 23
avg_samples_per_bin = 200

hist, bins = np.histogram(data, num_bins)
width = 0.7 * (bins[1] - bins[0])
center = (bins[:-1] + bins[1:]) * 0.5
ax1.bar(center, hist, align='center', width=width)

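threshold = 80
# Boolean mask: True for bins whose count is below the threshold
mask = hist < threshold

# Keep only those bins; bins at or above the threshold are dropped entirely
new_center = center[mask]
new_hist = hist[mask]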

ax2.bar(new_center, new_hist, align="center", width=width)

plt.show()

Which gives:

[Figure: Remove data above a threshold in a histogram]
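The mask above removes whole bins from the plot. If the goal is instead to cap each bin at 200 samples, as the question asks, one possible approach is to drop the excess samples from every over-full bin at random. The following is only a minimal sketch under that assumption; the function name `cap_samples_per_bin` and its parameters are hypothetical and are not part of NumPy or of the answer above.

```python
import numpy as np

def cap_samples_per_bin(values, num_bins=23, max_per_bin=200, seed=0):
    """Return a boolean mask that keeps at most max_per_bin samples per bin.

    Hypothetical helper for illustration; excess samples are dropped at random.
    """
    values = np.asarray(values)
    rng = np.random.default_rng(seed)

    # Same bin edges that np.histogram would use for the plot
    _, bins = np.histogram(values, num_bins)

    # Digitizing against the interior edges yields indices 0..num_bins-1 and
    # places the maximum value in the last bin, as np.histogram does
    bin_idx = np.digitize(values, bins[1:-1])

    keep = np.ones(values.shape[0], dtype=bool)
    for b in range(num_bins):
        members = np.flatnonzero(bin_idx == b)
        excess = members.size - max_per_bin
        if excess > 0:
            # Randomly choose which samples of this bin to discard
            drop = rng.choice(members, size=excess, replace=False)
            keep[drop] = False
    return keep

# Example with synthetic data
np.random.seed(10)
data = np.random.randn(5000)
keep = cap_samples_per_bin(data, num_bins=23, max_per_bin=200)
capped = data[keep]
```

Because the result is a boolean mask over the samples rather than over the bins, it can be used to filter the original data. For a pandas DataFrame, something like `data[keep]` with the mask computed from `data['steering']` should keep the matching rows, assuming the mask has one entry per row.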
