Question

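I have a list of times at which events occurred. I want to use Python to convert it into a list of binned times (e.g. 0-2 seconds, 2-4 seconds, etc.) and a list of how many events occurred between those times.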

For example, if I have events occurring at the following times:

event_times = [1,2,4,7,8,9]

and the following time array:

time = [0,2,4,6,8,10]

I expect the following output:

count = [0,2,1,0,2,1]

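which tells me that there are 2 events between 0 and 2 seconds, 1 event between 2 and 4 seconds, and so on (upper bound inclusive). The first zero is redundant, since it is always zero.

Right now I solve this with two for loops, which works but is very slow: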

import numpy

count = numpy.zeros(len(time))
for i in range(1, len(time)):
    for j in range(len(event_times)):
        # count events in the interval (time[i-1], time[i]]
        if time[i-1] < event_times[j] <= time[i]:
            count[i] += 1

Answer 1

To get exactly the expected output (and pythonically), use np.digitize and np.bincount:

count = np.bincount(np.digitize(event_times, time, right=True))

count
Out[619]: array([0, 2, 1, 0, 2, 1], dtype=int32)

np.histogram will not work directly because its bins include the left edge rather than the right one, unless you fudge the values.
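To make the one-liner above easier to follow, here is a minimal sketch that splits it into its two steps. The `minlength` argument and the `sparse_count` example are additions for illustration, not part of the original answer; they assume you want the result to always have one entry per element of `time`, even when no events fall in the last bins.

```python
import numpy as np

event_times = [1, 2, 4, 7, 8, 9]
time = [0, 2, 4, 6, 8, 10]

# Step 1: for each event, find the index i with time[i-1] < event <= time[i]
bin_index = np.digitize(event_times, time, right=True)
# bin_index -> [1, 1, 2, 4, 4, 5]

# Step 2: count how often each index occurs
count = np.bincount(bin_index, minlength=len(time))
# count -> [0, 2, 1, 0, 2, 1]

# Without minlength, the result is only as long as the largest index + 1.
# With it, bins that received no events still show up as zeros.
sparse_count = np.bincount(np.digitize([1, 2], time, right=True),
                           minlength=len(time))
# sparse_count -> [0, 2, 0, 0, 0, 0]
```

Note that events at or below `time[0]` land in index 0, and events above `time[-1]` get index `len(time)`, which makes the result one element longer; filter such events first if they can occur in your data.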

Answer 2

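You can do this with numpy's histogram function. It accepts the list or array to count, and you can pass in the bin edges. By default each bin includes its lower bound, not its upper bound, but that is easy to work around.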

import numpy as np
event_times = [1,2,4,7,8,9]
time = [0,2,4,6,8,10]

counts, bins = np.histogram(event_times, bins=time)
counts

#returns:
array([1, 1, 1, 1, 2])

To include the upper bound, just add a small offset to time.

counts, bins = np.histogram(event_times, bins=np.array(time)+1e-10)
counts

#returns:
array([2, 1, 0, 2, 1])
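A fixed offset of 1e-10 only works while the bin edges are small: for float64 edges around 1e7 or larger, adding 1e-10 leaves the value unchanged. One possible variation, which is not part of the original answer, is to shift each edge to the next representable float with `np.nextafter`. This is a sketch under the assumption that the edges can be treated as float64 values.

```python
import numpy as np

event_times = [1, 2, 4, 7, 8, 9]
time = [0, 2, 4, 6, 8, 10]

# Move every edge up by the smallest possible float64 step, so an event
# that sits exactly on an edge falls into the bin below it.
edges = np.nextafter(np.asarray(time, dtype=float), np.inf)

counts, bins = np.histogram(event_times, bins=edges)
# counts -> [2, 1, 0, 2, 1]
```

As with the fixed offset, an event exactly at `time[0]` is dropped, which matches the strict `>` comparison in the question's loop.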

Answer 3

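Daniel F's answer is probably what you want, since you are already using numpy. If for some reason you want to implement it yourself, and the time array represents a contiguous span of equal-width bins (2 seconds wide here), which is what your if statement suggests, you can reduce the running time to O(n) (for n = len(event_times)):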

count = numpy.zeros(len(time))
for j in event_times:
    # bins are 2 seconds wide; ceil puts an event on an edge into the lower bin
    count[int(numpy.ceil(j / 2.0))] += 1

If the time array does not start at 0, just subtract the appropriate offset.
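The offset idea can be sketched as follows. The function name `count_events_uniform` is hypothetical, and the sketch assumes that `time` is evenly spaced; it reads the start and the bin width from the first two edges instead of hard-coding 0 and 2. Events outside (time[0], time[-1]] are skipped so that the result matches the loop in the question.

```python
import math

def count_events_uniform(event_times, time):
    """Count events in bins (time[i-1], time[i]], assuming evenly spaced edges."""
    start = time[0]
    width = time[1] - time[0]
    counts = [0] * len(time)
    for t in event_times:
        # Subtract the offset, then scale by the bin width; ceil makes the
        # upper edge of each bin inclusive.
        idx = math.ceil((t - start) / width)
        if 1 <= idx < len(time):  # ignore events outside the covered span
            counts[idx] += 1
    return counts

print(count_events_uniform([1, 2, 4, 7, 8, 9], [0, 2, 4, 6, 8, 10]))
# -> [0, 2, 1, 0, 2, 1]
print(count_events_uniform([10, 11, 12, 13, 15], [10, 12, 14]))
# -> [0, 2, 1]
```

With non-integer edges the division can suffer from floating-point rounding near a bin edge, so the numpy-based answers are the safer choice in that case.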

Converting event times to binned events in Python