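I have a number of nodes in a network. Each node sends a status message once an hour to signal that it is alive, so I have a list of nodes and the time each one was last seen alive. I want to plot the number of alive nodes over time.
The node list can be sorted by last-seen time, but I can't figure out a good way to count how many nodes were alive at each date.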
from datetime import datetime, timedelta
seen = [ n.last_seen for n in c.nodes ] # a list of datetimes
seen.sort()
start = seen[0]
end = seen[-1]
diff = end - start
num_points = 100
step = diff / num_points
num = len( c.nodes )
dates = [ start + i * step for i in range( num_points ) ]
What I want is basically
alive = [ len([ s for s in seen if s > date]) for date in dates ]
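But that is not very efficient. A solution should take advantage of the fact that the seen list is sorted, rather than looping over the whole list for every date.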
Answer 0 (score: 2)
This generator traverses the list only once:
def get_alive(seen, dates):
    c = len(seen)
    for date in dates:
        for s in seen[-c:]:
            if s >= date:  # replaced your > for >= as it seems to make more sense
                yield c
                break
            else:
                c -= 1
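One caveat with the generator above: `seen[-c:]` copies the tail of the list for every date, and once `c` reaches 0 the slice `seen[-0:]` is the whole list again, so a date later than every entry in `seen` yields nothing. A minimal sketch of the same idea that keeps an index instead of slicing is shown below. The name `count_alive` and the integer sample data are made up for illustration, and the sketch assumes both inputs are sorted in ascending order.

```python
def count_alive(seen, dates):
    """Yield, for each date, how many entries of `seen` are >= that date.

    Assumes `seen` and `dates` are both sorted in ascending order.
    """
    n = len(seen)
    i = 0  # index of the first entry that is not earlier than the current date
    for date in dates:
        # The index only ever moves forward, so the loop body runs at most
        # len(seen) times in total across all dates.
        while i < n and seen[i] < date:
            i += 1
        yield n - i


# Made-up sample data (plain integers stand in for datetimes).
seen = [-1, 10, 12, 15, 20, 75]
dates = [5, 15, 25, 50, 100]
print(list(count_alive(seen, dates)))  # [5, 3, 1, 1, 0]
```

Because one value is yielded per date regardless of how many entries remain, the result always has the same length as `dates`, which keeps it aligned with the x-axis when plotting.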
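Answer 1 (score: 1)
The Python bisect module will find the right index for you, and from that index you can work out the number of items before and after it.
If I understand correctly, that comes to O(len(dates)) * O(log(len(seen))).
Edit 1
It should be possible to do this in a single pass, as SilentGhost demonstrates. However, itertools.groupby works well on sorted data, so it should be able to do something here, perhaps like this (this may be more than O(n), but it could be improved):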
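A minimal sketch of the bisect approach, assuming `seen` is already sorted. The function name `alive_by_bisect`, the `inclusive` flag and the sample timestamps are made up for illustration; `bisect_left` and `bisect_right` are the standard library functions.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime


def alive_by_bisect(seen, dates, inclusive=False):
    """Count, for each date, the entries of sorted `seen` that come after it.

    inclusive=False counts s > date (as in the question);
    inclusive=True counts s >= date.
    """
    n = len(seen)
    # bisect_right returns the index just past any entries equal to `date`,
    # bisect_left the index of the first entry equal to `date`.
    find = bisect_left if inclusive else bisect_right
    return [n - find(seen, date) for date in dates]


# Made-up sample data: last-seen hours on a single day.
seen = [datetime(2024, 1, 1, h) for h in (0, 3, 3, 7, 9)]
dates = [datetime(2024, 1, 1, h) for h in (0, 3, 5, 10)]
print(alive_by_bisect(seen, dates))                  # [4, 2, 2, 0]
print(alive_by_bisect(seen, dates, inclusive=True))  # [5, 4, 2, 0]
```

Each lookup is a binary search, so `dates` does not need to be sorted for this version to work.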
import itertools

# numbers are easier to make up now
seen = [-1, 10, 12, 15, 20, 75]
dates = [5, 15, 25, 50, 100]

def finddate(s, dates):
    """Find the first date in @dates larger than s"""
    for date in dates:
        if s < date:
            break
    # if no date is larger than s, this falls through to the last date
    return date

for date, group in itertools.groupby(seen, key=lambda s: finddate(s, dates)):
    print(date, list(group))
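The grouping above does not produce alive counts by itself. One possible way to get there, sketched under the assumption that both lists are sorted: every group keyed on a date holds the entries that fall before that date, so subtracting the group sizes from the total as the dates go by leaves the number of entries still at or after each date. The names `first_date_after` and `alive_counts` are made up for this sketch, and unlike `finddate` the helper returns `None` when no date is larger.

```python
import itertools


def first_date_after(s, dates):
    """Return the first date strictly larger than s, or None if there is none."""
    for date in dates:
        if s < date:
            return date
    return None


def alive_counts(seen, dates):
    """For each date, count the entries of `seen` that are >= that date.

    Assumes `seen` and `dates` are sorted in ascending order.
    """
    # Size of each group: how many entries fall just before each date.
    sizes = {
        date: len(list(group))
        for date, group in itertools.groupby(
            seen, key=lambda s: first_date_after(s, dates)
        )
    }
    remaining = len(seen)
    counts = []
    for date in dates:
        # pop() so that a repeated date is not subtracted twice
        remaining -= sizes.pop(date, 0)
        counts.append(remaining)
    return counts


seen = [-1, 10, 12, 15, 20, 75]
dates = [5, 15, 25, 50, 100]
print(alive_counts(seen, dates))  # [5, 3, 1, 1, 0]
```

The linear scan inside `first_date_after` is what keeps this above O(n); it restarts from the first date for every entry.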
Answer 2 (score: 1)
I took SilentGhost's generator solution a step further by using explicit iterators. This is the linear-time solution I came up with.
def splitter( items, breaks ):
    """ assuming `items` and `breaks` are sorted """
    c = len( items )
    items = iter(items)
    item = next(items)
    breaks = iter(breaks)
    breaker = next(breaks)
    while True:
        if breaker > item:
            for it in items:
                c -= 1
                if it >= breaker:
                    item = it
                    # no yield here: the else branch below yields c for this
                    # breaker on the next pass (yielding here too would
                    # report the same breaker twice)
                    break
            else:  # no item left that is >= the current breaker
                yield 0  # 0 items left for the current breaker
                # and 0 items left for all other breaks, since they are > the current
                for _ in breaks:
                    yield 0
                break  # and done
        else:
            yield c
            for br in breaks:
                if br > item:
                    breaker = br
                    break
                yield c
            else:
                # there is no break > any item in the list
                break