Question

I have a list like this:

[(ip1, video1, 12345.00000), (ip1, video1, 12346.12362), (ip1, video1, 12347.12684), (ip1, video2, 12367.12567), (ip2, video1, 14899.93736), (ip2, video1, 24566.12345), ...]

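It records the video ID and the time at which each user requested the video.

Now I want to go through the list and, for each video, compute the time interval between the first and the last request. My list is already sorted by IP address.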

The result I want is:

ip1, video1, 2.12684
ip1, video2, 0

0 means the request was never repeated.

Can anyone help?

Here is the code I use to build the dictionary:

for line in fd_in.readlines():
    (time, addr, iptp, userag, usertp, hash, vlanid) = line.split()

    if addr not in client_dict:
        client_dict[addr] = {}

    hash_dict = client_dict[addr]

    if hash not in hash_dict:
        hash_dict[hash] = []

    hash_dict[hash].append((float(time), addr, iptp, userag, usertp, hash, vlanid))


for addr, hash_dict in client_dict.items():
    for hash, hits_list in hash_dict.items():
        hits_list_sorted = sorted(hits_list, key=lambda item: item[0])

        for (time, addr, iptp, userag, usertp, hash, vlanid) in hits_list_sorted:

            # xxxxxxxx [Don't know how to do the calculation]

            fd_out.write("%s\t%s\t%f\n" % (addr, hash, timeinternal))

Answer 1

Something like this:

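from itertools import groupby

# data is the list of (ip, video, time) tuples from the question
key = lambda x: (x[0], x[1])  # group by (ip, video), not by video alone

for (ip, video), group in groupby(sorted(data, key=key), key=key):
    times = [x[2] for x in group]
    print('IP: %s, video: %s, interval: %f' % (ip, video, max(times) - min(times)))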
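As an alternative that avoids sorting, the same result can be sketched with a single pass that tracks the earliest and latest timestamp per (ip, video) pair. This is only a minimal sketch: the function name `request_intervals` is hypothetical, and the strings "ip1", "video1", and so on stand in for the real IP addresses and video hashes from the question.

```python
from collections import OrderedDict


def request_intervals(records):
    """Return {(ip, video): last_time - first_time}, in first-seen order."""
    spans = OrderedDict()  # (ip, video) -> [earliest, latest]
    for ip, video, t in records:
        key = (ip, video)
        if key not in spans:
            # First request for this pair: earliest and latest coincide.
            spans[key] = [t, t]
        else:
            span = spans[key]
            span[0] = min(span[0], t)
            span[1] = max(span[1], t)
    return OrderedDict((k, hi - lo) for k, (lo, hi) in spans.items())


# Sample data from the question; the string values are placeholders.
data = [
    ("ip1", "video1", 12345.00000),
    ("ip1", "video1", 12346.12362),
    ("ip1", "video1", 12347.12684),
    ("ip1", "video2", 12367.12567),
    ("ip2", "video1", 14899.93736),
    ("ip2", "video1", 24566.12345),
]

result = request_intervals(data)
for (ip, video), interval in result.items():
    print("%s, %s, %.5f" % (ip, video, interval))
```

A pair that was requested only once gets an interval of 0, which matches the convention in the question. In the asker's own code, the key would presumably be (addr, hash) and the timestamp float(time).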
