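import multiprocessing as mp

def multi():
    jobs = []
    # Read the whole file (16 MiB buffer) and split it into lines
    r = open('raw.txt', 'r', 16777216).read().split('\n')
    for i in r:
        # Start one process per line; all is the worker function applied to each URL
        p = mp.Process(target=all, args=(i,))
        jobs.append(p)
        p.start()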
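Each line of raw.txt is a URL.
Please explain how I can modify multi() to
a) split raw.txt into chunks (say, 10 lines each) and apply all() to each chunk, and
b) return the number of lines/chunks processed at the end.
Thanks,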
Answer 0 (score: 0)
Take a look at the itertools package; it has a lot of useful things.
>>> import uuid
>>> with open('input.txt', 'w') as f:
...     for i in xrange(998):
...         f.write(uuid.uuid4().get_hex() + '\n')
...
>>>
>>> from itertools import groupby, count
>>> with open('input.txt', 'r') as f:
...     samples = groupby(f, key=lambda k, line=count(): next(line)//100)
...     for i in samples:
...         print i
...
(0, <itertools._grouper object at 0x7f174f170c50>)
(1, <itertools._grouper object at 0x7f1740804f50>)
(2, <itertools._grouper object at 0x7f174f170c50>)
(3, <itertools._grouper object at 0x7f1740804f50>)
(4, <itertools._grouper object at 0x7f174f170c50>)
(5, <itertools._grouper object at 0x7f1740804f50>)
(6, <itertools._grouper object at 0x7f174f170c50>)
(7, <itertools._grouper object at 0x7f1740804f50>)
(8, <itertools._grouper object at 0x7f174f170c50>)
(9, <itertools._grouper object at 0x7f1740804f50>)
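The session above only prints the group objects. To get back to the original question, here is a rough Python 3 sketch of how multi() might split the file into chunks of 10 lines, hand each chunk to a worker, and return the counts. It swaps groupby for itertools.islice and uses a multiprocessing.Pool instead of one Process per line. The names chunked and process_chunk are hypothetical; process_chunk is only a stand-in for the asker's all() and is assumed to return how many lines it handled.

```python
import multiprocessing as mp
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def process_chunk(urls):
    # Hypothetical stand-in for all(): do the real work on each URL here.
    # Assumed to return the number of lines it processed.
    return len(urls)


def multi(path='raw.txt', chunk_size=10, processes=None):
    """Process `path` in chunks; return (chunks processed, lines processed)."""
    with open(path) as f:
        # Drop trailing newlines and skip blank lines
        lines = [line.strip() for line in f if line.strip()]
    # The pool runs process_chunk on each chunk in a worker process
    with mp.Pool(processes) as pool:
        counts = pool.map(process_chunk, chunked(lines, chunk_size))
    return len(counts), sum(counts)
```

Call it as n_chunks, n_lines = multi() from under an if __name__ == '__main__': guard, which multiprocessing needs on platforms that spawn rather than fork. A pool also keeps the number of running processes bounded, whereas starting one Process per line launches as many processes as there are URLs.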