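I am trying to do some web scraping. I pass a list of URLs to pool.starmap and get an argument error. Can someone help me? Sorry if I am doing something silly.
Here is a simplified version of my code: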
from multiprocessing.dummy import Pool

def func(x):
    print(x)

s = ["cat", "foo", "bar", "you", "and", "me"]  # this list contains ~50 URLs in actual code

with Pool() as pool:
    pool.starmap(func, s)
This gives me the error:
Traceback (most recent call last):
  File "g.py", line 8, in <module>
    pool.starmap(func,s)
  File "C:\Users\Gunjan\Anaconda3\lib\multiprocessing\pool.py", line 274, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "C:\Users\Gunjan\Anaconda3\lib\multiprocessing\pool.py", line 644, in get
    raise self._value
  File "C:\Users\Gunjan\Anaconda3\lib\multiprocessing\pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\Gunjan\Anaconda3\lib\multiprocessing\pool.py", line 47, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
TypeError: func() takes 1 positional argument but 3 were given
Answer 0: (score: 3)
starmap expects an iterable of iterables (see here), so use map instead.
Answer 1: (score: 1)
You need to use the regular map().
Try:
from multiprocessing.dummy import Pool

def func(x):
    print(x)

s = ["cat", "foo", "bar", "you", "and", "me"]  # this list contains ~50 URLs in actual code

with Pool() as pool:
    pool.map(func, s)
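starmap() expects each element of the list to be an iterable itself, and it passes the items of that inner iterable to func as separate arguments.
Your elements are strings, which iterate character by character, so for the first element starmap calls func('c', 'a', 't'), and so on for the rest. That is why the error reports 3 arguments.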
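The difference can be seen in a small experiment. The sketch below is only illustrative: func returns an upper-cased string instead of printing (an assumption made so the results can be compared), and the list is shortened to three items. It shows map working on plain strings, starmap working once each string is wrapped in a 1-tuple, and starmap failing on bare strings as in the question.

```python
from multiprocessing.dummy import Pool  # thread-based Pool, as in the question

def func(x):
    # Illustrative stand-in for the real scraping function.
    return x.upper()

words = ["cat", "foo", "bar"]

with Pool(2) as pool:
    # map passes each element as one argument: func("cat"), func("foo"), ...
    mapped = pool.map(func, words)

    # starmap unpacks each element, so each element must be an iterable
    # of arguments. A 1-tuple unpacks to a single argument: func(*("cat",)).
    starmapped = pool.starmap(func, [(w,) for w in words])

    # Bare strings unpack character by character, so "cat" becomes
    # func('c', 'a', 't') and raises the TypeError from the question.
    try:
        pool.starmap(func, words)
        error = None
    except TypeError as exc:
        error = str(exc)

print(mapped)
print(starmapped)
print(error)
```

Wrapping in tuples works, but for a one-argument function map is the simpler choice.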
Here are the definitions and docstrings of the two functions:
def map(self, func, iterable, chunksize=None):
    '''
    Apply `func` to each element in `iterable`, collecting the results
    in a list that is returned.
    '''
    return self._map_async(func, iterable, mapstar, chunksize).get()

def starmap(self, func, iterable, chunksize=None):
    '''
    Like `map()` method but the elements of the `iterable` are expected to
    be iterables as well and will be unpacked as arguments. Hence
    `func` and (a, b) becomes func(a, b).
    '''
    return self._map_async(func, iterable, starmapstar, chunksize).get()
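For contrast, starmap is the natural fit when the worker function takes more than one argument. The following is a minimal sketch under assumptions: describe, its retries parameter, and the example.com URLs are hypothetical and not part of the original code, and no network request is made.

```python
from multiprocessing.dummy import Pool

def describe(url, retries):
    # Hypothetical stand-in for a download function taking two arguments.
    return f"{url} (retries={retries})"

# Each tuple holds the arguments for one call: describe(url, retries).
jobs = [
    ("http://example.com/a", 1),
    ("http://example.com/b", 3),
]

with Pool(2) as pool:
    # starmap unpacks each tuple and returns results in input order.
    results = pool.starmap(describe, jobs)

print(results)
```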