Error when passing a list of strings to a multiprocessing pool

Asked: 2019-06-20 13:10:24

Tags: python-3.x

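I am trying to do some web scraping. I pass a list of URLs to pool.starmap and get an argument error. Can someone help me? Sorry if I'm doing something silly. Here is a simplified version of my code: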


from multiprocessing.dummy import Pool

def func(x):
    print(x)

s = ["cat","foo","bar","you","and","me"] #this list contains ~50 URLs in actual code

with Pool() as pool:
    pool.starmap(func,s)

This gives me the following error:

Traceback (most recent call last):
File "g.py", line 8, in <module>
  pool.starmap(func,s)
File "C:\Users\Gunjan\Anaconda3\lib\multiprocessing\pool.py", line 274, in starmap
  return self._map_async(func, iterable, starmapstar, chunksize).get()
File "C:\Users\Gunjan\Anaconda3\lib\multiprocessing\pool.py", line 644, in get
  raise self._value
File "C:\Users\Gunjan\Anaconda3\lib\multiprocessing\pool.py", line 119, in worker
  result = (True, func(*args, **kwds))
File "C:\Users\Gunjan\Anaconda3\lib\multiprocessing\pool.py", line 47, in starmapstar
  return list(itertools.starmap(args[0], args[1]))
TypeError: func() takes 1 positional argument but 3 were given

2 answers:

Answer 0 (score: 3)

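starmap expects an iterable of iterables, each of which is unpacked into the function's arguments (see the starmap documentation). Your list holds plain strings, so use map instead.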

Answer 1 (score: 1)

You need to use the regular map().

Try:

from multiprocessing.dummy import Pool

def func(x):
    print(x)

s = ["cat","foo","bar","you","and","me"] #this list contains ~50 URLs in actual code

with Pool() as pool:
    pool.map(func,s)

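starmap() expects each element of the list to be an iterable itself, and passes the items of that inner iterable to func as separate positional arguments.

Your elements are strings, which are iterable character by character, so for the first element starmap calls func('c', 'a', 't'), and so on. That is why the error reports 3 arguments for a function that takes 1.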
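To see the unpacking directly, here is a minimal sketch that runs the same strings through both methods. The helper `collect` is a hypothetical name introduced only for this illustration; it accepts any number of positional arguments and returns them as a tuple, so the difference shows up in the results instead of raising a TypeError.

```python
from multiprocessing.dummy import Pool  # thread-based pool, as in the question


def collect(*args):
    # Hypothetical helper: return whatever positional arguments were received.
    return args


words = ["cat", "foo"]

with Pool(2) as pool:
    mapped = pool.map(collect, words)       # one call per string
    starred = pool.starmap(collect, words)  # each string is unpacked into characters

print(mapped)   # [('cat',), ('foo',)]
print(starred)  # [('c', 'a', 't'), ('f', 'o', 'o')]
```

With map each call receives the whole string as a single argument; with starmap each call receives one argument per character.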

Here are the definitions and docstrings of the two functions:

    def map(self, func, iterable, chunksize=None):
        '''
        Apply `func` to each element in `iterable`, collecting the results
        in a list that is returned.
        '''
        return self._map_async(func, iterable, mapstar, chunksize).get()

    def starmap(self, func, iterable, chunksize=None):
        '''
        Like `map()` method but the elements of the `iterable` are expected to
        be iterables as well and will be unpacked as arguments. Hence
        `func` and (a, b) becomes func(a, b).
        '''
        return self._map_async(func, iterable, starmapstar, chunksize).get()
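For contrast, a short sketch of the cases where starmap does fit: when every call needs several arguments, or when single values are wrapped in one-element tuples. The functions `describe` and `shout`, the `retries` parameter, and the example.com URLs are placeholders assumed for this illustration, not part of the original question.

```python
from multiprocessing.dummy import Pool


def describe(url, retries):
    # Hypothetical two-argument worker; a real scraper would fetch the URL here.
    return f"{url} (retries={retries})"


def shout(text):
    # Hypothetical one-argument worker.
    return text.upper()


# Each element is a tuple, so starmap unpacks it into (url, retries).
jobs = [("http://example.com/a", 1), ("http://example.com/b", 3)]
words = ["cat", "foo", "bar"]

with Pool(2) as pool:
    described = pool.starmap(describe, jobs)
    # Wrapping each string in a one-element tuple also works, but map is simpler.
    wrapped = pool.starmap(shout, [(w,) for w in words])
    plain = pool.map(shout, words)

print(described)
print(wrapped == plain)  # True
```

Both methods return the results as a list in the same order as the input, so a scraping function can return its data instead of printing it.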