I have a list of links and I want to scrape all of them using 5 threads. I managed to start one thread per link, but I will have more than 1000 links, and I don't want to overwhelm my computer or the website.
My question is: how can I scrape, say, 100 links using a fixed number of threads?
This is what I have so far:
    import threading
    import urllib2
    from BeautifulSoup import BeautifulSoup

    def main():
        urls = ["http://google.com", "http://yahoo.com"]
        threads = []
        # Start a request thread for every URL
        for url in urls:
            thread = threading.Thread(target=loadurl, args=(url,))
            thread.start()
            print "[+] Thread started for:", url
            threads.append(thread)
        # All requests started
        print "[+] Requests done"
        for thread in threads:
            thread.join()
        print "[+] Finished!"

    # Just print the page source
    def loadurl(url):
        page = urllib2.urlopen(url)
        soup = BeautifulSoup(page)
        print soup
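One common way to cap concurrency is a fixed pool of worker threads draining a shared queue: you start exactly N threads, and each one pulls URLs until the queue is empty. A minimal sketch of that pattern follows; note it is written for Python 3 (`queue` module, `print()` function) while the code above is Python 2, and the actual fetch/parse step is stubbed out with a placeholder so the pooling logic stands on its own — you would call your `loadurl` there instead.

```python
import threading
import queue

NUM_WORKERS = 5  # fixed number of threads, regardless of how many URLs there are

def worker(task_queue, results):
    # Each worker loops, pulling one URL at a time until the queue is drained.
    while True:
        try:
            url = task_queue.get_nowait()
        except queue.Empty:
            return
        # Placeholder for the real fetch/parse step (your loadurl(url)).
        results.append(url)
        task_queue.task_done()

def crawl(urls):
    task_queue = queue.Queue()
    for url in urls:
        task_queue.put(url)
    results = []  # list.append is safe to call from multiple threads in CPython
    threads = [threading.Thread(target=worker, args=(task_queue, results))
               for _ in range(NUM_WORKERS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# 100 URLs, but never more than 5 threads alive at once
print(len(crawl(["http://example.com/page%d" % i for i in range(100)])))  # → 100
```

On Python 3 you could also skip the hand-rolled pool entirely and use `concurrent.futures.ThreadPoolExecutor(max_workers=5)` with `executor.map(loadurl, urls)`, which implements the same fixed-pool behaviour.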