Question

我想在python中使用多线程来减少使用urllib2.urlopen（url）加载网页的不同子页面的时间。我想要加载的页面通常需要4.5秒。所以我的计划是使用多线程并行加载同一页面的不同子页面，并减少整体加载时间。

我写了这个例子，令人惊讶的是，多线程需要的时间比＆＃34;普通＆＃34;版本一个接一个。

我做错了什么？

import urllib.request as urllib2
from multiprocessing.dummy import Pool as ThreadPool
import time

urls = [
  'http://www.spiegel.de', 'http://www.spiegel.de',  'http://www.spiegel.de', 'http://www.spiegel.de', 'http://www.spiegel.de', 'http://www.spiegel.de', 'http://www.spiegel.de', 'http://www.spiegel.de', 'http://www.spiegel.de', 'http://www.spiegel.de', 'http://www.spiegel.de', 'http://www.spiegel.de', 'http://www.spiegel.de', 'http://www.spiegel.de'
  ]
start = time.time()

for url in urls:
    results = urllib2.urlopen(url)
    print(results)
end = time.time()
print("elapsed time no threading: ", end - start)

start = time.time()

# Make the Pool of workers
pool = ThreadPool(4)

# Open the urls in their own threads
# and return the results
results = pool.map(urllib2.urlopen, urls)
print(results)

#close the pool and wait for the work to finish
pool.close()
pool.join()

end = time.time()
print("elapsed time threading: ", end - start)

结果：

<http.client.HTTPResponse object at 0x02AF4BF0>
...
<http.client.HTTPResponse object at 0x02AF4C10>
elapsed time no threading:  0.9750549793243408
[<http.client.HTTPResponse object at 0x02B11330>, ... <http.client.HTTPResponse object at 0x02B3F8D0>]
elapsed time threading:  3.4091949462890625

为什么python中的urllib2.urlopen（url）在使用多线程时不会更快但速度更慢？

0 个答案: