Question

I am trying to write Python code that downloads web pages using separate threads. Here is a sample of my code:

import urllib2
from threading import Thread
import time

URLs = ['http://www.yahoo.com/',
        'http://www.time.com/',
        'http://www.cnn.com/',
        'http://www.slashdot.org/'
       ]


def thread_func(arg):
    t = time.time()
    page = urllib2.urlopen(arg)
    page = page.read()
    print time.time() - t




for url in URLs:
    t = Thread(target = thread_func, args = (url, ))
    t.start()
    t.join()

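When I run the code, the threads seem to execute serially, if I'm not mistaken. The download times are measured, but each one is printed to the console only after a delay, one after another. Have I coded this correctly?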

Answer 1

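The call to t.join() blocks the current thread until the target thread finishes. You call it immediately after starting each thread, so you never have more than one downloader thread running at a time.

Change your code to: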

threads = []
for url in URLs:
    t = Thread(target = thread_func, args = (url, ))
    t.start()
    threads.append(t)

# All threads started, now wait for them to finish
for t in threads:
    t.join()
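
To see the difference in timing, here is a minimal sketch that runs both patterns side by side. It makes some assumptions that are not in the original code: fake_download is a hypothetical stand-in that sleeps for 0.2 seconds instead of calling urllib2.urlopen, so the comparison does not depend on the network, and run_serial and run_concurrent are illustrative helper names.

```python
import threading
import time

URLS = ['http://www.yahoo.com/',
        'http://www.time.com/',
        'http://www.cnn.com/',
        'http://www.slashdot.org/']


def fake_download(url, delay=0.2):
    # Hypothetical stand-in for urllib2.urlopen(url).read():
    # sleeps instead of touching the network.
    time.sleep(delay)


def run_serial(urls):
    # start() followed immediately by join(): only one thread runs at a time.
    began = time.time()
    for url in urls:
        t = threading.Thread(target=fake_download, args=(url,))
        t.start()
        t.join()
    return time.time() - began


def run_concurrent(urls):
    # Start every thread first, then wait for all of them.
    began = time.time()
    threads = []
    for url in urls:
        t = threading.Thread(target=fake_download, args=(url,))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    return time.time() - began


serial_elapsed = run_serial(URLS)
concurrent_elapsed = run_concurrent(URLS)
print("join inside the loop:  %.2fs" % serial_elapsed)
print("join after all starts: %.2fs" % concurrent_elapsed)
```

With the simulated delay, the first figure should come out close to the sum of the individual delays and the second close to a single delay. Real downloads will vary with the network, but the same pattern should hold.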
