Multithreaded download loop in Python

Time: 2011-02-01 00:32:56

Tags: python multithreading loops fetch

I have a list of symbols and a function that fetches them:

import urllib2

symbols = ('GGP', 'JPM', 'AIG', 'AMZN', 'GGP', 'rx', 'jnj', 'osip')

URL = "http://www.Xxxx_symbol=%s"

def fetch(symbols):
    try:
        # Join all symbols into a single request, e.g. 'GGP+JPM+AIG'.
        url = URL % '+'.join(symbols)
        fp = urllib2.urlopen(url)
        try:
            data = fp.read()
        finally:
            fp.close()
        return data
    except Exception as e:
        # Any failure ends up here, not only a missing connection.
        print "No Internet Access:", e

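I'm trying to multithread the fetch process (using 4 threads), without using multiprocessing or Twisted. The output of the URL fetch is a CSV file with 7 lines of header information that I want to get rid of. I'd like to loop over the symbols and write each one to its own file. I have used this fetch code before, and I can get it to work with a symbols list that has one element.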

1 answer:

Answer 0 (score: 4)

This should get you started:

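import urllib2
from threading import Thread, Lock

# URL is the template string defined in the question.
data = {}           # maps each symbol to its raw response body
data_lock = Lock()  # guards writes to data

class Fetcher(Thread):
    def __init__(self, symbol):
        # Initialize the Thread base class exactly once.
        Thread.__init__(self)
        self.symbol = symbol

    def run(self):
        # Same logic as fetch() in the question, for a single symbol.
        fp = urllib2.urlopen(URL % self.symbol)
        try:
            tmp = fp.read()
        finally:
            fp.close()
        # Hold the lock only while updating the shared dict.
        with data_lock:
            data[self.symbol] = tmp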

# Start a new Fetcher thread like this:
fetcher = Fetcher('GGP')  # pass any single symbol from the list
fetcher.start()
# To wait for the thread to finish, use Thread.join():
fetcher.join()
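
To tie this back to the question (4 threads, drop the 7 header lines, one file per symbol), here is a rough sketch of one way it could be put together. It is written for Python 3, where urllib2 became urllib.request, and it uses a queue of symbols drained by a fixed number of threads instead of one Fetcher per symbol. The names strip_header, worker and download_all, the fetch_func parameter, the "<symbol>.csv" file naming and the UTF-8 decoding are all assumptions made for illustration; none of them come from the question.

```python
import os
import queue
import threading
import urllib.request

URL = "http://www.Xxxx_symbol=%s"  # placeholder copied from the question
HEADER_LINES = 7                   # header lines to drop, per the question
NUM_THREADS = 4                    # thread count requested in the question


def fetch(symbol):
    """Download the CSV text for one symbol (UTF-8 is an assumption)."""
    with urllib.request.urlopen(URL % symbol) as fp:
        return fp.read().decode("utf-8")


def strip_header(csv_text, header_lines=HEADER_LINES):
    """Return csv_text without its first header_lines lines."""
    return "".join(csv_text.splitlines(True)[header_lines:])


def worker(tasks, fetch_func, out_dir, errors, errors_lock):
    """Take symbols off the queue until it is empty; write one file each."""
    while True:
        try:
            symbol = tasks.get_nowait()
        except queue.Empty:
            return
        try:
            text = strip_header(fetch_func(symbol))
            # Hypothetical naming scheme: <out_dir>/<symbol>.csv
            with open(os.path.join(out_dir, symbol + ".csv"), "w") as out:
                out.write(text)
        except Exception as exc:
            # Record the failure and move on to the next symbol.
            with errors_lock:
                errors[symbol] = exc


def download_all(symbols, fetch_func=fetch, out_dir=".",
                 num_threads=NUM_THREADS):
    """Fetch every distinct symbol with num_threads threads; return failures."""
    tasks = queue.Queue()
    for symbol in sorted(set(symbols)):  # 'GGP' appears twice in the question
        tasks.put(symbol)
    errors, errors_lock = {}, threading.Lock()
    threads = [
        threading.Thread(target=worker,
                         args=(tasks, fetch_func, out_dir, errors, errors_lock))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

Calling download_all(symbols) would then write the files into the current directory and return a dict mapping each failed symbol to its exception. Passing a different fetch_func makes it possible to try the threading part without touching the network.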