Question

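I want to fetch the source code of a web page every 5 seconds, and I want to do this several times. For example, 3 times would take 3 * 5 = 15 seconds in total. I wrote the following code: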

import urllib2
import threading

def getWebSource(url):
    usock = urllib2.urlopen(url)
    data = usock.read()
    usock.close()

    with open("WebData.txt", "a") as myfile:
        myfile.write(data)
        myfile.write("\n\n")



url = 'http://www.google.com/'
n = 3
while n>0:
    t = threading.Timer(5.0, getWebSource,args = [url]) # set the seconds here
    t.start()
    n = n-1

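However, when I run it, it only runs for 5 seconds and reads the web page 3 times. What is wrong here? I expected it to read the page once every 5 seconds and repeat that 3 times.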

Update: thanks to @wckd, here is my final code:

    import urllib2
    import time
    from time import gmtime, strftime

    def getWebSource(url,fout,seconds,times):
        while times > 0:
            usock = urllib2.urlopen(url)
            data = usock.read()
            usock.close()

            currentTime = strftime("%Y-%m-%d %H:%M:%S", gmtime())
            fout.write(currentTime +"\n")
            fout.write(data)
            fout.write("\n\n")
            times = times - 1
            time.sleep(seconds)



    url = 'http://www.google.com'
    fout = open("WebData.txt", "a")
    seconds = 5
    times = 4

    getWebSource(url,fout,seconds,times)
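
For readers on Python 3, where urllib2 no longer exists, a rough equivalent might look like the sketch below. It assumes urllib.request.urlopen as the replacement and decodes the response as UTF-8, which is an assumption (the page's real encoding may differ). The fetch_page helper and the fetch parameter are hypothetical names added so the download step can be swapped out. Unlike the script above, it skips the sleep after the last fetch.

```python
import time
import urllib.request
from time import gmtime, strftime


def fetch_page(url):
    """Download a page and return its body as text (UTF-8 is an assumption)."""
    with urllib.request.urlopen(url) as usock:
        return usock.read().decode("utf-8", errors="replace")


def get_web_source(url, fout, seconds, times, fetch=fetch_page):
    """Write `times` timestamped copies of the page to `fout`, `seconds` apart."""
    for attempt in range(times):
        data = fetch(url)
        fout.write(strftime("%Y-%m-%d %H:%M:%S", gmtime()) + "\n")
        fout.write(data)
        fout.write("\n\n")
        if attempt < times - 1:
            time.sleep(seconds)  # no need to wait after the final fetch


# Usage, mirroring the script above (commented out because it needs network access):
# with open("WebData.txt", "a", encoding="utf-8") as fout:
#     get_web_source("http://www.google.com", fout, 5, 4)
```

Opening the file in a with block means it is closed even if one of the downloads raises.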

Answer 1

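threading.Timer() simply creates a thread that starts after the specified delay. While that thread is waiting to run, your loop keeps going. So in effect you have three threads that will all run after 5 seconds.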
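
To see this for yourself, here is a small sketch that starts several timers in a loop and records when each one fires. The start_timers helper is a hypothetical name, the callback is a stand-in for getWebSource, and the delay is shortened to 0.5 seconds so the demo finishes quickly.

```python
import threading
import time


def start_timers(delay, count):
    """Start `count` timers in a loop; return how long after the start each one fired."""
    began = time.monotonic()
    fired = []                      # elapsed seconds at which each callback ran
    lock = threading.Lock()

    def callback():                 # stand-in for getWebSource
        with lock:
            fired.append(time.monotonic() - began)

    timers = [threading.Timer(delay, callback) for _ in range(count)]
    for t in timers:
        t.start()                   # returns immediately; the loop does not wait
    for t in timers:
        t.join()                    # wait here only so the results can be inspected
    return fired


# Short delay for the demo; the question used 5.0.
offsets = start_timers(0.5, 3)
print(["%.2f" % s for s in offsets])   # all three values should be close to 0.5
```

All the timers are started within a few microseconds of each other, so they all expire at roughly the same moment.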

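If you want the calls spaced out, you could make getWebSource a recursive function with a countdown, so that each run starts the next thread. Or, if you want to keep doing what you are doing, you could multiply 5 by n to get the spacing. I don't recommend that, because if you try it 100 times you will have 100 threads.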
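
The countdown idea could be sketched roughly as follows. The names run_every, step and finished are hypothetical, the lambda stands in for the real download, and the interval is shortened to 0.3 seconds for the demo.

```python
import threading
import time


def run_every(interval, remaining, task, done):
    """Wait `interval` seconds, run `task`, then schedule the next run until the countdown hits 0."""
    if remaining <= 0:
        done.set()                  # signal that the last run has finished
        return

    def step():
        task()
        run_every(interval, remaining - 1, task, done)

    # Only one timer is pending at any moment.
    threading.Timer(interval, step).start()


began = time.monotonic()
stamps = []
finished = threading.Event()
run_every(0.3, 3, lambda: stamps.append(time.monotonic() - began), finished)
finished.wait()
print(["%.1f" % s for s in stamps])
```

Because the next timer is only created after the current run completes, the gap between runs is the interval plus however long the task itself took.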

Update

The simplest way to do this in a single thread is to add a wait call (also known as sleep) inside the loop:

    import time

    while n > 0:
        time.sleep(5)
        yourMethodHere()
        n = n - 1  # count down so the loop ends

However, since your method takes a while to run, keep the thread creation in there and set its delay to 0 seconds:

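    import threading
    import time

    while n > 0:
        time.sleep(5)
        # Pass the function itself (no parentheses) and start the timer thread.
        threading.Timer(0, yourMethodHere).start()
        n = n - 1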

This way you won't be held up by a bad connection or anything else that slows a request down.
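
As a rough illustration of the difference, the sketch below uses a hypothetical slow_task that sleeps for 0.3 seconds in place of a slow download, and a hypothetical run_spaced helper that launches it every 0.2 seconds. The intervals are shortened for the demo.

```python
import threading
import time


def slow_task(started, began):
    started.append(time.monotonic() - began)   # record when this run began
    time.sleep(0.3)                             # pretend the download is slow


def run_spaced(interval, count, task, *args):
    """Sleep in the main thread, then hand each run to its own zero-delay timer thread."""
    workers = []
    for _ in range(count):
        time.sleep(interval)
        worker = threading.Timer(0, task, args=args)
        worker.start()              # the loop moves on without waiting for the task
        workers.append(worker)
    for worker in workers:
        worker.join()               # let the last runs finish before returning


began = time.monotonic()
started = []
run_spaced(0.2, 3, slow_task, started, began)
gaps = [b - a for a, b in zip(started, started[1:])]
print(["%.2f" % g for g in gaps])   # each gap should stay near 0.2, not 0.2 + 0.3
```

If the task were called directly inside the loop instead, each gap would grow to roughly the interval plus the task's own running time.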
