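I pass a link as an argument to a thread, and I want to scrape a timestamp from it. But inside the function the thread runs, the timestamp value never changes from one loop iteration to the next. How can I make timeLink dynamic so that it changes on every loop? Here is the code: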
def abcStart(timeLink):
    while True:
        res = timeLink
        res.raise_for_status()
        timestamp = BeautifulSoup(res.content, 'html.parser').find_all('b')
        if timestamp[0].text == otherTimestamp[0].text:
            # work on something
            break
        if timestamp[0].text > otherTimestamp[0].text:
            continue
        else:
            print('not yet')
            time.sleep(30)
            break

timelink = requests.get('http://example.com/somelink')
threadobj = threading.Thread(target=abcStart, args=(timelink))
threadobj.start()
threadobj.join()
Answer 0: (score: 1)
I think you should move the timeLink request inside your function:
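import threading
import time

import requests
from bs4 import BeautifulSoup

# The URL is now requested inside the function, so no parameter is needed.
# otherTimestamp is assumed to be defined elsewhere, as in the question.
def abcStart():
    while True:
        # A new request on every iteration re-fetches the page
        res = requests.get('http://example.com/somelink')
        res.raise_for_status()
        timestamp = BeautifulSoup(res.content, 'html.parser').find_all('b')
        if timestamp[0].text == otherTimestamp[0].text:
            # work on something
            break
        if timestamp[0].text > otherTimestamp[0].text:
            continue
        else:
            print('not yet')
            time.sleep(30)
            break

threadobj = threading.Thread(target=abcStart, args=())
threadobj.start()
threadobj.join()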
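Answer 1: (score: 1)
It looks like only one HTTP request is ever sent, on this line:
timelink = requests.get('http://example.com/somelink')
The abcStart() function receives that HTTP response and keeps using the same value for its entire run, so we scrape the same page every time. If we want to fetch a fresh page on each loop iteration, we need to issue a new HTTP request each time, like this: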
import threading
import time

import requests
from bs4 import BeautifulSoup

# otherTimestamp is assumed to be defined elsewhere, as in the question.
def abcStart(timeLink):
    while True:
        res = requests.get(timeLink)  # send request here
        res.raise_for_status()
        timestamp = BeautifulSoup(res.content, 'html.parser').find_all('b')
        if timestamp[0].text == otherTimestamp[0].text:
            # work on something
            break
        if timestamp[0].text > otherTimestamp[0].text:
            continue
        else:
            print('not yet')
            time.sleep(30)
            break

timeLink = 'http://example.com/somelink'  # declare url
# args must be a tuple, hence the trailing comma
threadobj = threading.Thread(target=abcStart, args=(timeLink,))
threadobj.start()
threadobj.join()
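To see the "fetch inside the loop" idea without touching the network, here is a minimal sketch. The names poll_until_match, make_fake_fetch, and worker are hypothetical and not part of the original code; the fake fetcher merely stands in for the remote page. In real use, fetch could be something like `lambda: requests.get(url).text`.

```python
import threading
import time
from typing import Callable, List


def poll_until_match(fetch: Callable[[], str], target: str,
                     interval: float = 0.0, max_tries: int = 10) -> List[str]:
    """Call fetch() on every iteration and stop once it returns target.

    Returns every value seen, so the caller can check that they changed.
    """
    seen = []
    for _ in range(max_tries):
        value = fetch()       # fresh fetch per iteration
        seen.append(value)
        if value == target:
            break
        time.sleep(interval)  # wait before asking again
    return seen


def make_fake_fetch(values):
    """Hypothetical stand-in for the remote page: one value per call."""
    it = iter(values)
    return lambda: next(it)


def worker(fetch, target, out):
    """Thread target: store the polled values in the shared list `out`."""
    out.extend(poll_until_match(fetch, target))


results = []
fake_fetch = make_fake_fetch(["10:00", "10:01", "10:02"])
# args is a tuple with one entry per parameter of worker
thread = threading.Thread(target=worker, args=(fake_fetch, "10:02", results))
thread.start()
thread.join()
print(results)  # ['10:00', '10:01', '10:02']
```

Because the thread receives a callable rather than a value that was already fetched, each iteration sees whatever the source returns at that moment. Passing a constant instead, such as `lambda: "10:00"`, reproduces the behaviour from the question: the same value on every pass.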