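I'm writing an Instagram bot with Selenium that iterates over X of a user's posts and gets the number of likes for each one. It also copies the link to each post. Both values (the likes and the link) are then added to a dictionary. The problem is that the function sometimes never gets through all the posts, either because the Wi-Fi takes longer to load the X posts and the whole code breaks, or for some other reason I'm not aware of.
I would really appreciate it if you could help me improve the usermostlikedposts() function, or suggest another approach.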
import time
from selenium import webdriver

# chrome_driver_binary and options are defined elsewhere in my script (not shown),
# and so are the login() and nav_user() methods.

class InstagramBot():
    def __init__(self, username, password):
        self.browser = webdriver.Chrome(chrome_driver_binary, options=options)
        self.username = username
        self.password = password
        self.likesposts = {}  # maps like count (as text) -> post link
        self.login()
        self.usermostlikedposts('datos.inc', 42)

    def usermostlikedposts(self, username, nofposts):
        self.nav_user(username)
        for i in range(nofposts):
            try:
                # Open the (i+1)-th post thumbnail on the profile page
                post = self.browser.find_element_by_xpath('(//div[@class=\'eLAPa\']//parent::a)[{}]'.format(i+1))
                time.sleep(2)
                post.click()
                time.sleep(2)
                # Read the like count from the opened post
                likes = self.browser.find_element_by_xpath('/html/body/div[4]/div[2]/div/article/div[2]/section[2]/div/div/button/span').get_attribute("innerHTML")
                link = post.get_attribute("href")
                self.likesposts[likes] = link
                time.sleep(2)
                # Close the opened post
                self.browser.find_element_by_xpath('/html/body/div[4]/div[3]/button').click()
            except:
                # Any error here silently skips the current post
                continue
        # Rebuild the dictionary sorted by like count (ascending) and print it
        tmp = {}
        i = 1
        for key in sorted(self.likesposts.keys(), key=lambda x: int(x.replace(",",""))):
            tmp[key] = self.likesposts[key]
            print(f'{(i)}) {key}: {self.likesposts[key]}')
            i += 1
        self.likesposts = tmp

if __name__ == '__main__':
    bot = InstagramBot('clubdelmorfi', 'Ob6ft471324')
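One possible alternative to the fixed time.sleep(2) calls, given as a minimal sketch rather than a tested fix: poll for a condition until it succeeds or a timeout expires, so a slow connection gets more time and a fast one wastes none. The helper name wait_until and its timeout and poll parameters are made up for this illustration; they are not part of Selenium or of the bot above.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed. Exceptions raised by condition() are treated as
    "not ready yet" and the call is retried.
    (Hypothetical helper for illustration only.)
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except Exception:
            pass  # e.g. the element is not on the page yet
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)
```

In the bot, the condition could be a lambda that wraps one of the find_element_by_xpath calls, so that a lookup that fails while the page is still loading is retried instead of falling into the except branch. Selenium also provides its own explicit wait, WebDriverWait in selenium.webdriver.support.ui, which follows the same polling idea and would probably be the more usual choice.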
This is the output the bot currently prints:
1) 766: https://www.instagram.com/POSTURL/
2) 11,461: https://www.instagram.com/POSTURL/
3) 21,490: https://www.instagram.com/POSTURL/
4) 21,839: https://www.instagram.com/POSTURL/
5) 29,356: https://www.instagram.com/POSTURL/
6) 50,650: https://www.instagram.com/POSTURL/
It works, but it never manages to iterate over the full number of posts specified in the function call.
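Apart from loading delays, one thing that can shrink the result is the dictionary itself: the like count is used as the key, so two posts with the same number of likes overwrite each other. The following is a minimal sketch of keeping (likes, link) pairs in a list instead. The function names parse_likes and rank_posts and the sample links are invented for this illustration, and the sketch assumes like counts always look like '11,461' (digits with optional commas).

```python
def parse_likes(text):
    """Convert a like count such as '11,461' to the integer 11461."""
    return int(text.replace(",", "").strip())


def rank_posts(posts):
    """Sort (likes_text, link) pairs by like count, lowest first.

    A list of pairs is used instead of a dict keyed by the like count,
    so two posts with the same count are both kept.
    """
    return sorted(posts, key=lambda pair: parse_likes(pair[0]))


# Made-up sample data: two posts share the same like count.
sample = [
    ("11,461", "https://www.instagram.com/POSTURL_B/"),
    ("766", "https://www.instagram.com/POSTURL_A/"),
    ("766", "https://www.instagram.com/POSTURL_C/"),
]

for position, (likes, link) in enumerate(rank_posts(sample), start=1):
    print(f"{position}) {likes}: {link}")
```

With a dict, the sample above would keep only two of the three posts. The list keeps all three, and because sorted() is stable, posts with equal counts stay in the order they were collected.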