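I am trying to read the "href" of each div in a for loop and then open that "href" for every "season". It works for the first season, but after that it does not find the second season or any of the following ones.
After the first season it suddenly stops at season.get_attribute and does not continue. I have been trying to fix this for 3 hours.
If anyone can help me, thanks in advance.
URL: http://www.tvil.me/view/181/1/1/v/הומלנד_Homeland.html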
seasons_num = driver.find_elements_by_xpath("//*[contains(@id, 'change-season-')]/a")
I = 0
for season in seasons_num:
    I += 1
    print("Season: %i" % (I))
    try:
        season_link = season.get_attribute("href")  # It breaks here
        ## seasons.append(season_link)
    except:
        print("---- Faild to fetch season %i -----" % (I))
        raise
    print("%s Season %i" % (post_id, I))
    # Navigating away replaces the page that the season elements were found on
    driver.get(season_link)
    episodes = driver.find_elements_by_xpath("//*[contains(@id, 'change-episode-')]/a")
    E = 0
    for episode in episodes:
        E += 1
        print("Post ID:%s Season:%i Episode: %i" % (post_id, I, E))
        time.sleep(1)
        episode_link = episode.get_attribute("href")
Output:
Post_ID: 100
Season: 1
100 Season 1
Post ID:100 Season:1 Episode: 1
Post ID:100 Season:1 Episode: 2
Post ID:100 Season:1 Episode: 3
Post ID:100 Season:1 Episode: 4
Post ID:100 Season:1 Episode: 5
Post ID:100 Season:1 Episode: 6
Post ID:100 Season:1 Episode: 7
Post ID:100 Season:1 Episode: 8
Post ID:100 Season:1 Episode: 9
Post ID:100 Season:1 Episode: 10
Post ID:100 Season:1 Episode: 11
Post ID:100 Season:1 Episode: 12
Post ID:100 Season:1 Episode: 13
Post ID:100 Season:1 Episode: 14
Post ID:100 Season:1 Episode: 15
Post ID:100 Season:1 Episode: 16
Season: 2
Failed to fetch season 2:
Traceback (most recent call last):
  File "C:\Users\maorb\OneDrive\Desktop\Maor\python\serethd\tvil.me.py", line 54, in <module>
    season_link = season.get_attribute("href")
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\selenium\webdriver\remote\webelement.py", line 112, in get_attribute
    resp = self._execute(Command.GET_ELEMENT_ATTRIBUTE, {'name': name})
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\selenium\webdriver\remote\webelement.py", line 457, in _execute
    return self._parent.execute(command, params)
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 233, in execute
    self.error_handler.check_response(response)
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.StaleElementReferenceException: Message: Element not found in the cache - perhaps the page has changed since it was looked up
Stacktrace:
at fxdriver.cache.getElementAt (resource://fxdriver/modules/web-element-cache.js:9454)
at Utils.getElementAt (file:///C:/Users/maorb/AppData/Local/Temp/tmp8gu_bklc/extensions/fxdriver@googlecode.com/components/command-processor.js:9039)
at WebElement.getElementAttribute (file:///C:/Users/maorb/AppData/Local/Temp/tmp8gu_bklc/extensions/fxdriver@googlecode.com/components/command-processor.js:12146)
at DelayedCommand.prototype.executeInternal_/h (file:///C:/Users/maorb/AppData/Local/Temp/tmp8gu_bklc/extensions/fxdriver@googlecode.com/components/command-processor.js:12661)
at DelayedCommand.prototype.executeInternal_ (file:///C:/Users/maorb/AppData/Local/Temp/tmp8gu_bklc/extensions/fxdriver@googlecode.com/components/command-processor.js:12666)
at DelayedCommand.prototype.execute/< (file:///C:/Users/maorb/AppData/Local/Temp/tmp8gu_bklc/extensions/fxdriver@googlecode.com/components/command-processor.js:12608)
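
Going by the exception message ("perhaps the page has changed since it was looked up"), the likely cause is that driver.get(season_link) loads a new page, so the remaining elements in seasons_num no longer belong to the current document. One possible workaround, shown here only as a minimal sketch, is to copy every "href" into a plain string before navigating anywhere. The helper names collect_hrefs and crawl are hypothetical and not part of Selenium; the sketch assumes the same older Selenium API (find_elements_by_xpath) and the same XPath expressions as the code in the question.

```python
def collect_hrefs(elements):
    """Read each element's href right away, while the page it came from is still loaded."""
    return [element.get_attribute("href") for element in elements]


def crawl(driver, season_xpath, episode_xpath):
    """Return {season_number: [episode_link, ...]} without touching stale elements.

    `driver` is assumed to be a Selenium WebDriver exposing the older
    find_elements_by_xpath method, as in the question.
    """
    # Snapshot the season URLs as strings before the first navigation.
    season_links = collect_hrefs(driver.find_elements_by_xpath(season_xpath))

    links_by_season = {}
    for number, season_link in enumerate(season_links, start=1):
        # This invalidates earlier WebElement objects, but not the saved strings.
        driver.get(season_link)
        # Look the episodes up again on the freshly loaded page and read them at once.
        episodes = driver.find_elements_by_xpath(episode_xpath)
        links_by_season[number] = collect_hrefs(episodes)
    return links_by_season


# Hypothetical usage with the locators from the question:
# links = crawl(driver,
#               "//*[contains(@id, 'change-season-')]/a",
#               "//*[contains(@id, 'change-episode-')]/a")
```

On Selenium 4, where the find_elements_by_* helpers were removed, the equivalent lookup is driver.find_elements(By.XPATH, ...) with By imported from selenium.webdriver.common.by. The idea stays the same: keep strings across page loads, not WebElement objects.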