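I have asked three questions here about this script, and so far I have hit several errors. I am trying to work my way through the script, but I am stuck on this part and don't know how to solve it.
Basically, I want to see the unread messages on the website and then answer them. I am stuck on the part where the for loop checks every unread message and keeps the id of the conversation so that I can use it later in the URL.
Here is the code: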
#!/usr/bin/python
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import re
email = "xxx@gmail.com"
password = "xxxxx"
print "Openning Browser"
browser = webdriver.Firefox()
browser.get("https://olx.pt/account/?ref[0][action]=myaccount&ref[0][method]=index")
print "Logging into OLX"
elem = browser.find_element_by_name("login[email]")
elem.send_keys(email)
elem = browser.find_element_by_name("login[password]")
elem.send_keys(password)
elem.send_keys(Keys.RETURN)
print "Loged into OLX"
time.sleep(5)
browser.get("https://olx.pt/myaccount/answers/")
while browser.find_elements_by_css_selector("tr.unreaded"):
    print "Unreaded messages!"
    unread_answers = browser.find_elements_by_css_selector("tr.unreaded")
    for unread_row in unread_answers:
        row_id = unread_row.get_attribute("id")
        m = re.search('answer_row_(\d+)', row_id)
        row_number = m.group(1)
        print row_number
        print "First loop"
        browser.refresh()
        time.sleep(5)
else:
    print "All read!"
This is the output:
Openning Browser
Logging into OLX
Loged into OLX
Unreaded messages!
315911723
First loop
Traceback (most recent call last):
File "loginolxbackup.py", line 28, in <module>
row_id = unread_row.get_attribute("id")
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webelement.py", line 113, in get_attribute
resp = self._execute(Command.GET_ELEMENT_ATTRIBUTE, {'name': name})
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webelement.py", line 469, in _execute
return self._parent.execute(command, params)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/webdriver.py", line 201, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python2.7/dist-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.StaleElementReferenceException: Message: Element not found in the cache - perhaps the page has changed since it was looked up
Stacktrace:
at fxdriver.cache.getElementAt (resource://fxdriver/modules/web-element-cache.js:9407)
at Utils.getElementAt (file:///tmp/tmpvdAiKH/extensions/fxdriver@googlecode.com/components/command-processor.js:8992)
at WebElement.getElementAttribute (file:///tmp/tmpvdAiKH/extensions/fxdriver@googlecode.com/components/command-processor.js:12099)
at DelayedCommand.prototype.executeInternal_/h (file:///tmp/tmpvdAiKH/extensions/fxdriver@googlecode.com/components/command-processor.js:12614)
at DelayedCommand.prototype.executeInternal_ (file:///tmp/tmpvdAiKH/extensions/fxdriver@googlecode.com/components/command-processor.js:12619)
at DelayedCommand.prototype.execute/< (file:///tmp/tmpvdAiKH/extensions/fxdriver@googlecode.com/components/command-processor.js:12561)
The HTML of the page I am looking at is like this:
<tr id="answer_row_3121238" class="bla bla bla">
...
<tr id="answer_row_3121428" class="bla bla bla">
...
<tr id="answer_row_3124238" class="bla bla bla">
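I have already tried printing m, and I saw that it has 3 objects, which means it is picking up all the unread messages.
I am banging my head against the wall without any luck. Any suggestion or help would be greatly appreciated.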
Answer 0 (score: 0)
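When you call browser.refresh(), the DOM is rendered again and WebDriver loses every element it located before the refresh. Using one of those old references is what raises the StaleElementReferenceException. It is even in the stack trace: "perhaps the page has changed since it was looked up".
Either avoid refreshing the page (I don't see any need for it here), or relocate all the messages on each iteration of the for loop.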
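For the first option, one possible approach is to copy the ids out of the page as plain strings before anything refreshes it, since strings cannot go stale. This is only a minimal sketch: the helper name `collect_row_numbers` is made up for illustration, the selector `tr.unreaded` and the `answer_row_<number>` id format are taken from the question, and `find_elements_by_css_selector` matches the Selenium version used there (newer Selenium releases use `find_elements(By.CSS_SELECTOR, ...)` instead).

```python
import re

# Id format taken from the HTML in the question: answer_row_<digits>
ROW_ID_PATTERN = re.compile(r"answer_row_(\d+)")


def collect_row_numbers(browser):
    """Return the numeric ids of all unread rows as plain strings.

    Hypothetical helper: `browser` is assumed to be a WebDriver-like
    object exposing find_elements_by_css_selector, as in the question.
    """
    rows = browser.find_elements_by_css_selector("tr.unreaded")
    numbers = []
    for row in rows:
        # get_attribute may return None, so fall back to an empty string
        match = ROW_ID_PATTERN.search(row.get_attribute("id") or "")
        if match:  # skip rows whose id does not follow the expected format
            numbers.append(match.group(1))
    return numbers
```

Calling `collect_row_numbers(browser)` once, before any `browser.refresh()`, should leave you with a list of strings that stays valid however often the page reloads afterwards, so the ids can be used later when building the conversation URLs.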
Example of relocating on each iteration:
unread_answers = browser.find_elements_by_css_selector("tr.unreaded")
messages_len = len(unread_answers)
# range(messages_len) visits every row; messages_len - 1 would skip the last one
for x in range(messages_len):
    # Relocate the rows: references found before refresh() are stale
    unread_answers = browser.find_elements_by_css_selector("tr.unreaded")
    row_id = unread_answers[x].get_attribute("id")
    m = re.search(r'answer_row_(\d+)', row_id)
    row_number = m.group(1)
    print row_number
    print "First loop"
    browser.refresh()
    time.sleep(5)
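The loop above assumes that the number of unread rows stays the same across refreshes. If a row can disappear in the meantime, indexing by position could raise an IndexError. A slightly more defensive variant might look like the sketch below. The function name `read_row_numbers` and its `after_each` callback are hypothetical, and `browser` is again assumed to expose the same old-style Selenium call as the question.

```python
import re


def read_row_numbers(browser, after_each=None):
    """Collect unread row numbers, relocating the rows on every pass.

    Hypothetical helper. `after_each` is an optional callable, for
    example browser.refresh, that runs after each row is handled.
    """
    numbers = []
    index = 0
    while True:
        # Fresh lookup on every pass, so no stale references are reused
        rows = browser.find_elements_by_css_selector("tr.unreaded")
        if index >= len(rows):
            break  # fewer rows than expected: stop instead of failing
        row_id = rows[index].get_attribute("id") or ""
        match = re.search(r"answer_row_(\d+)", row_id)
        if match:
            numbers.append(match.group(1))
        if after_each is not None:
            after_each()
        index += 1
    return numbers
```

Called as `read_row_numbers(browser, browser.refresh)`, this keeps the refresh-per-row behaviour but stops cleanly once the index runs past the rows that are still marked unread.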