I'm new to using Scrapy + Selenium. My spider is as follows:
import scrapy
from selenium import webdriver

class ProductSpider(scrapy.Spider):
    name = "hotel"
    allowed_domains = ['http://www.booking.com/']
    start_urls = [
        'http://www.booking.com/hotel/in/archana-residency.en-gb.html'
    ]

    def __init__(self):
        self.driver = webdriver.Firefox()

    def parse(self, response):
        self.driver.get(response.url)
        next = self.driver.find_element_by_xpath('//strong[contains(@class, "b-tooltip-with-price-breakdown-tracker") and contains(@class, "rooms-table-room-price") and contains(@class, "red-actual-rack-rate")]')
        print(next)
        try:
            next.click()
            filename = 'example.txt'
            with open(filename, 'wb') as f:
                f.write(next.text)
        except:
            pass
        self.driver.close()
But I get these logs every time:
2016-06-09 16:38:04 [selenium.webdriver.remote.remote_connection] DEBUG: Finished Request
2016-06-09 16:38:04 [scrapy] ERROR: Spider error processing <GET http://www.booking.com/hotel/in/archana-residency.en-gb.html> (referer: None)
Traceback (most recent call last):
File "/Users/Nik/Desktop/myenv/lib/python2.7/site-packages/twisted/internet/defer.py", line 588, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Users/Nik/Desktop/myenv/scrapy_project/scrapy_project/spiders/stackoverflow_spider.py", line 19, in parse
next = self.driver.find_element_by_xpath('//strong[contains(@class, "b-tooltip-with-price-breakdown-tracker") and contains(@class, "rooms-table-room-price")]')
File "/Users/Nik/Desktop/myenv/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 293, in find_element_by_xpath
return self.find_element(by=By.XPATH, value=xpath)
File "/Users/Nik/Desktop/myenv/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 745, in find_element
{'using': by, 'value': value})['value']
File "/Users/Nik/Desktop/myenv/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 236, in execute
self.error_handler.check_response(response)
File "/Users/Nik/Desktop/myenv/lib/python2.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
raise exception_class(message, screen, stacktrace)
NoSuchElementException: Message: Unable to locate element: {"method":"xpath","selector":"//strong[contains(@class, \"b-tooltip-with-price-breakdown-tracker\") and contains(@class, \"rooms-table-room-price\")]"}
<strong
data-price-without-addons="Rs. 3,470"
data-price-with-parking=""
data-price-with-internet=""
data-price-with-internet-parking=""
class=" b-tooltip-with-price-breakdown-tracker rooms-table-room-price red-actual-rack-rate"
title=""
>
Rs. 3,470
</strong>
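(A quick sanity check, assuming lxml is available, which Scrapy itself depends on: the XPath does match the static markup above, so the selector itself looks fine and the element is presumably just not in the DOM yet at the moment Selenium queries it.)

```python
from lxml import html

# The <strong> element exactly as it appears in the rendered page source.
snippet = '''<strong
  data-price-without-addons="Rs. 3,470"
  data-price-with-parking=""
  data-price-with-internet=""
  data-price-with-internet-parking=""
  class=" b-tooltip-with-price-breakdown-tracker rooms-table-room-price red-actual-rack-rate"
  title=""
>
  Rs. 3,470
</strong>'''

tree = html.fromstring(snippet)
# Same XPath the spider uses.
matches = tree.xpath('//strong[contains(@class, "b-tooltip-with-price-breakdown-tracker") '
                     'and contains(@class, "rooms-table-room-price") '
                     'and contains(@class, "red-actual-rack-rate")]')
print(len(matches))             # 1 -> the selector matches the static HTML
print(matches[0].text.strip())  # Rs. 3,470
```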
I also tried "driver.find_element_by_css_selector", but it still gives the same error. I'd like to know what's wrong with the XPath I'm using, or whether there's something I'm missing.
Thanks in advance!
EDIT: Hi everyone, I've found the problem. For those facing the same issue as me, use WebDriverWait, since it waits up to the specified time so the page can finish all the calls it has to make and the data finally becomes available. For example (note the imports it needs):

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

element = WebDriverWait(self.driver, 15).until(
    EC.presence_of_all_elements_located((By.CLASS_NAME, "example class name"))
)
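Under the hood, WebDriverWait is essentially a polling loop: it re-evaluates the condition every half second until it returns something truthy or the timeout expires, which is why it survives pages that populate via late AJAX calls. A rough stdlib-only sketch of the idea (the names wait_until and ready_at are hypothetical, not Selenium API):

```python
import time

class TimeoutException(Exception):
    pass

def wait_until(condition, timeout=15, poll=0.5):
    # Re-evaluate `condition` every `poll` seconds until it returns a
    # truthy value or `timeout` seconds have elapsed, mimicking
    # WebDriverWait(driver, timeout).until(condition).
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = condition()
        if result:
            return result
        time.sleep(poll)
    raise TimeoutException("condition not met within %ss" % timeout)

# Toy stand-in for a page whose price element only appears after ~1s of AJAX:
ready_at = time.monotonic() + 1.0
element = wait_until(
    lambda: "Rs. 3,470" if time.monotonic() >= ready_at else None,
    timeout=5,
)
print(element)  # Rs. 3,470 -- found once the "page" finished loading
```

A plain find_element_by_xpath is a single immediate lookup, so it raises NoSuchElementException if it runs even a moment before the element exists; the polling loop above is what makes the explicit wait tolerant of that timing.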