I have this website https://www.inc.com/inc5000/list/2017, and I want my script to enter a number into the PAGE field and click GO, but I keep getting this error:
File "/Users/anhvangiang/Desktop/PY/inc.py", line 34, in scrape
driver.find_element_by_xpath('//*[@id="page-input-button"]').click()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/selenium/webdriver/remote/webelement.py", line 77, in click
self._execute(Command.CLICK_ELEMENT)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/selenium/webdriver/remote/webelement.py", line 493, in _execute
return self._parent.execute(command, params)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 256, in execute
self.error_handler.check_response(response)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error:
Element is not clickable at point (741, 697)
(Session info: chrome=61.0.3163.100)
(Driver info: chromedriver=2.30.477690
(c53f4ad87510ee97b5c3425a14c0e79780cdf262),platform=Mac OS X 10.12.6 x86_64)
Here is my code:
def scrape(num):
    driver = webdriver.Chrome('/Users/anhvangiang/Desktop/PY/chromedriver')
    driver.get(main_site)
    driver.find_element_by_id('page-input-field').send_keys(str(num))
    driver.find_element_by_xpath('//*[@id="Welcome-59"]/div[2]/div[1]/span[2]').click()
    time.sleep(5)
    driver.find_element_by_xpath('//*[@id="page-input-button"]').click()
    soup = BS(driver.page_source, 'lxml')
    container = soup.find('section', {'id': 'data-container'})
    return [source + tag.find('div', {'class': 'col i5 company'}).find('a')['href'] for tag in container.findAll('div', {'class': 'row'})]
If I put the scrape function inside a loop:
for i in range(1, 100):
    print scrape(i)
It runs smoothly for the first few values of i, but after that it throws the error above.
Any suggestions on how I can fix it?
Answer 0 (score: 2)
This happens because the button is not visible at that moment, so the Selenium WebDriver cannot interact with it. When I ran your code on my local machine, I found that the site shows a popup ad for 15-20 seconds (see the attached image: Popup_Ad), which is the actual cause of this error. To resolve it, you have to handle the popup ad, which you can do as follows.
Check for the SKIP button; if it is present, skip the ad first by clicking the skip button, and then continue with the normal flow of the code.
Additional suggestion: you should use WebDriverWait to avoid element-not-found and element-not-clickable issues. For example, you could write the above code as:
from selenium import webdriver
from bs4 import BeautifulSoup as BS
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support import expected_conditions as EC
import time

def scrape(num, wdriver):
    # define webdriver
    driver = wdriver

    # navigate to url
    driver.get("https://www.inc.com/inc5000/list/2017")

    # define default wait time
    wait = WebDriverWait(driver, 10)

    # handle Ad Popup: click SKIP if it appears, otherwise move on
    try:
        skip_button = wait.until(EC.element_to_be_clickable((By.XPATH, "//*[@id='Welcome-59']/div[2]/div[1]/span[2]")))
        skip_button.click()
        print("\nSkip Button Clicked")
    except TimeoutException:
        pass

    time.sleep(5)

    # find the page number input field
    page_num_elem = wait.until(EC.visibility_of_element_located((By.ID, "page-input-field")))
    page_num_elem.clear()
    page_num_elem.send_keys(num)
    time.sleep(2)

    while True:
        try:
            # find go button
            go_button = wait.until(EC.element_to_be_clickable((By.ID, "page-input-button")))
            go_button.click()
            break
        except TimeoutException:
            print("Retrying...")
            continue

    # scrape data
    soup = BS(driver.page_source, 'lxml')
    container = soup.find('section', {'id': 'data-container'})
    return [tag.find('div', {'class': 'col i5 company'}).find('a')['href'] for tag in container.findAll('div', {'class': 'row'})]

if __name__ == "__main__":
    # create webdriver instance
    driver = webdriver.Chrome()
    for num in range(1, 6):
        print("\nPage Number : %s" % num)
        print(scrape(num, driver))
        print(90 * "-")
    driver.quit()
Answer 1 (score: -1)
It worked for me:
Actions action = new Actions(Browser.WebDriver);
action.MoveToElement(element).Click().Perform();