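I'm trying to work through the dropdown menus at the following URL: https://www.accuform.com/safety-sign/danger-danger-authorized-personnel-only-MADM006
For example, the first dropdown under Options lists different materials. I want to select each material in turn, collect some other information from the page, and then move on to the next material. Here is my current code: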
import time
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support.ui import Select

driver = webdriver.Firefox()
driver.get('https://www.accuform.com/safety-sign/danger-danger-authorized-personnel-only-MADM006')
time.sleep(3)
driver.find_element_by_id('x-mark-icon').click()
select = Select(driver.find_element_by_name('Wiqj7mb4rsAq9LB'))
options = select.options
optionsList = []
driver.find_elements_by_class_name('select-wrapper')[0].click()
element = driver.find_element_by_xpath("//select[@name='Wiqj7mb4rsAq9LB']")
actions = ActionChains(driver)
actions.move_to_element(element).perform()
# driver.execute_script("arguments[0].scrollIntoView();", element)
for option in options:  # iterate over the options, place attribute value in list
    optionsList.append(option.get_attribute("value"))
for optionValue in optionsList:
    print("starting loop on option %s" % optionValue)
    # select = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.XPATH, "//select[@name='Wiqj7mb4rsAq9LB']")))
    # select = Select(select)
    select.select_by_value(optionValue)
I'm only starting with the loop, but I ran into this error:
ElementNotInteractableException: Message: Element <option> could not be scrolled into view
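Then I added a WebDriverWait and got a TimeoutException instead.
Then I realized I should probably click the wrapper that contains the dropdown, so I added the click. That does pop up the menu, but I still get the TimeoutException.
So I thought maybe I should move to the element, which I tried with the ActionChains lines, and that produced this error: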
WebDriverException: Message: TypeError: rect is undefined
I tried to avoid that error by using this line instead:
# driver.execute_script("arguments[0].scrollIntoView();", element)
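That just resulted in the TimeoutException again.
I'm fairly new to Python and Selenium and have basically just been adapting code from SO answers to similar questions, but nothing has worked.
I'm using Python 3.6 and the current versions of Selenium and the Firefox webdriver.
If anything is unclear or you need more information, please let me know.
Thanks so much!
EDIT: Based on Kajal Kunda's answer and comments, I updated my code to the following: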
material_dropdown = driver.find_element_by_xpath("//input[@class='select-dropdown']")
driver.execute_script("arguments[0].click();", material_dropdown)
materials = driver.find_elements_by_css_selector("div.select-wrapper ul.dropdown-content li")
for material in materials:
    # material_dropdown = driver.find_element_by_xpath("//input[@class='select-dropdown']")
    # driver.execute_script("arguments[0].click();", material_dropdown)
    # materials=driver.find_elements_by_css_selector("div.select-wrapper ul.dropdown-content li")
    material_ele = material.find_element_by_tag_name('span')
    if material_ele.text != '':
        material_ele.click()
        time.sleep(5)
        price = driver.find_element_by_class_name("dataPriceDisplay")
        print(price.text)
The result is that it successfully prints the price for the first material, but then it returns:
StaleElementReferenceException: Message: The element reference of <li class=""> is stale;...
I've tried variations with the commented-out lines inside and outside the loop, but I always get some version of the StaleElementReferenceException.
Any suggestions?
Thanks!
Answer 0 (score: 1)
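You can do the whole thing with requests. Grab the value attributes of the options listed in the dropdowns, then concatenate them into a request URL that retrieves JSON containing all the information on the page. The same principle applies to adding the other dropdown values. The id for each dropdown selection is the value attribute of the chosen option, and the ids appear in the URL I show below separated by //.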
import requests
from bs4 import BeautifulSoup as bs
url = 'https://www.accuform.com/product/getSku/danger-danger-authorized-personnel-only-MADM006/1/false/null//{}//WHFIw3xXmQx8zlz//6wr93DdrFo5JV//WdnO0RpwKpc4fGF'
startURL = 'https://www.accuform.com/safety-sign/danger-danger-authorized-personnel-only-MADM006'
res = requests.get(startURL)
soup = bs(res.content, 'lxml')
materials = [item['value'] for item in soup.select('#Wiqj7mb4rsAq9LB option')]
sizes = [item['value'] for item in soup.select('#WvXESrTyQjM3Ciw option')]
languages = [item['value'] for item in soup.select('#WUYWGMePtpmpmhy option')]
units = [item['value'] for item in soup.select('#W91eqaJ0WPXwe9b option')]
for material in materials:
    data = requests.get(url.format(material)).json()
    soup = bs(data['dataMaterialBullets'], 'lxml')
    lines = [item.text for item in soup.select('li')]
    print(lines)
    print(data['dataPriceDisplay'])
    # etc......
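The same idea extends to the other dropdowns. Here is a minimal sketch that only builds the URLs and has not been run against the live site. It assumes the four `//`-separated slots are filled in the order material, size, language, unit (that order is a guess from the URL above), and `build_sku_urls` is a hypothetical helper name, not something the site or requests provides.

```python
from itertools import product

# Base of the getSku endpoint, copied from the URL above.
BASE = ('https://www.accuform.com/product/getSku/'
        'danger-danger-authorized-personnel-only-MADM006/1/false/null')


def build_sku_urls(materials, sizes, languages, units):
    """Return one URL per combination of dropdown values.

    Assumption: the slots are ordered material, size, language, unit.
    """
    urls = []
    for combo in product(materials, sizes, languages, units):
        # Each selected id is preceded by '//' in the URL shown above.
        urls.append(BASE + ''.join('//' + value for value in combo))
    return urls


if __name__ == '__main__':
    # Placeholder ids for illustration; real ones come from the option values.
    for url in build_sku_urls(['mat1', 'mat2'], ['size1'], ['lang1'], ['unit1']):
        print(url)
```

Each URL can then be fetched with `requests.get(...).json()` exactly as in the loop above.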
Answer 1 (score: 0)
Try the code below. It should work.
import time
from selenium import webdriver

driver = webdriver.Firefox()
driver.get('https://www.accuform.com/safety-sign/danger-danger-authorized-personnel-only-MADM006')
time.sleep(3)
driver.find_element_by_id('x-mark-icon').click()
material_dropdown = driver.find_element_by_xpath("//input[@class='select-dropdown']")
driver.execute_script("arguments[0].click();", material_dropdown)
# Code for material dropdown
materials = driver.find_elements_by_css_selector("div.select-wrapper ul.dropdown-content li")
material_optionsList = []
for material in materials:
    material_ele = material.find_element_by_tag_name('span')
    if material_ele.text != '':
        material_optionsList.append(material_ele.text)
print(material_optionsList)
driver.execute_script("arguments[0].click();", material_dropdown)
size_dropdown = driver.find_element_by_xpath("(//input[@class='select-dropdown'])[2]")
driver.execute_script("arguments[0].click();", size_dropdown)
# Code for size dropdown
Sizes = driver.find_elements_by_css_selector("div.select-wrapper ul.dropdown-content li")
size_optionsList = []
for size in Sizes:
    size_ele = size.find_element_by_tag_name('span')
    if size_ele.text != '':
        size_optionsList.append(size_ele.text)
driver.execute_script("arguments[0].click();", size_dropdown)
Output:
[u'Adhesive Vinyl', u'Plastic', u'Adhesive Dura-Vinyl', u'Aluminum', u'Dura-Plastic\u2122', u'Aluma-Lite\u2122', u'Dura-Fiberglass\u2122', u'Accu-Shield\u2122']
Hopefully you can do the rest. Let me know if it works for you.
EDIT: code to loop through the materials and get the price for each one.
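The fixed `time.sleep` calls in the code above always wait the full duration, even when the page is ready sooner. As a minimal sketch in plain Python, a polling helper could look like the following. `wait_until` is a hypothetical helper written for illustration and is not part of Selenium; Selenium's own `WebDriverWait`, mentioned in the question, serves the same purpose.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll condition() until it returns a truthy value or timeout elapses.

    Returns the truthy value, or raises TimeoutError if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError('condition not met within %.1f s' % timeout)
        time.sleep(interval)
```

The `condition` argument could, for example, be a function that looks up the price element and returns its text once it is non-empty.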
for material in range(len(materials)):
    material_ele = materials[material]
    if material_ele.text != '':
        #material_optionsList.append(material_ele.text)
        #material_ele.click()
        driver.execute_script("arguments[0].click();", material_ele)
        time.sleep(2)
        price = driver.find_element_by_id("priceDisplay")
        print(price.text)
        time.sleep(2)
        # Re-open the dropdown and look the list up again so the references are fresh
        material_dropdown = driver.find_element_by_xpath("//input[@class='select-dropdown']")
        driver.execute_script("arguments[0].click();", material_dropdown)
        materials = driver.find_elements_by_css_selector("div.select-wrapper ul.dropdown-content li")
        # Note: this has no effect on the iteration; range() reassigns material on the next pass
        material += 2
Output:
$8.31
$9.06
$13.22
$15.91
$15.91
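The loop above looks the list up again after every click, so it never reuses a reference obtained before the selection, presumably because the page rebuilds the list each time. Here is a minimal, Selenium-free sketch of that pattern. `visit_each` and `FakePage` are hypothetical names made up for illustration, and `FakePage` only imitates a page that invalidates old references on each click.

```python
def visit_each(find_items, act):
    """Call act() on every item, re-locating the list before each call.

    find_items: callable returning the current list of elements.
    act: callable applied to one freshly located element.
    """
    results = []
    total = len(find_items())
    for index in range(total):
        items = find_items()  # fresh references after any page update
        if index >= len(items):
            break  # the list shrank; stop rather than raise IndexError
        results.append(act(items[index]))
    return results


class FakePage:
    """Stand-in for a page that rebuilds its list after every click."""

    def __init__(self, labels):
        self.labels = labels
        self.generation = 0

    def find_items(self):
        # Each item remembers which render of the page it came from.
        return [(self.generation, label) for label in self.labels]

    def click(self, item):
        generation, label = item
        if generation != self.generation:
            raise RuntimeError('stale element: ' + label)
        self.generation += 1  # the click re-renders the list
        return label


if __name__ == '__main__':
    page = FakePage(['Plastic', 'Aluminum'])
    print(visit_each(page.find_items, page.click))
```

With Selenium, `find_items` could be something like `lambda: driver.find_elements_by_css_selector("div.select-wrapper ul.dropdown-content li")`, and `act` a function that clicks the item and reads the price.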