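I am trying to scrape data from a page using Selenium. It worked last week, but something changed this week and now it no longer works. The problem is the "Show more" button, labelled "Prikaži broj" on the website. I am scraping many pages, but let's focus on just one of them.
The code is: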
options = Options()
options.headless = True
driver = webdriver.Chrome('/Users/Nenad/chromedriver', options=options)
driver.get('https://www.nekretnine.rs/stambeni-objekti/stanovi/zvezdara-konjarnik-milica-rakica-57m2-milica-rakica/NkJXDiY2ugE/')
try:
    element = driver.find_element_by_css_selector('div.row:nth-child(2) > div:nth-child(2) > div:nth-child(1) > div:nth-child(1) > form:nth-child(2) > button:nth-child(2)').click()
    sleep(randint(3, 5))
    home_phone = driver.find_element_by_css_selector('div.row:nth-child(2) > div:nth-child(2) > div:nth-child(1) > div:nth-child(1) > form:nth-child(2) > span:nth-child(1)')
    condo_agency_cell_phones.append(home_phone.text)
except:
    condo_agency_cell_phones.append('NaN')
try:
    element = driver.find_element_by_css_selector('div.row:nth-child(2) > div:nth-child(2) > div:nth-child(1) > div:nth-child(1) > form:nth-child(4) > button:nth-child(2)').click()
    sleep(randint(3, 5))
    cell_phone = driver.find_element_by_css_selector('div.row:nth-child(2) > div:nth-child(2) > div:nth-child(1) > div:nth-child(1) > form:nth-child(4) > span:nth-child(1)')
    condo_agency_cell_phones.append(cell_phone.text)
except:
    condo_agency_cell_phones.append('NaN')
driver.close()
Last week it worked with XPath, but now it does not. I can even locate the button, but it does not get clicked:
options = Options()
options.headless = False
driver = webdriver.Chrome('/Users/Nenad/chromedriver', options=options)
driver.get('https://www.nekretnine.rs/stambeni-objekti/stanovi/zvezdara-konjarnik-milica-rakica-57m2-milica-rakica/NkJXDiY2ugE/')
sleep(20)
try:
    element = driver.find_element_by_xpath("//button[@type='button']").click()
    print(element.text)
except:
    print('NaN')
Answer 0 (score: 1)
Try a CSS selector instead: find_element_by_css_selector('button[type="button"]')
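Whichever locator ends up matching the button, one detail in the question's second snippet is worth checking: it stores the return value of .click() in element and then reads element.text. In Selenium's Python bindings click() returns None, so that read raises AttributeError, the bare except catches it, and 'NaN' is printed even if the click itself went through. The sketch below reproduces the pattern without a browser. StubElement and both helper functions are hypothetical stand-ins written for this illustration, not Selenium APIs.

```python
class StubElement:
    """Hypothetical stand-in for a Selenium WebElement (no browser needed)."""

    def __init__(self, text):
        self.text = text
        self.clicked = False

    def click(self):
        # Like WebElement.click(), this performs the action and returns None.
        self.clicked = True


def click_then_read_buggy(element):
    # Mirrors the question: the *result* of click() is kept, not the element.
    result = element.click()  # result is None
    return result.text        # raises AttributeError on None


def click_then_read_fixed(element):
    # Keep a reference to the element, click it, then read its text.
    element.click()
    return element.text
```

In the real script the same fix would mean assigning the located element to a variable first and calling .click() on it in a separate statement.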
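Answer 1 (score: 0)
If the first answer does not solve your problem, try this. I changed some of the imports. In your code above, the "try:" blocks fail because the libraries were never imported, so the undefined names raise an error, and the bare "except:" hides that error and appends 'NaN' instead.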
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
# For the Firefox alternative below, import Options from selenium.webdriver.firefox.options instead.
from time import sleep

options = Options()
options.headless = True
driver = webdriver.Chrome('/Users/Nenad/chromedriver', options=options)
# driver = webdriver.Firefox(executable_path=r'C:\\Py\\geckodriver.exe');
driver.get('https://www.nekretnine.rs/stambeni-objekti/stanovi/zvezdara-konjarnik-milica-rakica-57m2-milica-rakica/NkJXDiY2ugE/')
condo_agency_cell_phones = []

try:
    # Click the first "Prikaži broj" button; click() returns None, so do not assign it.
    driver.find_element_by_css_selector('div.row:nth-child(2) > div:nth-child(2) > div:nth-child(1) > div:nth-child(1) > form:nth-child(2) > button:nth-child(2)').click()
    # sleep(randint(3, 5))
    sleep(4)
    # home_phone1 = driver.find_element_by_xpath("html/body/div[11]/div[1]/div[2]/div[1]/div/div[2]/div[2]/div/div/form[1]/span")
    # condo_agency_cell_phones.append(home_phone1.text)
    home_phone = driver.find_element_by_css_selector('div.row:nth-child(2) > div:nth-child(2) > div:nth-child(1) > div:nth-child(1) > form:nth-child(2) > span:nth-child(1)')
    print(home_phone.text)
    condo_agency_cell_phones.append(home_phone.text)
except Exception as exc:
    # Show why the lookup failed instead of silently swallowing the error.
    print(exc)
    condo_agency_cell_phones.append('NaN')

try:
    # Same steps for the second button and its number.
    driver.find_element_by_css_selector('div.row:nth-child(2) > div:nth-child(2) > div:nth-child(1) > div:nth-child(1) > form:nth-child(4) > button:nth-child(2)').click()
    # sleep(randint(3, 5))
    sleep(4)
    cell_phone = driver.find_element_by_css_selector('div.row:nth-child(2) > div:nth-child(2) > div:nth-child(1) > div:nth-child(1) > form:nth-child(4) > span:nth-child(1)')
    condo_agency_cell_phones.append(cell_phone.text)
except Exception as exc:
    print(exc)
    condo_agency_cell_phones.append('NaN')

print(condo_agency_cell_phones)
driver.close()
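The fixed sleep(4) calls above either waste time or turn out too short if the page is slow, and a catch-all except can hide unrelated mistakes such as a missing import. Selenium ships WebDriverWait for the waiting part. The sketch below is only a standard-library stand-in for the same idea so that it runs without a browser: it retries an action until it succeeds or a timeout passes, and falls back to 'NaN' only for the exception types it was told to expect. The names poll_until and read_or_nan, and the use of LookupError as the "not there yet" signal, are assumptions made for this example.

```python
import time


def poll_until(action, timeout=10.0, interval=0.5, retry_on=(LookupError,)):
    """Call action() repeatedly until it succeeds or `timeout` seconds pass.

    Only exceptions listed in `retry_on` are retried. Anything else, for
    example a NameError caused by a missing import, propagates immediately.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            return action()
        except retry_on:
            if time.monotonic() >= deadline:
                raise  # give up and let the caller decide what to do
            time.sleep(interval)


def read_or_nan(action, timeout=10.0, interval=0.5, retry_on=(LookupError,)):
    """Return the polled value, or 'NaN' if it never became available."""
    try:
        return poll_until(action, timeout, interval, retry_on)
    except retry_on:
        return "NaN"
```

With Selenium, action would presumably be a small function wrapping the locator call and returning the element's text, and retry_on would be NoSuchElementException from selenium.common.exceptions, so that each phone number is appended as soon as its span appears rather than after a fixed delay.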