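Selenium - Unable to find element in page source

Time: 2019-09-18 14:36:55

Tags: python selenium selenium-webdriver xpath webdriverwait

I am trying to scrape a web page with Selenium, but for some reason the elements I need do not appear in the page source.

I have tried using WebDriverWait to wait until the page has loaded. I have also checked whether the data sits in a different frame that I would need to switch to.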

driver.get('https://foreclosures.cabarruscounty.us/')

try:
    WebDriverWait(driver, delay).until(EC.presence_of_element_located((By.XPATH,'//*[@id="app"]/div[5]/div/div')))
    print("Page is ready!")

    web_url = driver.page_source
    print(web_url)

except TimeoutException:
    print("Loading took too much time!")

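I expect to see all the records for each individual property card so that I can extract them. However, the page source does not show any of this data.

If I load the page manually and view the source, the data is not there either: view-source:https://foreclosures.cabarruscounty.us/

3 Answers:

Answer 0: (score: 1)

Try the code below. It returns all the elements. It uses visibility_of_all_elements_located().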

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://foreclosures.cabarruscounty.us/")
# Wait up to 30 seconds until every element matching the XPath is visible.
elements = WebDriverWait(driver, 30).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@id='app']//div[@class='card-body']/div[1]")))
allrecord = [ele.text for ele in elements]
print(allrecord)  # prints the text of every record

To print only the first element's value:

print(allrecord[0].splitlines())

You will get the following output:

['Real ID: 04-086 -0040.00', 'Status: SALE SCHEDULED', 'Case Number: 18-CVD-2804', 'Tax Value: $29,660', 'Min Bid: $10,067', 'Sale Date: 10/03/2019', 'Sale Time: 12:00 PM', 'Owner: DOUGLAS JAMES W', 'Attorney: ZACCHAEUS LEGAL SVCS']
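
Each entry in that list is a "Label: value" string, so a record can be turned into a dictionary for easier lookup. Here is a minimal sketch, assuming every useful line contains a colon followed by a space; `parse_record` is a hypothetical helper name, not part of Selenium:

```python
def parse_record(text):
    """Turn the multi-line text of one card into a {label: value} dict."""
    record = {}
    for line in text.splitlines():
        # Split on the first ": " only, so values such as "12:00 PM" stay intact.
        label, sep, value = line.partition(": ")
        if sep:  # skip lines that do not look like "Label: value"
            record[label.strip()] = value.strip()
    return record


# Sample text built from the output above (assumed to match one card's ele.text).
sample = "Real ID: 04-086 -0040.00\nStatus: SALE SCHEDULED\nSale Time: 12:00 PM"
print(parse_record(sample)["Sale Time"])  # prints: 12:00 PM
```

With the answer's list this would be called as `parse_record(allrecord[0])`.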

Answer 1: (score: 1)

To extract the first Real ID, Case Number, and Owner fields, you have to induce WebDriverWait for visibility_of_element_located(), and you can use the following Locator Strategies:

  • Code block:

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC
    
    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument("start-maximized")
    chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
    chrome_options.add_experimental_option('useAutomationExtension', False)
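    # Note: executable_path works with Selenium 3; Selenium 4 deprecates it in favor of a Service object.
    driver = webdriver.Chrome(options=chrome_options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
    driver.get("https://foreclosures.cabarruscounty.us/")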
    Real_ID = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='row']//div[@class='card cardClass']/img//following::div[@class='card-body']//div/b"))).text
    Case_Number = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='row']//div[@class='card cardClass']/img//following::div[@class='card-body']//div//following-sibling::b[2]"))).text
    Owner = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='row']//div[@class='card cardClass']/img//following::div[@class='card-body']//div//following-sibling::b[7]"))).text
    print("{} is {} owned by {}".format(Real_ID,Case_Number,Owner))
    driver.quit()
    
  • Console output:

    Real ID: 04-086 -0040.00 is Case Number: 18-CVD-2804 owned by Owner: DOUGLAS JAMES W
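
The element text still carries its label ("Real ID: ...", "Owner: ..."), which is why the labels show up inside the printed sentence. If only the values are wanted, the label can be dropped before formatting. This is a rough sketch, assuming the text always has the form "Label: value"; `field_value` is a hypothetical helper:

```python
def field_value(text):
    """Return the part after the first ': ', or the text unchanged if absent."""
    label, sep, value = text.partition(": ")
    return value if sep else text


# Strings copied from the console output above.
real_id = field_value("Real ID: 04-086 -0040.00")
case_number = field_value("Case Number: 18-CVD-2804")
owner = field_value("Owner: DOUGLAS JAMES W")
print("{} is case {} owned by {}".format(real_id, case_number, owner))
```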
    

Answer 2: (score: 0)

You can use ImplicitWait and PageLoad to wait for elements:

//For 30 seconds
driver.Manage().Timeouts().ImplicitWait = TimeSpan.FromSeconds(30);
driver.Manage().Timeouts().PageLoad = TimeSpan.FromSeconds(30);

This code is for C# with Selenium.
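
Since the question is tagged python, a rough equivalent with the Python bindings might look like the sketch below. It assumes `driver` is an already created WebDriver instance; `configure_timeouts` is a hypothetical wrapper name, while `implicitly_wait` and `set_page_load_timeout` are the WebDriver methods that correspond to the two C# properties:

```python
def configure_timeouts(driver, seconds=30):
    """Apply the same two timeouts as the C# snippet to a Python WebDriver."""
    # Implicit wait: element lookups keep polling for up to `seconds`.
    driver.implicitly_wait(seconds)
    # Page-load timeout: driver.get() raises TimeoutException after `seconds`.
    driver.set_page_load_timeout(seconds)
    return driver
```

Typical use would be `driver = configure_timeouts(webdriver.Chrome())` before calling `driver.get(...)`. Note that the Selenium documentation advises against mixing implicit waits with explicit waits such as WebDriverWait, so this is probably best treated as an alternative to the other answers rather than an addition to them.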