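I'm currently trying to parse hrefs from this site: https://jobs.gecareers.com/global/en/search-results?from=0&s=1
If you open the site you should see the job titles. Inspect one of them and you should see an A tag with an href inside it. I'm trying to grab those links and put them in a list.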
from selenium import webdriver
from selenium.webdriver.chrome.webdriver import WebDriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
PATH = r"D:\Criver\chromedriver.exe"
driver = webdriver.Chrome(PATH)
LIST = []
driver.get('https://jobs.gecareers.com/global/en/search-results?from=0&s=1')
#links=driver.find_elements_by_tag_name("a.job_click")
elements = WebDriverWait(driver, 10).until(
    EC.presence_of_all_elements_located((By.ID, "a.href"))
)
for ele in elements:
    LIST.append(ele.text)
print(LIST)
I'm no Selenium expert, though I have used it before, and for some reason I can't get Selenium to return the href inside the tag. What should I do?
Answer 0 (score: 0)
Use PARTIAL_LINK_TEXT:
job_link = driver.find_element_by_partial_link_text('SAVED JOBS')
job_link.click()
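Answer 1 (score: 0)
Your locator is wrong. It should be a CSS selector, not an ID. With By.ID, Selenium looks for an element whose id attribute is literally "a.href".
Try this; it should work better: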
import time
# wait until at least one job link is present in the DOM
WebDriverWait(driver, 10).until(
    EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'a[data-ph-at-id="job-link"]'))
)
time.sleep(5)
elements = driver.find_elements_by_css_selector('a[data-ph-at-id="job-link"]')
for ele in elements:
    LIST.append(ele.text)
print(LIST)
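The code above collects the text of those elements. If you want the href value, i.e. the link itself, use this in the final part of the code instead: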
for ele in elements:
    LIST.append(ele.get_attribute('href'))
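
Putting the pieces together, here is a minimal sketch rather than a definitive implementation. It assumes Selenium 4, where the find_elements_by_* helpers were removed in favour of find_elements(By..., ...), and that chromedriver can be found without an explicit path (on PATH, or through Selenium Manager in recent releases). The names collect_hrefs and fetch_job_links are made up for this example. It also reuses the list that the wait returns, so no sleep or second lookup is needed.

```python
from typing import Iterable, List


def collect_hrefs(elements: Iterable) -> List[str]:
    """Return the href of every element that has one, in page order."""
    links = []
    for ele in elements:
        # get_attribute returns None when the attribute is missing
        href = ele.get_attribute("href")
        if href:
            links.append(href)
    return links


def fetch_job_links(url: str, timeout: float = 10) -> List[str]:
    """Open url in Chrome, wait for the job links, and return their hrefs."""
    # Imported here so collect_hrefs can be tried without a browser installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumption: chromedriver is discoverable without passing a path.
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # until() returns whatever the condition returns: here, the list
        # of matching elements once at least one is present in the DOM.
        elements = WebDriverWait(driver, timeout).until(
            EC.presence_of_all_elements_located(
                (By.CSS_SELECTOR, 'a[data-ph-at-id="job-link"]')
            )
        )
        return collect_hrefs(elements)
    finally:
        # Close the browser even if the wait times out.
        driver.quit()
```

Calling fetch_job_links('https://jobs.gecareers.com/global/en/search-results?from=0&s=1') should then give you a list of URLs, provided the page still marks its job links with data-ph-at-id="job-link".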