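I am building a crawler with Python 3.6 and Beautiful Soup. Here is my code.
When I run it, I get a "cannot find element" exception. Why? What I want to do is select the URI and then click the name URI to open a new page.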
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
import re
import pandas as pd
import os
url = "https://www.codechef.com/ratings/all?order=asc&page=3&sortBy=global_rank"
# create a new Firefox session
driver = webdriver.Firefox()
driver.get(url)
soup_level1=BeautifulSoup(driver.page_source, 'lxml')
datalist = [] #empty list
x = 488
for link in soup_level1.find_all('a', id=re.compile(r"^ember")):
    elemnt2222=driver.find_element_by_xpath("//*[@id='ember"+str(493)+"']/td[2]/div[2]/a")
    python_button = elemnt2222
    python_button.click() #click link
Answer 0 (score: 1)
You don't have to click. Just get the href from the anchor and navigate to that URL.
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
import re
import pandas as pd
import os
import time
url = "https://www.codechef.com/ratings/all?order=asc&page=3&sortBy=global_rank"
driver = webdriver.Firefox()
driver.get(url)
#Takes some time to load
time.sleep(5)
soup=BeautifulSoup(driver.page_source, 'lxml')
links = soup.select('div.user-name > a')
for link in links:
    print(link.get('href'))
This gives you the following result:
/users/sumeet_varma
/users/fjzzq2002
/users/dreamoon4
/users/y0105w49
/users/nblt
/users/dzhulgakov
/users/uwi
/users/Fcdkbear
/users/austin990301
/users/KADR
/users/adkroxx
/users/kostroma
/users/fhlasek
/users/argos
/users/watcher
/users/nafis
/users/scli
/users/mister
/users/iwiwi
/users/aurinegro
After that, you can navigate to
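
The hrefs printed above are relative paths, so they probably need to be resolved against the site before they can be passed to driver.get. Here is a minimal sketch of one way to do that with urljoin from the standard library. The names absolute_urls and visit_all are hypothetical helpers made up for illustration, and the two hrefs are simply copied from the output above:

```python
from urllib.parse import urljoin

# The page the hrefs were scraped from (same URL as in the answer).
BASE_URL = "https://www.codechef.com/ratings/all?order=asc&page=3&sortBy=global_rank"


def absolute_urls(base_url, hrefs):
    """Resolve relative hrefs against the page they came from (hypothetical helper)."""
    return [urljoin(base_url, href) for href in hrefs if href]


def visit_all(driver, urls):
    """Open each URL in turn with an already created Selenium driver (hypothetical helper)."""
    for page_url in urls:
        driver.get(page_url)


# Two hrefs copied from the output above, used here as sample input.
hrefs = ["/users/sumeet_varma", "/users/fjzzq2002"]
profile_urls = absolute_urls(BASE_URL, hrefs)
for page_url in profile_urls:
    print(page_url)
```

In the real script you would build hrefs with [link.get('href') for link in links] and then call visit_all(driver, profile_urls). This avoids clicking altogether, so it does not depend on the generated ember ids being present.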