Beautiful Soup: element not found exception

Date: 2018-07-19 16:34:44

Tags: python-3.x beautifulsoup web-crawler

I am building a crawler using Python 3.6 and Beautiful Soup. My code is below.

When I run it, I get an element-not-found exception. Why? What I want to do is select the link, then click the user-name link to open the new page.

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
import re
import pandas as pd
import os
url = "https://www.codechef.com/ratings/all?order=asc&page=3&sortBy=global_rank"


# create a new Firefox session
driver = webdriver.Firefox()
driver.get(url)
soup_level1=BeautifulSoup(driver.page_source, 'lxml')

datalist = [] #empty list
x = 488 
for link in soup_level1.find_all('a', id=re.compile(r"^ember")):

    elemnt2222=driver.find_element_by_xpath("//*[@id='ember"+str(493)+"']/td[2]/div[2]/a")

    python_button = elemnt2222
    python_button.click() #click link

1 answer:

Answer 0 (score: 1)

You don't have to click. Just get the href from the anchor and navigate to that URL.

from selenium import webdriver
from selenium.webdriver.common.keys import Keys    
from bs4 import BeautifulSoup
import re
import pandas as pd
import os
import time

url = "https://www.codechef.com/ratings/all?order=asc&page=3&sortBy=global_rank"
driver = webdriver.Firefox()
driver.get(url)
# The page takes some time to load
time.sleep(5)
soup = BeautifulSoup(driver.page_source, 'lxml')

# Each user name is an anchor directly inside a div with class "user-name"
links = soup.select('div.user-name > a')
for link in links:
    print(link.get('href'))

This gives you the following result:

/users/sumeet_varma
/users/fjzzq2002
/users/dreamoon4
/users/y0105w49
/users/nblt
/users/dzhulgakov
/users/uwi
/users/Fcdkbear
/users/austin990301
/users/KADR
/users/adkroxx
/users/kostroma
/users/fhlasek
/users/argos
/users/watcher
/users/nafis
/users/scli
/users/mister
/users/iwiwi
/users/aurinegro
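The fixed `time.sleep(5)` in the code above is a guess at how long the page needs. If it is too short the list comes back empty, and if it is too long the time is wasted. A minimal sketch of an alternative is to poll until the links appear or a timeout expires. The helper name `wait_for` and its parameters are hypothetical, not part of Selenium or Beautiful Soup; Selenium's own `WebDriverWait` is built on the same idea.

```python
import time


def wait_for(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() until it returns a truthy value, then return that value.

    Raises TimeoutError if nothing truthy is returned within `timeout` seconds.
    (Hypothetical helper for illustration only.)
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Possible usage with the answer's objects (assumes `driver` and
# `BeautifulSoup` are set up as in the code above):
#
# links = wait_for(
#     lambda: BeautifulSoup(driver.page_source, 'lxml').select('div.user-name > a')
# )
```

Because an empty list is falsy, the loop keeps re-parsing `driver.page_source` until the selector matches at least one anchor.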

After that, you can navigate to

https://www.codechef.com/users/username
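The hrefs printed above are relative, so they need to be joined with the site root before the driver can open them. The following is a rough sketch of that step, not a definitive implementation. The names `BASE_URL`, `to_profile_urls` and `visit_profiles` are made up for illustration, and `driver` is assumed to be the Selenium session from the answer.

```python
from urllib.parse import urljoin

# Assumed site root, taken from the URL used in the question.
BASE_URL = "https://www.codechef.com/"


def to_profile_urls(hrefs, base_url=BASE_URL):
    """Turn relative hrefs such as '/users/uwi' into absolute URLs.

    Entries that are None or empty (anchors without an href) are skipped.
    """
    return [urljoin(base_url, href) for href in hrefs if href]


def visit_profiles(driver, hrefs):
    """Open each profile page in turn and return its HTML keyed by URL.

    `driver` is any object with get(url) and page_source, for example the
    Firefox session created in the answer. Content rendered by JavaScript
    may need an extra wait after get() before page_source is complete.
    """
    pages = {}
    for profile_url in to_profile_urls(hrefs):
        driver.get(profile_url)
        pages[profile_url] = driver.page_source
    return pages


# Possible usage with the answer's `links` list:
# pages = visit_profiles(driver, [link.get('href') for link in links])
```

Navigating by URL avoids looking up elements by their `ember...` ids, which is the lookup that failed in the question.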