How to scrape a web page after finding an element by XPath and clicking it

Date: 2019-02-08 02:02:37

Tags: python-3.x selenium-webdriver beautifulsoup selenium-chromedriver

from selenium import webdriver
from bs4 import BeautifulSoup
driver = webdriver.Chrome(r"C:\Users\Matang\Desktop\chromedriver_win32 (1)\chromedriver.exe")
driver.get("https://turo.com/search?airportCode=EWR&customDelivery=true&defaultZoomLevel=11&endDate=04%2F05%2F2019&endTime=11%3A00&international=true&isMapSearch=false&itemsPerPage=200&location=EWR&locationType=Airport&maximumDistanceInMiles=30&sortType=RELEVANCE&startDate=03%2F05%2F2019&startTime=10%3A00")
driver.find_element_by_xpath("""//*[@id="pageContainer-content"]/div[4]/div/div[1]/div[2]/div[1]/div/div/div[1]/div/div[1]/div/div/a""").click()

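I want to get the information from the page that the XPath link above opens and extract it, but I always get the information for the original URL instead. Can someone please help?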

The URL is https://turo.com/search?airportCode=EWR&customDelivery=true&defaultZoomLevel=11&endDate=04%2F05%2F2019&endTime=11%3A00&international=true&isMapSearch=false&itemsPerPage=200&location=EWR&locationType=Airport&maximumDistanceInMiles=30&sortType=RELEVANCE&startDate=03%2F05%2F2019&startTime=10%3A00

1 Answer:

Answer 0 (score: 0)

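After the click, just grab the HTML source and parse it. You can do that with Selenium itself, but I prefer BeautifulSoup because I'm more familiar with it. So you can add it to your code like this: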

from selenium import webdriver
from bs4 import BeautifulSoup
driver = webdriver.Chrome(r"C:\Users\Matang\Desktop\chromedriver_win32 (1)\chromedriver.exe")
driver.get("https://turo.com/search?airportCode=EWR&customDelivery=true&defaultZoomLevel=11&endDate=04%2F05%2F2019&endTime=11%3A00&international=true&isMapSearch=false&itemsPerPage=200&location=EWR&locationType=Airport&maximumDistanceInMiles=30&sortType=RELEVANCE&startDate=03%2F05%2F2019&startTime=10%3A00")
driver.find_element_by_xpath("""//*[@id="pageContainer-content"]/div[4]/div/div[1]/div[2]/div[1]/div/div/div[1]/div/div[1]/div/div/a""").click()

soup = BeautifulSoup(driver.page_source, 'html.parser')

# Start finding and grabbing the tags and elements in `soup`
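
As a rough sketch of that last step, the snippet below shows one way to pull data out of the parsed page. It also shows one possible way to make sure the browser has actually moved to the clicked page before `page_source` is read. The helper names `summarize_page` and `source_after_click`, the 10-second timeout, and the sample HTML are all made up for illustration. The new-tab handling is only a guess at why the original page keeps coming back, not something confirmed for this site. The sketch uses `find_element(By.XPATH, ...)` because Selenium 4.3 and later removed `find_element_by_xpath`.

```python
from bs4 import BeautifulSoup


def summarize_page(html):
    """Return the page title and every (text, href) link found in `html`."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.get_text(strip=True) if soup.title else None
    links = [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)  # skip anchors without href
    ]
    return {"title": title, "links": links}


def source_after_click(driver, xpath, timeout=10):
    """Click the element at `xpath`, wait for navigation, return the HTML."""
    # Imported here so summarize_page() can be tried without Selenium.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    old_url = driver.current_url
    old_handles = set(driver.window_handles)
    driver.find_element(By.XPATH, xpath).click()

    # Wait until the click either changed the URL or opened a new window/tab.
    WebDriverWait(driver, timeout).until(
        lambda d: d.current_url != old_url
        or set(d.window_handles) != old_handles
    )
    new_handles = set(driver.window_handles) - old_handles
    if new_handles:
        # Assumption: the link opened in a new tab, so switch the driver to it.
        driver.switch_to.window(new_handles.pop())
    # The new page may still be loading; a further wait on a known element
    # may be needed before this HTML is complete.
    return driver.page_source


# Stand-in HTML so the parsing step can be tried without a browser.
sample_html = """
<html><head><title>Example car</title></head>
<body><a href="/rentals/1">First car</a><a>no href</a></body></html>
"""
summary = summarize_page(sample_html)
print(summary)
```

With a live driver, the two helpers would be combined as `summarize_page(source_after_click(driver, xpath))`, where `xpath` is the same expression used in the question.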