Question

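I can't get all of the search results on this site: https://www.gasbuddy.com/home?search=67401&fuel=1 This link is one of the searches I'm having trouble with. The problem is that it only shows the first 10 results (I know, this is a common problem that has been covered in multiple threads on stackoverflow, but the solutions found elsewhere don't work here). The page's HTML appears to be generated by a JavaScript function that doesn't embed all of the results in the page. I tried using a function to access the link provided in the "More Gas Prices" button, but that doesn't produce the full results either. Is there a way to access the full list, or am I out of luck?

Here is the Python I'm using to get the information: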

import requests
from bs4 import BeautifulSoup

# Gets the prices from gasbuddy based on the zip code.
def get_prices(zip_code, store):
    search = zip_code
    # Establishes the search params to be passed to the website.
    params = {'search': search, 'fuel': 1}
    # Contacts the website and makes the search.
    r = requests.get('https://www.gasbuddy.com/home', params=params, cookies={'DISPLAYNUM': '100000000'})
    # Turns the results of the above into a Beautiful Soup object.
    soup = BeautifulSoup(r.text, 'html.parser')
    # Searches out the divs that contain the gas station information.
    results = soup.find_all('div', {'class': 'styles__stationListItem___xKFP_'})

Answer 1

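Use selenium. It takes a bit of work to set up, but it sounds like exactly what you need.

Here I used it to click a website's "show more" button. You can see more in my actual project.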

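from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

url = 'https://www.gofundme.com/discover'
# Selenium 4 takes the chromedriver path through a Service object.
driver = webdriver.Chrome(service=Service('C:/webdriver/chromedriver.exe'))
driver.get(url)
# Click every link whose visible text is exactly 'Show all categories'.
for elem in driver.find_elements(By.LINK_TEXT, 'Show all categories'):
    try:
        elem.click()
        print('Successful click')
    except Exception:
        print('Unsuccessful click')

# Grab the HTML after the extra content has been revealed.
source = driver.page_source

driver.close()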

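So basically, you need to find the name of the element that has to be clicked to show more information, or you need to use the webdriver to scroll down the page.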
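
For the scrolling option, here is a minimal sketch. It assumes the page loads more results as you reach the bottom, which may not be true for every site. The helper name `scroll_until_stable` and its `pause` and `max_rounds` parameters are made up for illustration; the only driver call it relies on is `execute_script`, which a selenium webdriver provides.

```python
import time


def scroll_until_stable(driver, pause=1.0, max_rounds=50):
    """Scroll to the bottom repeatedly until the page height stops growing.

    `driver` is expected to be a selenium webdriver (anything that offers
    execute_script works). Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        # Jump to the current bottom of the page.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Give the page's JavaScript time to append more content.
        time.sleep(pause)
        rounds += 1
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            # Nothing new was loaded, so assume the end of the list was reached.
            break
        last_height = new_height
    return rounds
```

You would call it after `driver.get(url)` and before reading `driver.page_source`. The `max_rounds` cap is there so a page that never stops growing can't keep the loop running forever, and `pause` may need to be longer on a slow connection.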
