Web scraping Google results with the "past 24 hours" filter?

Time: 2021-06-07 23:16:49

Tags: python dataframe web-scraping data-science data-analysis

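I am trying to scrape Google results from the past 24 hours, but I cannot work out the URL that applies that filter. Can someone help me?

What is the correct URL?

The code I am using: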

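from urllib.parse import quote_plus

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

# Start Chrome; the original snippet used `driver` without creating it (Selenium 4 style assumed)
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

# Query to obtain links
query = 'VLI Logística'
links = []  # Initiate empty list to capture final results

# Specify number of pages on Google search; each page contains 10 links
n_pages = 40
for page in range(1, n_pages + 1):
    # URL-encode the query so the space and the accented character survive
    url = "http://www.google.com/search?q=" + quote_plus(query) + "&start=" + str((page - 1) * 10)
    driver.get(url)
    soup = BeautifulSoup(driver.page_source, 'html.parser')

    # Collect the first link inside each result container
    search = soup.find_all('div', class_="yuRUbf")
    for h in search:
        if h.a is not None:
            links.append(h.a.get('href'))

driver.quit()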
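
On the URL itself: Google's search pages appear to accept a `tbs=qdr:d` query parameter for "past 24 hours" (with `qdr:h`, `qdr:w`, `qdr:m`, `qdr:y` for hour, week, month, year). This parameter is not officially documented, so treat the following as a minimal sketch under that assumption. The helper name `build_search_url` and its `period` argument are hypothetical and introduced only for illustration.

```python
from urllib.parse import urlencode


def build_search_url(query, page=1, period="d"):
    """Build a Google search URL restricted to a recent time window.

    Assumption: Google honors the undocumented `tbs=qdr:<period>` parameter,
    where period is "h", "d", "w", "m" or "y".
    """
    params = {
        "q": query,                 # search terms, URL-encoded by urlencode
        "start": (page - 1) * 10,   # result offset: 10 results per page
        "tbs": f"qdr:{period}",     # time filter ("d" = past 24 hours)
    }
    # safe=":" keeps the colon in "qdr:d" readable instead of "%3A"
    return "https://www.google.com/search?" + urlencode(params, safe=":")


print(build_search_url("VLI Logística"))
```

Inside the loop above, `url = build_search_url(query, page)` would replace the hand-built string. Whether Google serves the filtered page to an automated browser may vary.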

0 answers:

No answers yet