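I'm trying to scrape Google results from the past 24 hours, but I can't get the correct URL with the filter applied. Can anyone help me?
What is the correct URL?
The code I'm using: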
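from urllib.parse import quote_plus

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

# Start Chrome (assumes Selenium 4, where the driver path is passed via Service)
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

# Query to obtain links
query = 'VLI Logística'
links = []  # Empty list to collect the final results

# Number of Google result pages to fetch; each page contains 10 links
n_pages = 40
for page in range(1, n_pages + 1):
    # URL-encode the query so the space and the accented character are sent correctly
    url = "http://www.google.com/search?q=" + quote_plus(query) + "&start=" + str((page - 1) * 10)
    driver.get(url)
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    # soup = BeautifulSoup(r.text, 'html.parser')
    search = soup.find_all('div', class_="yuRUbf")
    for h in search:
        links.append(h.a.get('href'))

driver.quit()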
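
For the past-24-hours filter itself, here is a minimal sketch of how the URL could be built. It assumes the widely reported `tbs=qdr:d` query parameter, which is not officially documented by Google and may change, so treat it as a guess to verify in the browser. The helper name `build_search_url` and its `recent` argument are hypothetical and only for illustration.

```python
from urllib.parse import urlencode


def build_search_url(query, page, recent="d"):
    """Build a Google search URL for a 1-based results page.

    Assumption: tbs=qdr:<recent> restricts results by age, with "d"
    commonly reported to mean "past 24 hours". This parameter is not
    officially documented.
    """
    params = {
        "q": query,                 # urlencode handles spaces and accents
        "start": (page - 1) * 10,   # 10 results per page
        "tbs": f"qdr:{recent}",     # hypothetical time filter value
    }
    # safe=":" keeps the colon in "qdr:d" unescaped for readability
    return "https://www.google.com/search?" + urlencode(params, safe=":")


print(build_search_url("VLI Logística", 1))
```

The returned string can be passed to `driver.get(...)` in place of the hand-concatenated `url` in the loop. If the filter does not take effect, comparing it with the address bar after applying Tools > Past 24 hours manually should show which parameter Google currently uses.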