Question

I am scraping Google search results. However, I keep getting a SyntaxError when I do so. Here is the code:

import urllib.request
from bs4 import BeautifulSoup
user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/70.0'

url = "https://www.google.com/search?hl=en&q=python+wikipedia"
headers={'User-Agent':user_agent,} 

request=urllib.request.Request(url,None,headers) #The assembled request
response = urllib.request.urlopen(request)
data = response.read()

soup= BeautifulSoup(data, 'html.parser')
l = soup.find_all('h' , 'attrs' = {"class":'LC20lb'})
print(l)

I get:

SyntaxError: keyword can't be an expression

on the line l = soup.find_all('h' , 'attrs' = {"class":'LC20lb'}).

Can someone tell me what I am doing wrong?

Answer 1

There should be no quotes around attrs. A keyword argument name must be a bare identifier, not a string:

l = soup.find_all('h' ,   attrs  = {"class":'LC20lb'})
# not:                   _     _
#l = soup.find_all('h' , 'attrs' = {"class":'LC20lb'})    
#                        ^     ^
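
To see why the quoted form fails before any request is sent, here is a minimal sketch that needs neither network access nor BeautifulSoup. The function find_all_stub is a hypothetical stand-in, not part of bs4; it only mimics the (name, attrs) shape of the call. The exact wording of the SyntaxError message varies between Python versions, so the sketch checks only the exception type.

```python
def find_all_stub(name, attrs=None):
    """Hypothetical stand-in that mimics the (name, attrs) shape of find_all."""
    return (name, attrs)


# Keyword names are bare identifiers, so this parses and runs.
ok = find_all_stub('h', attrs={"class": "LC20lb"})

# Quoting the keyword puts a string literal on the left of '=',
# which Python rejects at compile time, before any code runs.
bad_source = "find_all_stub('h', 'attrs'={'class': 'LC20lb'})"
try:
    compile(bad_source, "<example>", "eval")
    rejected = False
except SyntaxError:
    rejected = True

# If the keyword name really is held in a string, unpack a dict instead.
via_unpack = find_all_stub('h', **{"attrs": {"class": "LC20lb"}})

print(ok, rejected, via_unpack)
```

Because the error is raised while the file is being compiled, none of the earlier lines (the request, the parsing) ever execute, which is why the traceback points straight at the find_all line.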

Answer 2

import urllib.request
from bs4 import BeautifulSoup
user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/70.0'

url = "https://www.google.com/search?hl=en&q=python+wikipedia"
headers={'User-Agent':user_agent,}

request=urllib.request.Request(url,None,headers) #The assembled request
response = urllib.request.urlopen(request)
data = response.read()

soup= BeautifulSoup(data, 'html.parser')
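# Pass the attribute filter positionally: the second positional
# argument of find_all is attrs, so no keyword is needed.
l = soup.find_all('h', {"class": 'LC20lb'})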
print(l)

SyntaxError when scraping Google with BeautifulSoup
