SyntaxError when scraping Google with BeautifulSoup

Time: 2019-10-29 06:16:25

Tags: python python-3.x beautifulsoup

I am scraping Google search results, but every time I run the script I get a SyntaxError. Here is the code:

import urllib.request
from bs4 import BeautifulSoup
user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/70.0'

url = "https://www.google.com/search?hl=en&q=python+wikipedia"
headers={'User-Agent':user_agent,} 

request=urllib.request.Request(url,None,headers) #The assembled request
response = urllib.request.urlopen(request)
data = response.read()

soup= BeautifulSoup(data, 'html.parser')
l = soup.find_all('h' , 'attrs' = {"class":'LC20lb'})
print(l)

I get:

SyntaxError: keyword can't be an expression

on the line l = soup.find_all('h' , 'attrs' = {"class":'LC20lb'}). Can someone tell me what I am doing wrong?

2 answers:

Answer 0 (score: 1)

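There should be no quotes around attrs. It is the name of a keyword argument, so it must be written as a bare identifier, not as a string literal: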

l = soup.find_all('h' ,   attrs  = {"class":'LC20lb'})
# not:                   _     _
#l = soup.find_all('h' , 'attrs' = {"class":'LC20lb'})    
#                        ^     ^
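
To see why the quoted form is rejected before any code runs, here is a minimal sketch using only the standard library. The names parses and find_all_stub are hypothetical helpers made up for this illustration and are not part of BeautifulSoup. The exact SyntaxError wording differs between Python versions, so the sketch only checks whether the source parses.

```python
import ast


def parses(source):
    """Return True if `source` is syntactically valid Python, else False."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# A keyword argument name must be a bare identifier; a string literal on
# the left of '=' inside a call is a syntax error.
bad = "f('h', 'attrs' = {'class': 'LC20lb'})"
good = "f('h', attrs = {'class': 'LC20lb'})"

print(parses(bad))
print(parses(good))


# Hypothetical stand-in for a function that accepts an `attrs` keyword.
def find_all_stub(name, attrs=None):
    return name, attrs


# If the keyword name really has to come from a string, unpack a dict.
result = find_all_stub('h', **{'attrs': {'class': 'LC20lb'}})
print(result)
```

Because the failure happens at parse time, none of the script runs, including the network request above the offending line.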

Answer 1 (score: 1)

import urllib.request
from bs4 import BeautifulSoup
user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/70.0'

url = "https://www.google.com/search?hl=en&q=python+wikipedia"
headers={'User-Agent':user_agent,}

request=urllib.request.Request(url,None,headers) #The assembled request
response = urllib.request.urlopen(request)
data = response.read()

soup= BeautifulSoup(data, 'html.parser')
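# A dict passed as the second positional argument is treated as attrs,
# so this is equivalent to find_all('h', attrs={"class": 'LC20lb'}).
l = soup.find_all('h',  {"class":'LC20lb'})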
print(l)
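
The different ways of passing the class filter can be compared offline, without contacting Google. The following is a small sketch under stated assumptions: the HTML snippet is invented for illustration, and the tag name h3 is an assumption (the code above uses 'h'), chosen only so that the snippet contains matching elements.

```python
from bs4 import BeautifulSoup

# Invented sample markup; real Google result pages look different and
# their class names change over time.
html = (
    '<div>'
    '<h3 class="LC20lb">First</h3>'
    '<h3 class="other">Second</h3>'
    '<h3 class="LC20lb">Third</h3>'
    '</div>'
)

soup = BeautifulSoup(html, 'html.parser')

# Three equivalent spellings of the same filter.
by_keyword = soup.find_all('h3', attrs={'class': 'LC20lb'})
by_position = soup.find_all('h3', {'class': 'LC20lb'})
by_class_ = soup.find_all('h3', class_='LC20lb')

# Extract the visible text of each matching tag.
titles = [tag.get_text() for tag in by_keyword]
print(titles)
```

Testing the selector against a fixed string like this separates syntax and selector problems from network or markup changes on the live page.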