I got the following HTML from a URL:
<h4>
\r\n \r\n\r\n
<a href="/l">
\r\n <!-- mp_trans_rt_start id="1" args="as" 1 -->\r\n <span class="brandWrapTitle">\r\n <span class="productdescriptionbrand">Mxxx</span>\r\n </span>\r\n <span class="nameWrapTitle">\r\n <span class="productdescriptionname">Axxxname</span>\r\n </span>\r\n <!-- mp_trans_rt_end 1 -->\r\n
</a>
\r\n\r\n
</h4>
I am trying to find an element by its class name using Python:
import urllib.request
from bs4 import BeautifulSoup
url = "https://link"
user_agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
urlwithagent = urllib.request.Request(url,headers={'User-Agent': user_agent})
response = urllib.request.urlopen(urlwithagent)
soup = response.read()
product = soup.find("h4", attrs ={"class=": "productdescriptionname"})
print (product)
Everything works fine until this line:
product = soup.find("h4", attrs ={"class=": "productdescriptionname"})
which raises an error like this:
find() takes no keyword arguments
I don't know how to fix it. There is a lot of information around, but none of it helped :/
Answer 0 (score: 3)
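Before calling find, you need to convert the response into a BeautifulSoup object. Otherwise soup is just the raw data returned by response.read(), so the call goes to the built-in string method (str.find, or bytes.find for bytes), which accepts no keyword arguments.
For example: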
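# Parse the raw response first so that find() is BeautifulSoup's method
soup = BeautifulSoup(response.read(), "html.parser")
# In the HTML above the class is on a <span> inside the <h4>, not on the <h4> itself
product = soup.find("span", attrs={"class": "productdescriptionname"})
print(product)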
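The difference between the two find methods can be checked without any network access. The following is only a minimal sketch: the byte string raw is a hypothetical, whitespace-stripped stand-in for what response.read() returned in the question, and the variable names are made up for illustration. It shows that find on raw bytes rejects keyword arguments, that filtering the h4 by class matches nothing in this markup, and that the nested span carries the class.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for response.read(); urlopen(...).read() returns bytes.
raw = (
    b'<h4><a href="/l">'
    b'<span class="brandWrapTitle">'
    b'<span class="productdescriptionbrand">Mxxx</span></span>'
    b'<span class="nameWrapTitle">'
    b'<span class="productdescriptionname">Axxxname</span></span>'
    b'</a></h4>'
)

# On raw bytes, .find is bytes.find, which only accepts positional arguments,
# so passing attrs=... raises TypeError.
try:
    raw.find(b"h4", attrs={"class": "productdescriptionname"})
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# After parsing, .find is BeautifulSoup's tag search.
soup = BeautifulSoup(raw, "html.parser")

# The <h4> has no class attribute, so filtering it by class finds nothing.
heading = soup.find("h4", attrs={"class": "productdescriptionname"})
print(heading)  # None

# The class sits on a nested <span>; class_ is bs4's keyword for the class attribute.
name_tag = soup.find("span", class_="productdescriptionname")
product_name = name_tag.get_text(strip=True)
print(product_name)  # Axxxname
```

If the live page differs from the snippet in the question, the tag name or class passed to find would need to be adjusted accordingly.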