I've just started a Python web course and I'm trying to parse HTML data with BeautifulSoup, but I've run into an error. I've searched around but couldn't find a precise, definitive solution. Here is the code that triggers it:
import requests
from bs4 import BeautifulSoup
request = requests.get("http://www.johnlewis.com/toms-berkley-slipper-grey/p3061099")
content = request.content
soup = BeautifulSoup(content, 'html.parser')
element = soup.find(" span", {"itemprop ": "price ", "class": "now-price"})
string_price = (element.text.strip())
print(int(string_price))
# <span itemprop="price" class="now-price"> £40.00 </span>
Any help would be greatly appreciated.
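Answer 0 (score: 2)
The problem is the extra space characters in the tag name, the attribute name, and the attribute value, so find() matches nothing and returns None. Replace: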
element = soup.find(" span", {"itemprop ": "price ", "class": "now-price"})
with:
element = soup.find("span", {"itemprop": "price", "class": "now-price"})
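After that, two more things need fixing when converting the string: strip the leading £ character, and use float() instead of int(), because int() cannot parse a string such as "40.00".
Fixed version: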
element = soup.find("span", {"itemprop": "price", "class": "now-price"})
string_price = (element.get_text(strip=True).lstrip("£"))
print(float(string_price))
You will see 40.0 printed.
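To check the fix without hitting the live site, here is a minimal offline sketch. It assumes the page markup matches the span shown in the comment at the end of the question's code, which may no longer be true of the real page. It also shows that find() returns None when nothing matches, so calling .text on the result raises AttributeError; that is presumably the error the question ran into.

```python
from bs4 import BeautifulSoup

# Markup copied from the comment in the question; no network request is made.
html = '<span itemprop="price" class="now-price"> £40.00 </span>'
soup = BeautifulSoup(html, "html.parser")

# With stray spaces in the tag and attribute names nothing matches,
# so find() returns None rather than raising.
broken = soup.find(" span", {"itemprop ": "price ", "class": "now-price"})
print(broken)

# Without the spaces the span is found.
fixed = soup.find("span", {"itemprop": "price", "class": "now-price"})
if fixed is None:
    raise ValueError("price element not found")

# Strip surrounding whitespace, drop the currency symbol, then convert.
price = float(fixed.get_text(strip=True).lstrip("£"))
print(price)
```

Checking for None before touching the element keeps a changed page layout from surfacing as a confusing AttributeError further down.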
Answer 1 (score: 0)
You can also try using a CSS selector:
import requests
from bs4 import BeautifulSoup
request = requests.get("http://www.johnlewis.com/toms-berkley-slipper-grey/p3061099")
content = request.content
soup = BeautifulSoup(content, 'html.parser')
# print soup
element = soup.select("div p.price span.now-price")[0]
print(element)
string_price = (element.text.strip())
print(int(float(string_price[1:])))
Output:
<span class="now-price" itemprop="price">
£40.00
</span>
40
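Slicing with string_price[1:] assumes the first character is always the currency symbol. As a rough alternative, the sketch below pulls the number out with a regular expression and uses Decimal, which avoids float rounding for money. The helper name parse_price is hypothetical, and treating commas as thousands separators is an assumption that does not hold for every locale.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Return the first decimal number in text as a Decimal (hypothetical helper)."""
    # Assumption: commas are thousands separators, so they are dropped.
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    if match is None:
        raise ValueError("no number found in %r" % text)
    return Decimal(match.group())


print(parse_price(" £40.00 "))
print(int(parse_price("£40.00")))
```

The text passed in would be element.text from either answer; the helper does not care where the currency symbol or whitespace sits.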