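I am trying to extract information about this product, but the availability inside this <div> tag is not being extracted. Instead, only the nbsp (whitespace) is extracted, even though the same code works for other tags. This is the extraction code (using BeautifulSoup):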
try:
    s4 = soup.find(id='availability').find_all(text=True, recursive=False)
    print(s4)
except:
    print('no availability')
Here is the URL of the page: https://www.amazon.in/Redmi-4-Gold-16-GB/dp/B01LYX8UPN
...after running the code, instead of 'In stock' I only get [\n \n].
Answer 0 (score: 1)
You have to check first that the match is not None, something like this:
import requests as tt
from bs4 import BeautifulSoup

# Fetch the product page
get_url = tt.get("https://www.amazon.in/Redmi-4-Gold-16-GB/dp/B01LYX8UPN")
# Parse the response body with the built-in HTML parser
soup = BeautifulSoup(get_url.text, "html.parser")
# Locate the availability div; find() returns None when it is missing
sub_class = soup.find("div", {"id": "availability"})
if sub_class:
    # The status text sits in a nested span, so look it up and strip whitespace
    span = sub_class.find("span", {"class": "a-size-medium a-color-success"})
    if span:
        print("availability : {}".format(span.text.strip()))
Output:
availability : In stock.
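
Why the question's code returns only newlines can be reproduced offline. The sketch below is a minimal illustration, not the real page: SAMPLE_HTML is a hypothetical, simplified stand-in for the availability markup, and the helper names direct_text_nodes and availability_text are made up for this example. It assumes the status text sits inside a nested <span>, as the answer above implies.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for the availability markup.
# This is an assumption for illustration, not the real page source.
SAMPLE_HTML = """
<div id="availability">
<span class="a-size-medium a-color-success">
    In stock.
</span>
</div>
"""


def direct_text_nodes(soup):
    """Return only the text nodes that are direct children of the div."""
    div = soup.find(id="availability")
    if div is None:
        return []
    # recursive=False skips the nested <span>, so only the whitespace
    # surrounding it is returned.
    return div.find_all(string=True, recursive=False)


def availability_text(soup):
    """Return all text inside the div, nested tags included, or None."""
    div = soup.find(id="availability")
    if div is None:
        return None
    # get_text() walks every descendant; strip=True drops the padding.
    return div.get_text(" ", strip=True) or None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
print(direct_text_nodes(soup))
print(availability_text(soup))
```

With this sample markup, the first call yields only the newline strings around the span, while the second yields the status text. Using get_text() here, rather than matching the span by its class names, is just one option; it may hold up better if the class names on the page change.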