Question

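I am trying to extract information about this product, but the availability inside this <div> tag is not being extracted. Only the nbsp (whitespace) comes back, even though the same code works for other tags. Here is the extraction code (using BeautifulSoup):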

try:
    s4 = soup.find(id='availability').find_all(text=True, recursive=False)
    print(s4)
except:
    print('no availability')

Here is the URL of the page: https://www.amazon.in/Redmi-4-Gold-16-GB/dp/B01LYX8UPN

...after running the code I only get [\n\n] instead of 'In stock'.

Answer 1

You have to check first that the match is not None, something like this:

import requests
from bs4 import BeautifulSoup

# Fetch the product page and parse it with the built-in HTML parser
response = requests.get("https://www.amazon.in/Redmi-4-Gold-16-GB/dp/B01LYX8UPN")
soup = BeautifulSoup(response.text, "html.parser")

# find() returns None when nothing matches, so check before going further
availability = soup.find("div", {"id": "availability"})
if availability:
    # The text sits in a nested <span>, not directly inside the <div>
    span = availability.find("span", {"class": "a-size-medium a-color-success"})
    if span:
        print("availability : {}".format(span.text.strip()))

Output:

availability : In stock.
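
To see why the code in the question returns only whitespace, here is a minimal offline sketch. The SAMPLE_HTML string is a hypothetical, simplified stand-in for the page's markup (the real page is larger and may differ); it only assumes that the availability text is wrapped in a <span> inside the <div>, as the answer's code suggests. With recursive=False, find_all looks at the direct children of the <div> only, and those are just the newlines around the <span>.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup; an assumption for illustration only.
SAMPLE_HTML = """
<div id="availability">
<span class="a-size-medium a-color-success">
    In stock.
</span>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
div = soup.find(id="availability")

# Direct child strings only: just the whitespace around the <span>.
direct_text = div.find_all(string=True, recursive=False)

# Text from all descendants, with surrounding whitespace stripped.
full_text = div.get_text(" ", strip=True)

print(direct_text)
print(full_text)
```

On this sample, the first call yields only newline strings, while get_text reaches the text inside the nested <span>. Using get_text on the <div> is an alternative to looking up the <span> by class, and may be less sensitive to class-name changes.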

How to extract the availability (in stock) from this tag on the Amazon website in Python?
