Problem finding an HTML element with BeautifulSoup

Date: 2020-07-09 18:22:30

Tags: python beautifulsoup

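I have the following page: https://www.medline.com/sku/item/MDPMDS89740KDMB?skuIndex=S1&question=&flowType=browse&indexCount=1 I want to extract the manufacturer name given in the table. I wrote the code below to get it, but the path seems to be incorrect. I'm not sure how to get only one specific td value from the table.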

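import requests
from bs4 import BeautifulSoup

def name(url):
    source_code = requests.get(url)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text, 'html.parser')
    table = soup.find("table", {"class": "medSKUTableDetails"})
    # find() returns only the first <td> in the table (None if the table is missing)
    mnf = table.find('td') if table else None
    if mnf:
        print(mnf.string)
    else:
        print("issue")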
    

1 answer:

Answer 0 (score: 3)

You can search for the <td> tag that contains the text "Manufacturer", and then find the next <td>, which holds the manufacturer name.

For example:

import requests
from bs4 import BeautifulSoup


url = 'https://www.medline.com/sku/item/MDPMDS89740KDMB?skuIndex=S1&question=&flowType=browse&indexCount=1'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

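# Select the first <td> whose text contains "Manufacturer".
# soupsieve 2.1+ deprecates :contains in favor of :-soup-contains.
manufacturer = soup.select_one('td:-soup-contains("Manufacturer")')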
if manufacturer:
    manufacturer = manufacturer.find_next('td').text
else:
    manufacturer = 'Not Found'

print(manufacturer)

Prints:

Medline
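
The same label-then-next-cell idea can be tried without a network request. The following is only a minimal sketch: the SAMPLE_HTML markup and the value_for_label helper are hypothetical stand-ins made up for illustration, and the real page's table may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real product page (an assumption).
SAMPLE_HTML = """
<table class="medSKUTableDetails">
  <tr><td>Manufacturer</td><td>Medline</td></tr>
  <tr><td>Country of Origin</td><td>Example</td></tr>
</table>
"""


def value_for_label(html, label):
    """Return the text of the <td> that follows the <td> containing `label`, or None."""
    soup = BeautifulSoup(html, "html.parser")
    for cell in soup.find_all("td"):
        if label in cell.get_text():
            # find_next() walks forward in document order to the next <td>.
            value_cell = cell.find_next("td")
            return value_cell.get_text(strip=True) if value_cell else None
    return None


print(value_for_label(SAMPLE_HTML, "Manufacturer"))
```

Returning None when the label is missing lets the caller decide how to report the failure, instead of mixing a placeholder such as 'Not Found' into real values.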