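I am trying to scrape information from inside the <button data-availability-id="8W1VZ0Q60RBW" .......>
tag, assuming I can locate it by its known data-availability-id. I need the values of data-title and data-button-title, that is, the laptop's title and its current configuration, both of which come from the same button tag. How can I get them with BeautifulSoup?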
<button data-m="{"cN":"Sku","pid":"8N4K86D4J006/HB3R/8W1VZ0Q60RBW","id":"nn1c27c1c3c1m1r2a2","sN":1,"aN":"c27c1c3c1m1r2a2"}" class="c-select-button cli_sku-select-button" name="sku" aria-pressed="false" data-js-selected-text="Intel i5, 256GB SSD Selected" data-js-unselected-text="Intel i5, 256GB SSD Not selected" data-bundleonly="false" data-comingsoon="false" data-notavailable="false" data-sku-id="HB3R" data-availability-id="8W1VZ0Q60RBW" data-title="Huawei Matebook X Pro 53010CBS Laptop" data-button-title="Intel i5, 256GB SSD" data-purchasable="true" data-repurchasable="true" data-trial="false" data-delivery-overlay="" data-inventory-sku-id="QF9-01635" data-inventoried="true" data-usecart="true" data-purchase-method="cart" data-device-serial-number="" data-preorder="false" data-preorder-release-date="" data-cta-display-type="Default" data-max-order-quantity="" data-imageuri="https://img-prod-cms-rt-microsoft-com.akamaized.net/cms/api/am/imageFileData/RE267fU?ver=0ea4&m=6&w=72&h=72&n=t&q=60&o=f&l=t&b=white" data-imagealtext="No Data Available" data-show-findinstore="true" data-in-stock="" data-list-price="1349.1" data-rt-price="1499" data-formatted-list-price="CAD $1,349.10" data-bi-dnt="" data-bi-mto="" aria-checked="false">
Intel i5, 256GB SSD
<span class="cli_sku_price_acc x-hidden" aria-hidden="false">CAD $1,349.10</span>
Answer 0 (score: 1)
Once you have a BeautifulSoup object, just find the button and read the attributes you need:
# soup is an existing BeautifulSoup object built from the page's HTML
button = soup.find('button', {'data-availability-id': '8W1VZ0Q60RBW'})
button['data-title']
# 'Huawei Matebook X Pro 53010CBS Laptop'
button['data-button-title']
# 'Intel i5, 256GB SSD'
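For completeness, here is a minimal self-contained sketch of the same lookup. It assumes the page HTML is already available as a string; the `HTML` constant is a trimmed copy of the button from the question (most attributes omitted), and the helper name `get_sku_info` is hypothetical, introduced only for illustration. Unlike the snippet above, it returns `None` when no button matches, rather than failing on the subscript.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the button from the question; most attributes are omitted.
HTML = """
<button class="c-select-button cli_sku-select-button" name="sku"
        data-sku-id="HB3R" data-availability-id="8W1VZ0Q60RBW"
        data-title="Huawei Matebook X Pro 53010CBS Laptop"
        data-button-title="Intel i5, 256GB SSD">
  Intel i5, 256GB SSD
  <span class="cli_sku_price_acc x-hidden" aria-hidden="false">CAD $1,349.10</span>
</button>
"""


def get_sku_info(html, availability_id):
    """Hypothetical helper: return (data-title, data-button-title) or None."""
    # "html.parser" is Python's built-in parser, so no extra parser is needed.
    soup = BeautifulSoup(html, "html.parser")
    button = soup.find("button", attrs={"data-availability-id": availability_id})
    if button is None:
        # No button carries this availability ID.
        return None
    # Tag.get() returns None for a missing attribute instead of raising KeyError.
    return button.get("data-title"), button.get("data-button-title")


print(get_sku_info(HTML, "8W1VZ0Q60RBW"))
```

If you prefer CSS selectors, `soup.select_one('button[data-availability-id="8W1VZ0Q60RBW"]')` should locate the same element.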