Question

How can I extract the value "1.00 TK = 779.8" from the HTML below?

I tried the following code, but it did not work:

import requests
from bs4 import BeautifulSoup

page = requests.get(<url>).text

# The page contains this HTML:
# <span _ngcontent-his-c101="" id="driveValue" class="ng-binding ng-scope"> 1.00 TK = 779.8<span _ngcontent-his-c101="">Disk Drive Value</span>(DDV) </span>

soup = BeautifulSoup(page, 'html.parser')
print(soup.find(id='driveValue').find_next(text=True).strip())

Error:

 AttributeError: 'NoneType' object has no attribute 'find_next'
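
The error means soup.find(id='driveValue') returned None, i.e. the element is not in the HTML that was parsed. A minimal sketch of a guarded lookup follows; the helper name extract_drive_value and the "unrendered" markup are made up for illustration, the latter standing in for what a server might return before any JavaScript has run:

```python
from bs4 import BeautifulSoup


def extract_drive_value(html):
    """Return the first text node inside #driveValue, or None if the element is absent."""
    soup = BeautifulSoup(html, "html.parser")
    span = soup.find(id="driveValue")
    if span is None:
        # The element is not in this HTML at all (for example, it is added later by JavaScript).
        return None
    first_text = span.find(string=True)
    return first_text.strip() if first_text else None


rendered = '''<span id="driveValue" class="ng-binding ng-scope"> 1.00 TK = 779.8<span>Disk Drive Value</span>(DDV) </span>'''
unrendered = "<html><body><app-root></app-root></body></html>"  # hypothetical pre-JavaScript markup

print(extract_drive_value(rendered))    # 1.00 TK = 779.8
print(extract_drive_value(unrendered))  # None
```

If the second case is what requests actually receives, no parser change will help; the page has to be rendered first.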

Answer 1

Use find_next(), which returns the first match:

from bs4 import BeautifulSoup

html = '''<span _ngcontent-his-c101="" id="driveValue" class="ng-binding ng-scope"> 1.00 TK = 779.8<span _ngcontent-his-c101="">Disk Drive Value</span>(DDV) </span>'''

soup = BeautifulSoup(html, 'html.parser')
print(soup.find(id='driveValue').find_next(text=True).strip())

Output:

1.00 TK = 779.8
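
If the numbers are needed rather than the raw string, one possible follow-up is to collect only the text nodes that are direct children of the span and parse the first one. This is a sketch, not part of the original answer; the regular expression assumes the text always has the shape "amount unit = rate":

```python
import re

from bs4 import BeautifulSoup, NavigableString

html = '''<span _ngcontent-his-c101="" id="driveValue" class="ng-binding ng-scope"> 1.00 TK = 779.8<span _ngcontent-his-c101="">Disk Drive Value</span>(DDV) </span>'''

soup = BeautifulSoup(html, "html.parser")
span = soup.find(id="driveValue")

# Keep only text that belongs to the outer span itself, skipping the nested <span>.
direct_text = [str(node).strip() for node in span.children if isinstance(node, NavigableString)]
print(direct_text)  # ['1.00 TK = 779.8', '(DDV)']

# Assumed format: "<amount> <unit> = <rate>"
match = re.search(r"([\d.]+)\s*(\w+)\s*=\s*([\d.]+)", direct_text[0])
amount, unit, rate = float(match.group(1)), match.group(2), float(match.group(3))
print(amount, unit, rate)  # 1.0 TK 779.8
```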

Edit: using Selenium, which runs the page's JavaScript in a real browser before the element is read:

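from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from time import sleep

URL = "https://www.westernunion.com/us/en/web/send-money/start?SrcCode=12345&ReceiveCountry=IN&SendAmount=100&ISOCurrency=CNY&FundsOut=BA&FundsIn=CreditCard"

# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(r"C:\path\to\chromedriver.exe"))
driver.get(URL)
sleep(10)  # crude wait for the JavaScript-rendered content to load

# The rendered HTML is also available to BeautifulSoup if needed
soup = BeautifulSoup(driver.page_source, "html.parser")

# find_element_by_css_selector() was removed in Selenium 4; use find_element(By.CSS_SELECTOR, ...)
price = driver.find_element(By.CSS_SELECTOR, "span.ng-binding.ng-scope").text
print(price)

driver.quit()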

Output:

1.00 USD = 73.9375 Indian Rupee (INR)
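
A fixed sleep either wastes time or is too short on a slow connection. As a hedged alternative, Selenium's explicit waits can poll until the element is visible. The helper name wait_for_text and the 15-second default are assumptions for illustration, not part of the original answer:

```python
def wait_for_text(driver, css_selector, timeout=15):
    """Return the text of the first element matching css_selector once it is visible."""
    # Imported here so the helper can be defined even where Selenium is not installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # until() polls the condition and raises TimeoutException if it never becomes true.
    element = WebDriverWait(driver, timeout).until(
        EC.visibility_of_element_located((By.CSS_SELECTOR, css_selector))
    )
    return element.text.strip()


# Usage (needs a live browser session, so it is left commented out):
# price = wait_for_text(driver, "span.ng-binding.ng-scope")
```

Note that an element can be visible before its text has been filled in, so a stricter condition may be needed on some pages.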

Answer 2

Hope this helps.

from lxml import etree

txt = '''<span _ngcontent-his-c101="" id="driveValue" class="ng-binding ng-scope"> 1.00 TK = 779.8<span _ngcontent-his-c101="">Disk Drive Value</span>(DDV) </span>'''

# etree.fromstring() is an XML parser; it works here because the snippet is well-formed
root = etree.fromstring(txt)
for span in root.xpath('//span[contains(@class, "ng-binding ng-scope")]'):
    # .text is the text before the first child element
    print(span.text.strip())

Output:

1.00 TK = 779.8
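
Real pages are rarely well-formed XML, so a variant with lxml's HTML parser may be more forgiving. This is a sketch under the assumption that the element keeps its id; selecting by id avoids depending on the exact order of the class names:

```python
from lxml import html

txt = '''<span _ngcontent-his-c101="" id="driveValue" class="ng-binding ng-scope"> 1.00 TK = 779.8<span _ngcontent-his-c101="">Disk Drive Value</span>(DDV) </span>'''

root = html.fromstring(txt)

# text() selects only the text nodes that are direct children of the matched span.
texts = root.xpath('//span[@id="driveValue"]/text()')
value = texts[0].strip() if texts else None
print(value)  # 1.00 TK = 779.8
```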

Extracting a value when web scraping with Python Beautiful Soup
