How can I extract the value "1.00 TK = 779.8" from the HTML below?
I tried the following code, but it did not work:
import requests
from bs4 import BeautifulSoup
page = requests.get(<url>).text
# The page contains this HTML:
# <span _ngcontent-his-c101="" id="driveValue" class="ng-binding ng-scope"> 1.00 TK = 779.8<span _ngcontent-his-c101="">Disk Drive Value</span>(DDV) </span>
soup = BeautifulSoup(page, 'html.parser')
print(soup.find(id='driveValue').find_next(text=True).strip())
Error:
AttributeError: 'NoneType' object has no attribute 'find_next'
Answer 0 (score: 0)
Use find_next(text=True), which returns the first text node that follows the matched tag:
from bs4 import BeautifulSoup
html = '''<span _ngcontent-his-c101="" id="driveValue" class="ng-binding ng-scope"> 1.00 TK = 779.8<span _ngcontent-his-c101="">Disk Drive Value</span>(DDV) </span>'''
soup = BeautifulSoup(html, 'html.parser')
print(soup.find(id='driveValue').find_next(text=True).strip())
Output:
1.00 TK = 779.8
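The AttributeError in the question means that find(id='driveValue') returned None, i.e. the HTML that was parsed did not contain that element. Below is a minimal sketch of a guarded lookup on the same snippet. The helper name first_text is hypothetical and not part of BeautifulSoup, and string=True is used as the newer spelling of text=True.

```python
from bs4 import BeautifulSoup

HTML = '''<span _ngcontent-his-c101="" id="driveValue" class="ng-binding ng-scope"> 1.00 TK = 779.8<span _ngcontent-his-c101="">Disk Drive Value</span>(DDV) </span>'''


def first_text(html, element_id):
    """Return the first text node after the element with this id, or None.

    first_text is a hypothetical helper written for this example.
    """
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find(id=element_id)
    if tag is None:
        # find() returns None when nothing matches; calling find_next()
        # on it is what raises the AttributeError from the question.
        return None
    node = tag.find_next(string=True)
    return node.strip() if node is not None else None


print(first_text(HTML, "driveValue"))
print(first_text("<p>no span here</p>", "driveValue"))
```

If the second call's result (None) is what you get for the real page, the element is probably not in the downloaded HTML at all, and no change to the lookup itself will help.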
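Edit: using Selenium, since the page most likely builds this element with JavaScript, in which case the HTML returned by requests does not contain it: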
from time import sleep
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
URL = "https://www.westernunion.com/us/en/web/send-money/start?SrcCode=12345&ReceiveCountry=IN&SendAmount=100&ISOCurrency=CNY&FundsOut=BA&FundsIn=CreditCard"
# Selenium 4 style: the driver path goes through Service, and
# find_element(By...) replaces the removed find_element_by_* helpers.
driver = webdriver.Chrome(service=Service(r"C:\path\to\chromedriver.exe"))
driver.get(URL)
# Crude fixed wait to give the page's JavaScript time to render
sleep(10)
price = driver.find_element(By.CSS_SELECTOR, "span.ng-binding.ng-scope").text
print(price)
driver.quit()
Output:
1.00 USD = 73.9375 Indian Rupee (INR)
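If the rate is needed as a number rather than as a string, the extracted text can be split with a regular expression. This is only a sketch: parse_rate and its pattern are hypothetical, and the pattern assumes the "amount unit = rate" shape seen in the two outputs in this thread.

```python
import re

# Assumed shape: "<amount> <unit> = <rate> [anything else]"
RATE_PATTERN = re.compile(r"\s*([\d.]+)\s+(\w+)\s*=\s*([\d.]+)")


def parse_rate(text):
    """Split the extracted text into (amount, unit, rate), or return None.

    parse_rate is a hypothetical helper, not part of any library.
    """
    match = RATE_PATTERN.match(text)
    if match is None:
        return None
    amount, unit, rate = match.groups()
    return float(amount), unit, float(rate)


print(parse_rate("1.00 TK = 779.8"))
print(parse_rate("1.00 USD = 73.9375 Indian Rupee (INR)"))
```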
Answer 1 (score: -2)
Hope this helps.
from lxml import etree
txt = '''<span _ngcontent-his-c101="" id="driveValue" class="ng-binding ng-scope"> 1.00 TK = 779.8<span _ngcontent-his-c101="">Disk Drive Value</span>(DDV) </span>'''
root = etree.fromstring(txt)
# Match spans whose class attribute contains "ng-binding ng-scope"
for td in root.xpath('//span[contains(@class, "ng-binding ng-scope")]'):
    # .text is only the text before the element's first child
    print(td.text)
Printed output:
1.00 TK = 779.8
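The reason .text yields only the rate is that, in the ElementTree data model, an element's text is whatever comes before its first child, while text that follows a child is stored in that child's tail. Here is a small sketch of this on the same snippet. It uses the standard library's xml.etree.ElementTree rather than lxml (lxml.etree follows the same text/tail model), and it only works because this snippet happens to be well-formed XML.

```python
import xml.etree.ElementTree as ET

txt = '''<span _ngcontent-his-c101="" id="driveValue" class="ng-binding ng-scope"> 1.00 TK = 779.8<span _ngcontent-his-c101="">Disk Drive Value</span>(DDV) </span>'''

root = ET.fromstring(txt)   # the outer <span>
child = root[0]             # the nested <span>

leading = root.text.strip()     # text before the nested span
label = child.text              # text inside the nested span
trailing = child.tail.strip()   # text after the nested span

print(leading)
print(label)
print(trailing)
```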