I wrote the following code to extract a price from a web page:
from urllib.request import urlopen
from bs4 import BeautifulSoup
url = "https://www.teleborsa.it/azioni/intesa-sanpaolo-isp-it0000072618-SVQwMDAwMDcyNjE4"
html = urlopen(url)
soup = BeautifulSoup(html,'lxml')
prize = soup.select('.h-price')
print(prize)
The output is:
<span class="h-price fc0" id="ctl00_phContents_ctlHeader_lblPrice">1,384</span>
I want to extract just the value 1,384.
Answer 0 (score: 0)
Try this:
document.getElementById("ctl00_phContents_ctlHeader_lblPrice").innerText
Alternatively, if the elements are generated dynamically, you can iterate over each element and read its innerText.
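The snippet above is JavaScript and runs in a browser, while the question uses Python. As a rough sketch of the same idea in BeautifulSoup: `find(id=...)` looks up a tag by its id, and `select()` returns a list of matches that can be iterated. The HTML string here is just the tag copied from the question, and `html.parser` is used instead of `lxml` only to avoid an extra dependency; both are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Static copy of the tag shown in the question, so this runs without network access.
html = '<span class="h-price fc0" id="ctl00_phContents_ctlHeader_lblPrice">1,384</span>'
soup = BeautifulSoup(html, "html.parser")

# Look up a single tag by its id and read its text.
tag = soup.find(id="ctl00_phContents_ctlHeader_lblPrice")
price_text = tag.get_text(strip=True)

# select() returns a list of all matches; iterate to collect the text of each one.
all_prices = [el.get_text(strip=True) for el in soup.select(".h-price")]

print(price_text)
print(all_prices)
```

If the page fills in the price with JavaScript after loading, the tag may be empty or missing in the HTML that `urlopen` returns, so it is worth checking that `tag` is not `None` before reading its text.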
Answer 1 (score: 0)
You can use the .text property to get the text you want.
For example:
from urllib.request import urlopen
from bs4 import BeautifulSoup
url = "https://www.teleborsa.it/azioni/intesa-sanpaolo-isp-it0000072618-SVQwMDAwMDcyNjE4"
html = urlopen(url)
soup = BeautifulSoup(html,'lxml')
prize = soup.select_one('.h-price') # <- change to .select_one() to get only one element
print(prize.text) # <- use the .text property to get text of the tag
Prints:
1,384
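Note that the result is still a string. If you need a number, a small conversion step may help. The sketch below assumes the site uses Italian number formatting (comma as decimal separator, dot as thousands separator), which I have not verified for this page; `parse_price` is a hypothetical helper name, not part of BeautifulSoup.

```python
from decimal import Decimal


def parse_price(text: str) -> Decimal:
    """Hypothetical helper: convert text such as '1,384' to a Decimal.

    Assumes Italian formatting: '.' groups thousands, ',' marks decimals.
    """
    cleaned = text.strip().replace(".", "").replace(",", ".")
    return Decimal(cleaned)


price = parse_price("1,384")
print(price)
```

Decimal is used rather than float so the price keeps exactly the digits shown on the page.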