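I am trying to scrape information such as the asset class, category, and risk potential (the number, not the image shown on the page) from here. After the page has loaded, the source looks like this: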
<div data-ng-class="{layerLinkRight : data.isPriceYieldSecLayerLink, wraper : isETF, secYieldDataWrapper : !data.isLayer && data.codeIsLayer && isETF}" class="wraper">
<!-- ngIf: !data.isLayer --><span data-ng-if="!data.isLayer" data-ng-bind-html="data.value" data-ng-class="{sceIsLayer : isETF, arrange : isMutualFund, arrangeSec : isETF}" class="ng-scope ng-binding sceIsLayer arrangeSec">Asset class</span><!-- end ngIf: !data.isLayer -->
<!-- ngIf: data.isLayer -->
<!-- ngIf: !data.codeIsLayer --><span data-ng-if="!data.codeIsLayer" data-ng-class="{sceIsLayer : isETF}" data-ng-bind-html="data.codeValue" class="ng-scope ng-binding sceIsLayer"></span><!-- end ngIf: !data.codeIsLayer -->
<!-- ngIf: data.codeIsLayer -->
</div>
I started with the basics by fetching the images, and then tried to capture the "Asset class" value with the following code:
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import os
url="https://investor.vanguard.com/etf/profile/BNDW/"
#web_r=requests.get(url)
#web_soup=BeautifulSoup(web_r.text,'html.parser')
#<img src=''/
driver = webdriver.Firefox()#executable_path=r'/home/suraj/Documents/python-virtual-environments/stock_analysis/Files/geckodriver.exe')
driver.get(url)#"http://www.chrisburkard.com/")
html=driver.execute_script("return document.documentElement.outerHTML")
sel_soup=BeautifulSoup(html,'html.parser')
print(len(sel_soup.findAll("img")))
#Getting sample Images
images=[]
for i in sel_soup.findAll("img"):
    #print(i)
    #print(dir(i))
    src = i["src"]
    images.append(src)
print(images)
asset_class=[]
#Getting Asset Class
for i in sel_soup.findAll("span data-ng-if"):
    a_class=i["data.codeValue"]
    asset_class.append(a_class)
print(asset_class)
The output for asset_class is empty.
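For reference, here is a minimal sketch of how the spans in the snippet above could be matched. It runs on a reduced copy of the quoted markup rather than on the live page, so it only illustrates the selector. The names `labels`, `values` and `no_match` are mine. As far as I can tell, `findAll("span data-ng-if")` looks for a tag literally named `span data-ng-if`, and `data.codeValue` is the value of the `data-ng-bind-html` attribute, not an attribute name. Note that the `data.codeValue` span is empty in the quoted source, so the actual value may be rendered elsewhere on the page.

```python
from bs4 import BeautifulSoup

# Reduced copy of the rendered markup quoted in the question.
html = """
<div class="wraper">
<span data-ng-if="!data.isLayer" data-ng-bind-html="data.value" class="ng-scope ng-binding sceIsLayer arrangeSec">Asset class</span>
<span data-ng-if="!data.codeIsLayer" data-ng-bind-html="data.codeValue" class="ng-scope ng-binding sceIsLayer"></span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Pass the tag name and the attribute filter separately.
label_spans = soup.find_all("span", attrs={"data-ng-bind-html": "data.value"})
value_spans = soup.find_all("span", attrs={"data-ng-bind-html": "data.codeValue"})

# The text lives inside the span, not in an attribute.
labels = [s.get_text(strip=True) for s in label_spans]
values = [s.get_text(strip=True) for s in value_spans]

# The call from the question treats the whole string as a tag name.
no_match = soup.find_all("span data-ng-if")

print(labels)
print(values)
print(no_match)
```

Against the real page, the same filters would be applied to `sel_soup`, assuming the Angular content has finished rendering before the HTML is read.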