Selenium returns an empty string from an Angular website

Time: 2017-07-13 22:37:14

Tags: angularjs selenium web-scraping phantomjs python-3.5

I am trying to get the birthday shown at http://stats.nba.com/player/#!/76124/career/ (where it says 'Born'), but the birthday is generated dynamically, so BeautifulSoup cannot get it.

[Image: the NBA.com player profile that I'm scraping]

I am now trying Selenium, and this is the code I am using:

driver.get(url)
sleep(5)
e = driver.find_elements_by_class_name('player-stats__stat-value')
for a in e:
    print(a.get_attribute('innerHTML'))
driver.close()

This prints empty lines. If I go to Inspect -> Network -> XHR, this is what the HTML looks like in the response:

<span class="player-stats__stat-value" itemprop="birthDate">{{ playerInfo.BIRTHDATE | date:'M/d/yy' }}</span>

Can Selenium return the actual value of {{ playerInfo.BIRTHDATE | date:'M/d/yy' }}, and if so, how?

2 answers:

Answer 0 (score: 0)

Here is the answer to your question:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Configure Chrome before starting the browser
options = Options()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
driver = webdriver.Chrome(chrome_options=options, executable_path="C:\\Utility\\BrowserDrivers\\chromedriver.exe")

# Let element lookups poll for up to 15 seconds before giving up
driver.implicitly_wait(15)
driver.get('http://stats.nba.com/player/#!/76124/career/')

# Target the birth date span inside the player summary block
e = driver.find_elements_by_xpath("//div[@class='summary']//span[@class='player-stats__stat-value' and @itemprop='birthDate']")
for a in e:
    print(a.get_attribute('innerHTML'))
driver.close()

Let me know if this answers your question.
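If the span is located before Angular has filled it in, the value can still come back empty or as the raw {{ ... }} template. The following is only a sketch of one way to guard against that, not part of the answer above: it polls a caller-supplied function until the text looks rendered. The names `is_rendered`, `wait_for_rendered` and `read_text` are hypothetical helpers introduced for illustration; nothing here is a Selenium API.

```python
import re
import time

# Matches an AngularJS interpolation placeholder such as
# "{{ playerInfo.BIRTHDATE | date:'M/d/yy' }}".
TEMPLATE_RE = re.compile(r"\{\{.*?\}\}")


def is_rendered(text):
    """Return True when text is non-empty and holds no unrendered {{ ... }} placeholder."""
    stripped = text.strip()
    return bool(stripped) and TEMPLATE_RE.search(stripped) is None


def wait_for_rendered(read_text, timeout=15.0, poll=0.5):
    """Call read_text() repeatedly until it yields rendered text or the timeout elapses.

    read_text is a hypothetical zero-argument callable supplied by the caller.
    Returns the stripped text, or None if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        text = read_text()
        if text is not None and is_rendered(text):
            return text.strip()
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)
```

In the script above, `read_text` could wrap the lookup of the birthDate span and its `get_attribute('innerHTML')` call, so the value is printed only once it no longer looks like a template.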

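Answer 1 (score: -1)

I have 3 tabs, each containing a grid (p-table). getText() returns blank, while getAttribute("innerHTML") returns the value of the corresponding cell from the grid in an adjacent tab.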
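A likely explanation, offered as an assumption rather than a diagnosis of that page: Selenium's text lookup only reports text that is rendered as visible, so a cell in an inactive tab comes back blank, while DOM properties such as innerHTML or textContent are readable regardless of visibility. A minimal sketch in Python follows; `element_text` and `only_displayed` are hypothetical helper names, and the element objects are assumed to expose the WebElement members `.text`, `.get_attribute()` and `.is_displayed()`.

```python
def element_text(element):
    """Return the element's text, falling back to textContent when .text is empty.

    element is assumed to be a Selenium WebElement, or any object exposing
    .text and .get_attribute(name).
    """
    text = element.text
    if text and text.strip():
        return text.strip()
    # .text reports only text rendered as visible; textContent is the raw
    # DOM text, so it is still available for hidden elements.
    raw = element.get_attribute("textContent")
    return raw.strip() if raw else ""


def only_displayed(elements):
    """Keep the elements that report themselves as displayed, e.g. those in the active tab."""
    return [el for el in elements if el.is_displayed()]
```

Filtering with `only_displayed` before reading values is one way to avoid picking up a matching cell from a hidden tab when the same locator matches elements in several tabs.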