Question

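Let me start by saying that I have looked at several solutions on this site, but none of them seem to work for me.

I am simply trying to access the contents of a div tag on this page: https://play.spotify.com/chart/3S3GshZPn5WzysgDvfTywr, but the contents are not showing up.

Here is my code so far: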

from bs4 import BeautifulSoup
import urllib2  # Python 2

# browser is an already-created selenium WebDriver instance
SpotifyGlobViralurl = 'https://play.spotify.com/chart/3S3GshZPn5WzysgDvfTywr'
browser.get(SpotifyGlobViralurl)
page = browser.page_source
soup = BeautifulSoup(page)
# the div contents live in an iframe, so fetch the src of the iframe at index 3:
iframexx = soup.find_all('iframe')
response = urllib2.urlopen(iframexx[3].attrs['src'])
iframe_soup = BeautifulSoup(response)
divcontents = iframe_soup.find('div', id='main-container')

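I am trying to pull out the contents of the 'main-container' div, but as you will see, it comes back empty when stored in the divcontents variable. However, if you visit the actual URL and inspect the elements, you will find that this 'main-container' div is fully populated.

I would appreciate any help.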

Answer 1

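That is because the container is loaded dynamically: fetching the iframe's src with urllib2 returns only the static HTML, without running the JavaScript that fills the div. I noticed you are using selenium; you have to keep using it, switch to the iframe, and wait for main-container to load: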

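from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# wait up to 10 seconds for each condition below
wait = WebDriverWait(browser, 10)

# wait for iframe to become visible
iframe = wait.until(EC.visibility_of_element_located((By.XPATH, "//iframe[starts-with(@id, 'browse-app-spotify:app:chart:')]")))
browser.switch_to.frame(iframe)

# wait for header in the container to appear
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#main-container #header")))

# find_element(By.ID, ...) is equivalent to the older find_element_by_id
container = browser.find_element(By.ID, "main-container")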
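
Once the wait has succeeded, the rendered markup can be taken from the driver (for example with container.get_attribute("outerHTML")) and parsed offline. The following is only a minimal sketch of that last step, not part of the original answer. It is written for Python 3 and uses the standard library's html.parser so that it runs without a browser. The names DivTextExtractor and container_text and the sample markup are made up for illustration. With BeautifulSoup, the equivalent lookup would be BeautifulSoup(html, "html.parser").find("div", id="main-container").

```python
from html.parser import HTMLParser


class DivTextExtractor(HTMLParser):
    """Collect the text found inside the <div> with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0    # div nesting depth inside the target; 0 means outside
        self.chunks = []  # non-empty text fragments found inside the target

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            # a nested div inside the target
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            # entering the target div itself
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def container_text(rendered_html, target_id="main-container"):
    """Return the text fragments inside the div whose id is target_id."""
    parser = DivTextExtractor(target_id)
    parser.feed(rendered_html)
    parser.close()
    return parser.chunks


# Made-up markup standing in for the HTML selenium returns after the wait,
# e.g. container.get_attribute("outerHTML").
sample = (
    '<div id="main-container">'
    '<div id="header">Chart</div>'
    '<ul><li>Track A</li><li>Track B</li></ul>'
    '</div><div id="footer">ignored</div>'
)
print(container_text(sample))
```

Fed the statically fetched iframe source from the question instead, the same function would return an empty list, since the div has no children until the page's scripts have run.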

BeautifulSoup in Python - DIV contents not showing
