Question

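I am trying to scrape data from this website: https://www.dailyfx.com/sentiment. For example, I want to know what percentage of clients are long EUR/USD, but I cannot get the text inside the span tag.

I am trying to get the 61% from this element: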

<span class="bullish-color jsdfx-sentiment-long" style="font-size: 15px;">61%</span>

'''

import bs4, requests

dailyfxSite = 'https://www.dailyfx.com/sentiment'

res = requests.get(dailyfxSite)
res.raise_for_status()
soup = bs4.BeautifulSoup(res.text, 'html.parser')

span = soup.find("span", class_="bullish-color jsdfx-sentiment-long")
print(span)

'''

This is what I get back:

<span class="bullish-color jsdfx-sentiment-long" style="font-size:15px;"> </span>

I get everything except the 61% that I need.

Answer 1

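The problem seems to be how the data is produced on the site. The site appears to load it with JavaScript (i.e. the content of the span element is filled in dynamically by a script). requests only downloads the HTML the server sends and does not execute JavaScript, so it never sees values that a script inserts afterwards.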
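A quick way to confirm this diagnosis is to compare the two snippets from the question: the span as requests returned it and the span as the browser shows it. The sketch below does that with BeautifulSoup; the helper name `span_text` is made up for illustration, and the two strings are simply copied from the question rather than fetched live.

```python
from bs4 import BeautifulSoup

# Markup copied from the question: what requests returned vs. what the browser displays.
served_html = '<span class="bullish-color jsdfx-sentiment-long" style="font-size:15px;"> </span>'
rendered_html = '<span class="bullish-color jsdfx-sentiment-long" style="font-size: 15px;">61%</span>'


def span_text(html):
    """Return the stripped text of the sentiment span, or None if the span is absent."""
    soup = BeautifulSoup(html, "html.parser")
    span = soup.select_one("span.bullish-color.jsdfx-sentiment-long")
    return span.get_text(strip=True) if span else None


print(repr(span_text(served_html)))    # '' : the element exists but is still empty
print(repr(span_text(rendered_html)))  # '61%'
```

An empty string (rather than None) means the selector is fine and the element is present; only its content is missing, which points to script-side rendering rather than a wrong class name.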

I suggest opening the site with Selenium:

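from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

url = "https://www.dailyfx.com/sentiment"
# Selenium 4 takes the driver path through a Service object;
# the executable_path argument was deprecated and later removed.
browser = webdriver.Chrome(service=Service("/usr/local/bin/chromedriver"))
try:
    browser.get(url)
    # page_source reflects the DOM after the browser has run the page's JavaScript
    soup = BeautifulSoup(browser.page_source, features="html.parser")
finally:
    browser.quit()  # close the browser even if loading fails
a = soup.find("span", {"class": "bullish-color"})
print(a.text)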

Output:

61%

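You may need a different way to create browser (the approach shown above works on macOS given some custom configuration). Look up how to create a browser with Selenium on your platform.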

Answer 2

Try doing it with a CSS selector.

from bs4 import BeautifulSoup

html='''<span class="bullish-color jsdfx-sentiment-long" style="font-size: 15px;">61%</span>'''
soup=BeautifulSoup(html,'html.parser')
print(soup.select_one("span.bullish-color.jsdfx-sentiment-long").text)

Output:

61%
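Since the goal in the question is to know what percentage of clients are long, the extracted text usually has to become a number. Here is a minimal sketch using only the standard library; `parse_percent` is a hypothetical helper name, and returning None for blank or non-numeric text is an assumption about how the empty span should be handled.

```python
def parse_percent(text):
    """Convert a string such as '61%' to a float; return None if it holds no number."""
    # Accept both the ASCII and the full-width percent sign.
    cleaned = text.strip().rstrip("%％").strip()
    if not cleaned:
        return None  # e.g. the blank span returned before JavaScript runs
    try:
        return float(cleaned)
    except ValueError:
        return None


print(parse_percent("61%"))  # 61.0
print(parse_percent(" "))    # None
```

Returning None instead of raising makes it easy to tell "the page has not rendered yet" apart from a real value.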
