Question

I am trying to scrape the contents of a table with the following code:

import requests
from bs4 import BeautifulSoup

url = 'https://www.eleconomista.es/indices-mundiales/'
r = requests.get(url)
data = r.text
soup = BeautifulSoup(data, "html5lib")
table = soup.find('table', {'class': 'table tableFlex table-striped footable footable-1 breakpoint breakpoint-xs'})
print(table)

The output is:

None

But I want to print the "Europa" table.

I would like to understand why I am not getting the expected output, and how to solve this kind of problem in the future.

Answer 1

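You see None because the page relies heavily on JavaScript, which changes the classes of the tags dynamically: the classes you see in the browser differ from the ones in the HTML that requests receives. As a result, the string 'table tableFlex table-striped footable footable-1 breakpoint breakpoint-xs' matches nothing. You can try this script to capture some data (the <tr> tags appear only in the data tables, so you can select them directly):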

import requests
from bs4 import BeautifulSoup

url = 'https://www.eleconomista.es/indices-mundiales/'
r = requests.get(url)
data = r.text
soup = BeautifulSoup(data, "html5lib")

# Collect the cell text of every row; rows without <td> cells are skipped.
rows = []
for tr in soup.select('tr'):
    row = [td.get_text(strip=True) for td in tr.select('td')]
    if row:
        rows.append(row)

# Print each row with every cell left-aligned and padded to 20 characters.
for row in rows:
    print(''.join('{: <20}'.format(d) for d in row))

Prints:

IBEX 35             9.170,50                                -0,60%              -55,20              9.225,70            19/07               
BEL 20              3.652,42                                +0,88%              +31,71              3.620,71            19/07               
DAX                 12.260,07                               +0,26%              +32,22              12.227,85           19/07               
CAC 40              5.552,34                                +0,03%              +1,79               5.550,55            19/07               
FTSE 100            7.508,70                                +0,21%              +15,61              7.493,09            19/07               
PSI 20              5.202,23                                -0,35%              -18,36              5.220,59            19/07               
EURO STOXX 50®      3.480,18                                -0,08%              -2,65               3.482,83            19/07               
ECO10               125,89                                  +0,03%              +0,04               125,85              19/07               
FTSE MIB INDEX      22.209,75                               +0,90%              +197,99             22.011,76           27/03               
DOW JONES           27.154,20                               -0,25%              -68,77              27.222,97           19/07               
NASDAQ 100          7.834,90                                -0,88%              -69,24              7.904,13            19/07               
S P 500             2.976,61                                -0,62%              -18,50              2.995,11            19/07               
NASDAQ COMPOSITE    8.146,49                                -0,74%              -60,75              8.207,24            19/07               
NIKKEI 225          21.456,10                               +1,95%              +410,38             21.045,72           19/07               
IPC MEXICO          41.606,54                               -0,03%              -11,57              42.551,54           19/07               
Merval              40.161,60                               -1,45%              -591,15             41.451,31           19/07               
IPSA                3.625,61                                +0,23%              +8,35               3.624,20            19/07               
LIMA INDICE GENERAL 20.845,29                               -0,36%              -74,36              20.839,30           19/07               
IGBC                13.762,88                               -0,21%              -29,61              13.792,49           4/06
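
For similar cases in the future, one way to check for this kind of mismatch is to list the classes that the tables actually carry in the HTML returned by requests and compare them with what the browser's inspector shows. The following is only a minimal sketch: SAMPLE_HTML is a hypothetical stand-in for r.text (the real page's markup may differ), table_classes is an illustrative helper name, and "html.parser" is used so the sketch runs without html5lib.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for r.text; the markup of the real page may differ.
SAMPLE_HTML = """
<table class="table tableFlex table-striped">
  <tr><td>IBEX 35</td><td>9.170,50</td></tr>
</table>
"""


def table_classes(html):
    """Return the list of classes of every <table> found in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    # "class" is a multi-valued attribute, so BeautifulSoup returns it as a list.
    return [table.get("class", []) for table in soup.find_all("table")]


if __name__ == "__main__":
    for classes in table_classes(SAMPLE_HTML):
        print(classes)
```

If the classes printed here do not include the ones you copied from the browser, those classes were most likely added by JavaScript after the page loaded, and the lookup has to use the classes that are present in the raw response.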

Answer 2

The table has the class "table tableFlex table-striped", so the following will work:

soup.find('table',{'class' : 'table tableFlex table-striped'})
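
Note that passing a multi-class string to find() compares it against the exact value of the class attribute, so the lookup can break if the attribute has extra classes or a different order. A CSS selector is more tolerant, because it only requires that each listed class be present. The sketch below illustrates the difference; SAMPLE_HTML is made-up markup for demonstration, not the actual page source.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration: the table carries two extra classes.
SAMPLE_HTML = """
<table class="table tableFlex table-striped footable footable-1">
  <tr><td>IBEX 35</td><td>9.170,50</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Exact string match on the class attribute: no match here, because the
# attribute value contains more than these three classes.
exact = soup.find("table", {"class": "table tableFlex table-striped"})

# CSS selector: matches any <table> that has all three classes,
# regardless of their order or of additional classes.
by_selector = soup.select_one("table.table.tableFlex.table-striped")

# Extract the cell text of the first row of the matched table.
first_row = [td.get_text(strip=True) for td in by_selector.select("tr td")]

if __name__ == "__main__":
    print(exact)
    print(first_row)
```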

Trying to scrape the contents of a table using bs4
