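I'm trying to scrape the page https://www.bolsadeproductos.cl/pagador/20 to get the table at the bottom. With the code below, however, I only get the first 10 rows instead of all the results. How can I iterate over all the different tabs?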
from selenium import webdriver
from bs4 import BeautifulSoup
import time
import pandas as pd
driver = webdriver.Edge('C:\\Users\\facun\\Documents\\msedgedriver.exe')
driver.get('https://www.bolsadeproductos.cl/pagador/20')
df = pd.read_html(driver.page_source, attrs = {'id': 'tbl_export'})
Thanks.
Answer 0 (score: 0)
This is simple: just replace the following line in your code:
driver.get('https://www.bolsadeproductos.cl/pagador/20')
with
driver.get('https://www.bolsadeproductos.cl/pagador/tablePagador/20/undefined/0')
df = pd.read_html(driver.page_source, attrs = {'id': 'tbl_export'})
print(df)
Output:
[ Fecha Operacion Nemotecnico Vendedor Comprador Monto Tasa Plazo(Dias)
0 30-06-2015 FANGLOS LV LV $ 26.586.879 0,34% 30
1 26-06-2015 FANGLOS LV LV $ 26.574.872 0,34% 34
2 27-05-2015 FANGLOS LV LV $ 1.059.184.359 0,34% 16
3 16-06-2015 FANGLOS LV LV $ 996.461.527 0,34% 37
4 16-06-2015 FANGLOS LV LV $ 996.461.527 0,34% 37
.. ... ... ... ... ... ... ...
309 03-03-2020 FANGLOS LV LV $ 8.558.358 0,26% 13
310 06-03-2020 FANGLOS LV LV $ 8.560.581 0,26% 10
311 06-03-2020 AANGLOS LV LV $ 63.596.531 0,26% 59
312 06-03-2020 FANGLOS LV BCI $ 45.678.549 0,26% 31
313 19-05-2020 FANGLOS BCI BCI $ 849.422.583 0,22% 17
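The Monto and Tasa columns in this output are text, formatted in what looks like the Chilean convention (dots grouping thousands, a comma as the decimal mark). A minimal sketch for turning those strings into numbers, assuming that convention holds for every row; the helper names `parse_monto`, `parse_tasa` and `parse_fecha` are hypothetical and not part of pandas:

```python
from datetime import datetime


def parse_monto(text):
    """'$ 1.059.184.359' -> 1059184359 (assumes dots are thousands separators)."""
    return int(text.replace("$", "").replace(".", "").strip())


def parse_tasa(text):
    """'0,34%' -> 0.34 (assumes a decimal comma; the value stays in percent units)."""
    return float(text.replace("%", "").replace(",", ".").strip())


def parse_fecha(text):
    """'30-06-2015' -> date(2015, 6, 30) (assumes day-month-year order)."""
    return datetime.strptime(text.strip(), "%d-%m-%Y").date()
```

Since `pd.read_html` returns a list, these could be applied per column on its first element, for example `df[0]['Monto'].map(parse_monto)`, provided the columns really arrive as strings.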
Answer 1 (score: 0)
The data is loaded dynamically via JavaScript, but you can fetch the results with the requests module:
import requests
from bs4 import BeautifulSoup
url = 'https://www.bolsadeproductos.cl/pagador/tablePagador/20/undefined/0'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
# Select only rows that contain data cells, numbering them from 1
for i, row in enumerate(soup.select('tr:has(td)'), 1):
    row = [td.get_text(strip=True) for td in row.select('td')]
    print('{:<5}{:<15}{:<15}{:<10}{:<10}{:<20}{:<15}{:<15}'.format(i, *row))
Prints:
1 30-06-2015 FANGLOS LV LV $ 26.586.879 0,34% 30
2 26-06-2015 FANGLOS LV LV $ 26.574.872 0,34% 34
3 27-05-2015 FANGLOS LV LV $ 1.059.184.359 0,34% 16
4 16-06-2015 FANGLOS LV LV $ 996.461.527 0,34% 37
5 16-06-2015 FANGLOS LV LV $ 996.461.527 0,34% 37
6 27-05-2015 FANGLOS LV LV $ 1.059.184.359 0,34% 16
... all the way to:
309 23-12-2019 FANGLOS LV BCI $ 193.475.303 0,26% 56
310 03-03-2020 FANGLOS LV LV $ 8.558.358 0,26% 13
311 06-03-2020 FANGLOS LV LV $ 8.560.581 0,26% 10
312 06-03-2020 AANGLOS LV LV $ 63.596.531 0,26% 59
313 06-03-2020 FANGLOS LV BCI $ 45.678.549 0,26% 31
314 19-05-2020 FANGLOS BCI BCI $ 849.422.583 0,22% 17
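If the rows are needed as data rather than printed text, the same parsing can be collected into records. This is a rough sketch, not the answerer's code: the function names `rows_from_html` and `fetch_rows` are hypothetical, the column names are copied from the pandas output shown in the earlier answer, and it assumes every data row has the same seven cells in that order.

```python
import requests
from bs4 import BeautifulSoup

# Column names copied from the pandas output shown in the earlier answer.
COLUMNS = ["Fecha Operacion", "Nemotecnico", "Vendedor", "Comprador",
           "Monto", "Tasa", "Plazo(Dias)"]


def rows_from_html(html):
    """Return one dict per data row (rows without <td> cells are skipped)."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for tr in soup.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:
            records.append(dict(zip(COLUMNS, cells)))
    return records


def fetch_rows(url, timeout=30):
    """Download the table endpoint and parse it; raises on HTTP errors."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return rows_from_html(response.content)
```

Calling `fetch_rows(url)` with the `url` from the answer should yield a list that `pd.DataFrame(...)` can consume directly; the timeout value of 30 seconds is an arbitrary choice.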
Answer 2 (score: -1)
The site appears to use a JavaScript loader. Look into Selenium Waits to wait until the page has fully loaded.