Web scraping - I get the labels, but not the values

Time: 2019-05-23 13:22:33

Tags: python web-scraping beautifulsoup

I am trying to scrape some values from a website, but all I get are the labels. The actual values of the fields I want come back empty.

I am using requests and BeautifulSoup.


import requests
from bs4 import BeautifulSoup

request = requests.get("https://www.cofidis.pt/cofidis/cofidisredirect.aspx?Prazo=48&IDPartner=6708&Montante=10000&Seguro=0&IDOferta=20719&IDFinalidade=6&IDFinalidadeOption=100&DesignacaoFinalidade=Outros%20Projetos&origem=")
soup = BeautifulSoup(request.text, 'html.parser')    
text = soup.find(id="micro-simulador")
print(text.get_text())

But I only get the labels:


Resumo do seu pedido

Outros Projetos

Montante

Prazo

Mensalidade

TAEG

Seguro

TAN

MTIC

...

The goal is to get the values inside "micro-simulador", for example TAEG = 11.0%.

Can anyone tell me what is going wrong?

3 answers:

Answer 0: (score: 1)

The values are stored in input tags, so get_text() does not return them. You can read them from the value attribute like this:

import requests
from bs4 import BeautifulSoup

request = requests.get("https://www.cofidis.pt/cofidis/cofidisredirect.aspx?Prazo=48&IDPartner=6708&Montante=10000&Seguro=0&IDOferta=20719&IDFinalidade=6&IDFinalidadeOption=100&DesignacaoFinalidade=Outros%20Projetos&origem=")
soup = BeautifulSoup(request.text, 'html.parser')    
text = soup.find(id="micro-simulador")
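inputs = text.find_all('input')  # find_all is the current name; findAll is a legacy alias

# Each <input> keeps its data in the "value" attribute, not in its text
for input_tag in inputs:
    print(input_tag.get('id'))
    print(input_tag.get('value'))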
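
To see why get_text() prints only the labels, here is a minimal offline sketch. It uses the standard library's HTMLParser instead of BeautifulSoup so that it runs without a network connection, and the markup and ids in SAMPLE_HTML are made up for illustration; they are not copied from the real page.

```python
from html.parser import HTMLParser

# Hypothetical markup, loosely modelled on the question; NOT the real page.
SAMPLE_HTML = """
<div id="micro-simulador">
  <label>TAEG</label><input id="taeg" value="11.0%">
  <label>TAN</label><input id="tan" value="8.9%">
</div>
"""


class InputCollector(HTMLParser):
    """Collect visible text and the id/value attributes of <input> tags."""

    def __init__(self):
        super().__init__()
        self.texts = []   # what a text extraction would see
        self.values = {}  # what is stored in the input attributes

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            attributes = dict(attrs)
            self.values[attributes.get("id")] = attributes.get("value")

    def handle_data(self, data):
        if data.strip():
            self.texts.append(data.strip())


collector = InputCollector()
collector.feed(SAMPLE_HTML)
print(collector.texts)   # ['TAEG', 'TAN']
print(collector.values)  # {'taeg': '11.0%', 'tan': '8.9%'}
```

The text nodes hold only the labels, while the numbers sit in the value attributes, which is the same split the question ran into.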

Answer 1: (score: 0)

As chitown88 said, there is a JSON request that you can parse to get the same result. See the following example:

import requests

ID_PARTNER = 6708
ID_FINALIDADE = 6
PRAZO = 48
MONTANTE = 10000


def parse_request():
    url = 'https://www.cofidis.pt/Sim/wsGeralRest.svc/MontantePrazos/%s/%s' % (ID_PARTNER, ID_FINALIDADE)
    response = requests.request('GET', url)
    if response.ok:
        content = response.json()['COF_GET_MontantePrazos_RestResult']

        for montante_prazo in content['MontantesPrazos']:
            if montante_prazo['MNT'] == MONTANTE:
                montante_prazo['PM'] = list(filter(lambda v: v['PRZ'] == PRAZO, montante_prazo['PM']))
                return montante_prazo


print(parse_request())

Output:

{
  'PM': [
    {
      'PRZ': 48,
      'TAN': 8.9,
      'MES': 270.29,
      'IDO': 20719,
      'MSA': 254.64,
      'TAEG': 11.0,
      'IDE': 182587376,
      'MNS': 250.07,
      'DCM': 0.0,
      'ITO': 588,
      'PSA': 48,
      'MTI': 12243.36,
      'PRS': 48
    }
  ],
  'MNT': 10000.0,
  'DES': 'Crédito Pessoal',
  'IDP': 6708,
  'IDF': 6,
  'IDS': 1932
}
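
To get from that dictionary to the value asked for in the question, something like the following sketch might do. The summarize helper is hypothetical, and the sample is the output above trimmed to the fields used here. It also covers the case where parse_request() returns None because the request failed or no amount matched.

```python
# Sample copied from the output above, trimmed to the fields used here.
result = {
    'PM': [{'PRZ': 48, 'TAN': 8.9, 'TAEG': 11.0, 'MTI': 12243.36}],
    'MNT': 10000.0,
}


def summarize(offer):
    """Return the rates of the first matching term, or None if nothing matched."""
    if not offer or not offer.get('PM'):
        return None
    term = offer['PM'][0]
    return {'TAEG': term['TAEG'], 'TAN': term['TAN']}


summary = summarize(result)
print("TAEG = {TAEG}%, TAN = {TAN}%".format(**summary))  # TAEG = 11.0%, TAN = 8.9%
```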

Answer 2: (score: 0)

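You can get the whole JSON response from the XHR request. Which fields you actually need is up to you, since I don't understand the non-English labels.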

import requests

url = 'https://www.cofidis.pt/Sim/wsGeralRest.svc/MontantePrazos/6708/6'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36'}

jsonData = requests.get(url, headers=headers).json()
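
Once jsonData is loaded, one way to navigate it is to index every offer by amount and term. This is only a sketch: the key names ('COF_GET_MontantePrazos_RestResult', 'MontantesPrazos', 'MNT', 'PM', 'PRZ') are taken from the other answer in this thread, the helper name is hypothetical, and the sample payload is a hand-trimmed stand-in for the real response.

```python
def index_by_amount_and_term(payload):
    """Map (amount, term in months) -> the offer details for that combination."""
    result = payload.get('COF_GET_MontantePrazos_RestResult', {})
    table = {}
    for offer in result.get('MontantesPrazos', []):
        for term in offer.get('PM', []):
            table[(offer['MNT'], term['PRZ'])] = term
    return table


# Hand-trimmed stand-in for jsonData; the real response has many more entries.
sample = {'COF_GET_MontantePrazos_RestResult': {'MontantesPrazos': [
    {'MNT': 10000.0, 'PM': [{'PRZ': 48, 'TAEG': 11.0, 'TAN': 8.9}]},
]}}

table = index_by_amount_and_term(sample)
print(table[(10000.0, 48)]['TAEG'])  # 11.0
```

With the real response you would pass jsonData instead of sample and look up the amount and term you are interested in.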