Scraping data from an XHR with requests

Time: 2019-06-13 11:56:56

Tags: python ajax web-scraping request

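I want to scrape data from this website, but the results I get differ from those published on the site. For example, when I run the code with an amount of 14000 and a duration of 48 months, the TAE is 7.03, whereas the website shows 6.44. I think my parameters are set incorrectly. Can someone help me?

I have tried changing the parameters in several ways without success. I do not know how to find the correct ones.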

import requests
from bs4 import BeautifulSoup
import re
import json
import pandas as pd

#Let's first collect few auth vars
r = requests.Session()
response = r.get("https://simuladores.bancosantander.es/SantanderES/loansimulatorweb.aspx?por=webpublica&prv=publico&m=100&cta=1&ls=0#/t0")
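# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(response.content, 'html.parser')
key = soup.find_all('script', string=re.compile('Afi.AfiAuth.Init'))
pattern = r"Afi.AfiAuth.Init\((.*?)\)"

# Split the Init(...) arguments once: the second is the timestamp, the last is the signature
init_args = re.findall(pattern, key[0].string)[0].split(',')
WSSignature = init_args[-1].replace('\'', '')
WSDateTime = init_args[1].replace('\'', '')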

headers = {
    'Origin': 'https://simuladores.bancosantander.es',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36',
    'Content-Type': 'application/json;charset=UTF-8',
    'Accept': 'application/json, text/plain, */*',
    'WSSignature': WSSignature,
    'Referer': 'https://simuladores.bancosantander.es/SantanderES/loansimulatorweb.aspx?por=webpublica&prv=publico&m=100&cta=1&ls=0',
    'WSDateTime': WSDateTime,
    'WSClientCode': 'SantanderES',
}

#Those are the standard params of a request
params = {'wsInputs': {'finality': 'prestamo coche',
  'productCode': 'p100',
  'capitalOrInstallment': 5000,
  'monthsTerm': 96,
  'mothsInitialTerm': 12,
  'openingCommission': 1.5,
  'minOpeningCommission': 60,
  'financeOpeningCommission': True,
  'interestRate': 5.5,
  'interestRateReferenceIndex': 0,
  'interestRateSecondaryReferenceIndex': 0,
  'interestRateSecondaryWithoutVinculation': 6.5,
  'interestRateSecondaryWithAllVinculation': 0,
  'interestRateSecondary': 6.5,
  'loanDate': '2019-06-13',
  'birthDate': '2001-06-13',
  'financeLoanProtectionInsurance': True,
  'percentageNotaryCosts': 0.003,
  'loanCalculationMethod': 0,
  'calculationBase': 4,
  'frecuencyAmortization': 12,
  'frecuencyInterestPay': 12,
  'calendarConvention': 0,
  'taeCalculationBaseType': 4,
  'lackMode': 0,
  'amortizationCarencyMonths': 0,
  'typeAmortization': 1,
  'insuranceCostSinglePremium': 0,
  'with123': False,
  'electricVehicle': False}}
#The scraping function
def scrap(amount, duration, params):

    params['wsInputs']['capitalOrInstallment'] = amount
    params['wsInputs']['monthsTerm'] = duration
    response = r.post('https://simuladores.bancosantander.es/WS/WSSantanderTotalLoan.asmx/Calculate', headers=headers, data=json.dumps(params))
    return json.loads(response.content)['d']


Amounts = [13000]
Durations = [ 48, 60, 72, 84, 96]
results = []
for amount in Amounts:
    for duration in Durations:
        result = scrap(amount, duration, params)
        result['Amount'] = amount
        result['Duration'] = duration
        results.append(result)

df = pd.DataFrame(results)

1 answer:

Answer 0 (score: 2)

First, as @Richard said, there is nothing wrong with your code.

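The reason you get 7.03% instead of 6.44% is that the loan simulator you are using is, in a way, cheating (to look more competitive). The difference comes from whether the financed opening commission is taken into account. That is, if you set the standard parameter 'openingCommission' to 0, you get 6.45%. What about getting exactly 6.44%? A suggestion follows.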
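One way to try this without editing the shared `params` dict in place is to build a modified copy per request. The sketch below is only an illustration: `with_overrides` is a hypothetical helper, and `base_params` is a shortened stand-in for the full dict in the question.

```python
import copy

def with_overrides(params, **overrides):
    """Return a deep copy of params with some 'wsInputs' fields replaced."""
    new_params = copy.deepcopy(params)
    new_params['wsInputs'].update(overrides)
    return new_params

# Shortened stand-in for the question's params dict (assumption: same layout)
base_params = {'wsInputs': {'capitalOrInstallment': 5000,
                            'monthsTerm': 96,
                            'openingCommission': 1.5}}

no_fee_params = with_overrides(base_params,
                               capitalOrInstallment=14000,
                               monthsTerm=48,
                               openingCommission=0)
print(no_fee_params['wsInputs'])
```

The copy can then be serialized with `json.dumps` and posted exactly as in the question's `scrap` function, while `base_params` stays unchanged for the next run.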


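Explanation (using French terminology)

If I compute the TEG and the TAE for the parameters {14k€, 48 months, 330.47 €/month}, I get 6.26% and 6.44%, respectively.

However, if I do the same calculation including the 210 € financed commission, I get approximately 7.03% and 7.23%.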
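The two figures used here can be cross-checked against the question's parameters. This is a rough sketch, assuming the simulator uses the standard constant-payment (annuity) formula and adds the 1.5% opening commission to the financed capital; `annuity_payment` is a name chosen for this illustration.

```python
def annuity_payment(principal, annual_rate_pct, months):
    """Constant monthly payment of a fixed-rate loan (standard annuity formula)."""
    i = annual_rate_pct / 100 / 12  # nominal monthly rate
    return principal * i / (1 - (1 + i) ** -months)

amount = 14000
opening_fee = amount * 1.5 / 100  # 'openingCommission': 1.5 (percent of the amount)

# 'financeOpeningCommission': True, so the fee is assumed to be added to the capital
payment = annuity_payment(amount + opening_fee, 5.5, 48)  # 'interestRate': 5.5
print(opening_fee, round(payment, 2))
```

Under these assumptions the fee is 210 € and the payment should come out very close to the 330.47 €/month quoted above, which suggests the commission is indeed financed on top of the 14,000 €.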


Above (and below), i denotes the internal rate of return (IRR), that is, the rate at which equation (E1) equals zero:

(E1) [equation image not reproduced]
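A common way to write such a condition, assuming a loan of C repaid in n equal monthly payments M, is sketched below. The symbols C, M and n are chosen here for illustration and may differ from the original figure; the two annualizations are the ones that are consistent with the 6.26% and 6.44% figures quoted earlier.

```latex
% C: amount received, M: monthly payment, n: number of months, i: monthly IRR
-C + \sum_{k=1}^{n} \frac{M}{(1+i)^{k}} = 0 \tag{E1}

% Proportional (TEG) and compounded (TAE) annual rates derived from i
\mathrm{TEG} = 12\,i, \qquad \mathrm{TAE} = (1+i)^{12} - 1
```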


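This means you should consider integrating an IRR solver into your workflow and recomputing the TAE yourself from the information available (monthly payment, duration, total amount, and possibly fees).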
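A minimal sketch of such a solver is given below, using plain bisection rather than any particular library. The function names `monthly_irr` and `teg_tae` are hypothetical, and the code assumes equal monthly payments with the first one due one month after the loan is paid out.

```python
def monthly_irr(net_amount, payment, months, tol=1e-12):
    """Monthly rate i such that the discounted payments equal net_amount."""
    def npv(i):
        discounted = sum(payment / (1 + i) ** k for k in range(1, months + 1))
        return discounted - net_amount

    lo, hi = 0.0, 1.0  # assumes npv(lo) > 0 > npv(hi); npv decreases as i grows
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid) > 0:
            lo = mid  # rate too low: payments are still worth more than the loan
        else:
            hi = mid
    return (lo + hi) / 2

def teg_tae(net_amount, payment, months):
    """Proportional (TEG) and compounded (TAE) annual rates from the monthly IRR."""
    i = monthly_irr(net_amount, payment, months)
    return 12 * i, (1 + i) ** 12 - 1

teg, tae = teg_tae(14000, 330.47, 48)
print(f"TEG {teg:.2%}, TAE {tae:.2%}")
```

For {14k€, 48 months, 330.47 €/month} this should land close to the 6.26% and 6.44% given in the explanation. Fees can be modeled by lowering `net_amount` or adding them to the payments; which convention the simulator applies is not stated in the answer, so that part is left open.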