I am trying to fetch two pages from a website and turn them into PDF files, but I cannot get past the login.
I am using VBA with WinHttp.WinHttpRequest.5.1.
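GET request to the login page to obtain the response headers
Parse the headers to extract the cookie: csrftoken=vmNumUGhoTH9DKrKcHUFouOG0pm2AFJP
POST request:
.setRequestHeader "Cookie", "csrftoken=vmNumUGhoTH9DKrKcHUFouOG0pm2AFJP"
.send "username=MyUserName&password=MyPass"
But the response I get is: 403 Forbidden. CSRF verification failed. Request aborted.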
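The steps above can be sketched roughly as follows. This is a minimal sketch and not the exact code from the project: LOGIN_URL is a hypothetical placeholder (the real URL is not given in this post), ExtractCsrfToken is an illustrative helper name, and the Content-Type header is an assumption about how the form body is sent.

```vba
Option Explicit

' Hypothetical placeholder; the real login URL is not shown in this post.
Private Const LOGIN_URL As String = "https://example.invalid/login/"

' Returns the value of the csrftoken cookie found in a raw header block,
' or "" when no csrftoken cookie is present.
Private Function ExtractCsrfToken(ByVal headers As String) As String
    Const MARKER As String = "csrftoken="
    Dim startPos As Long, endPos As Long, lineEnd As Long

    startPos = InStr(1, headers, MARKER, vbTextCompare)
    If startPos = 0 Then Exit Function
    startPos = startPos + Len(MARKER)

    ' The value ends at the first ";" or at the end of the header line.
    lineEnd = InStr(startPos, headers, vbCr)
    If lineEnd = 0 Then lineEnd = Len(headers) + 1
    endPos = InStr(startPos, headers, ";")
    If endPos = 0 Or endPos > lineEnd Then endPos = lineEnd

    ExtractCsrfToken = Mid$(headers, startPos, endPos - startPos)
End Function

Public Sub TryLogin()
    Dim http As Object
    Dim token As String

    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")

    ' Step 1: GET the login page and read the csrftoken cookie from the headers.
    http.Open "GET", LOGIN_URL, False
    http.Send
    token = ExtractCsrfToken(http.GetAllResponseHeaders)

    ' Step 2: POST the credentials, sending the cookie back.
    http.Open "POST", LOGIN_URL, False
    ' Assumption: the form is submitted as URL-encoded form data.
    http.SetRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    http.SetRequestHeader "Cookie", "csrftoken=" & token
    http.Send "username=MyUserName&password=MyPass"

    ' Print the status code and text so the failure can be inspected.
    Debug.Print http.Status, http.StatusText
End Sub
```

Running TryLogin from the Immediate window prints the status of the POST, which makes it easy to compare attempts while changing headers or the request body.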
I realise something is missing, and I think it has to do with one of the scripts on the page.
When I log in with a browser and inspect the page, I can see three cookies:
Name csrftoken value vmNumUGhoTH9DKrKcHUFouOG0pm2AFJP
Name io value ugmkC7arELPf2tDPABks, Session, true, false
Name Sessionid value ".ejguge9" (plus a long string), Session, true, true
These are the headers from the initial GET request (.getAllResponseHeaders):
Server: nginx
Date: Tue, 17 Sep 2019 12:24:08 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Content-Security-Policy: default-src 'self' cdn.ravenjs.com app.getsentry.com ws: wss: www.google-analytics.com; style-src 'self' 'unsafe-inline'; font-src 'self' data:
Expires: Tue, 17 Sep 2019 12:24:08 GMT
Vary: Cookie
Last-Modified: Tue, 17 Sep 2019 12:24:08 GMT
Cache-Control: max-age=0
X-Frame-Options: SAMEORIGIN
Set-Cookie: csrftoken=vmNumUGhoTH9DKrKcHUFouOG0pm2AFJP; expires=Tue, 15-Sep-2020 12:24:08 GMT; Max-Age=31449600; Path=/
Cache-Control: no-cache,no-store,must-revalidate,private,no-transform
Pragma: no-cache
X-Content-Type-Options: nosniff
And this is the responseText (I have removed some divs):
<!DOCTYPE html>
<html lang="en" ng-csp
data-analytics-id="UA-37377084-19"
data-analytics-domain="service.gov.uk">
<head data-sentry-dsn="//a69cb33017d040eb861baa66122a3fe4@sentry.service.dsd.io/27"
data-sentry-site="[Prod] Frontend"
data-socketio-server="/socket.io">
<meta http-equiv="X-UA-Compatible" content="IE=9; IE=8; IE=7; IE=EDGE" />
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1">
<base href="/">
<title>Advice</title>
<!-- webfonts -->
<!-- main styles -->
<!-- print styles -->
<script src="/static/javascripts/vendor/raven.min.315b372f50d2.js"></script>
<script type="text/javascript" src="/static/javascripts/vendor/raven.config.1385d9f1498c.js"></script>
<script type="text/javascript" src="/static/javascripts/vendor/boomerang/boomerang.4dd8ca6c7426.js"></script>
<script type="text/javascript" src="/static/javascripts/vendor/boomerang/config.4264a336e731.js"></script>
</head>
<body class="service alpha v-Login">
<!--end header-->
<main id="wrapper">
<div class="Grid">
<div class="Grid-row cf">
<form action="" name="login_frm" method="post" autocomplete="off">
<input type='hidden' name='csrfmiddlewaretoken' value='vmNumUGhoTH9DKrKcHUFouOG0pm2AFJP' />
<input autocomplete="off" class="js-remove-readonly-onfocus" id="id_username" maxlength="254" name="username" readonly="True" type="text" />
<input autocomplete="off" class="js-remove-readonly-onfocus" id="id_password" name="password" readonly="True" type="password" />
<div class="FormActions">
<button type="submit" value="signin" name="login-submit" class="Button">Sign in</button>
</div>
</form>
</div>
</div>
</main>
<script src="/static/javascripts/lib.min.dcd886162895.js"></script>
<script src="/static/javascripts/vendor/vanilla-js.d652e4418fdd.js"></script>
<script src="/static/javascripts/cla.main.min.6f16f17218b1.js"></script>
<script src="/static/javascripts/vendor/analytics.641aebbe3fe5.js"></script>
</body>
</html>
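The login form in the markup above carries a hidden csrfmiddlewaretoken field and a login-submit button, neither of which appears in the POST body shown earlier. Whether the server requires them is not confirmed here. The following is only a sketch of how those fields could be read from responseText and added to the body. ExtractFormToken and BuildLoginBody are illustrative names, and the string search assumes the single-quoted attribute layout shown in the markup.

```vba
Option Explicit

' Returns the value of the hidden csrfmiddlewaretoken input in the login
' form, or "" when the field is not found. This is a plain string search
' that relies on the single-quoted attribute layout shown in the markup.
Public Function ExtractFormToken(ByVal html As String) As String
    Const MARKER As String = "name='csrfmiddlewaretoken' value='"
    Dim startPos As Long, endPos As Long

    startPos = InStr(1, html, MARKER, vbTextCompare)
    If startPos = 0 Then Exit Function
    startPos = startPos + Len(MARKER)

    ' The value runs up to the closing single quote.
    endPos = InStr(startPos, html, "'")
    If endPos = 0 Then Exit Function

    ExtractFormToken = Mid$(html, startPos, endPos - startPos)
End Function

' Builds a form body containing every named field of login_frm.
' Assumption: userName and password are already URL-safe; no encoding is done.
Public Function BuildLoginBody(ByVal formToken As String, _
                               ByVal userName As String, _
                               ByVal password As String) As String
    BuildLoginBody = "csrfmiddlewaretoken=" & formToken & _
                     "&username=" & userName & _
                     "&password=" & password & _
                     "&login-submit=signin"
End Function
```

A possible use, with http being the WinHttpRequest object that performed the initial GET, would be http.Send BuildLoginBody(ExtractFormToken(http.ResponseText), "MyUserName", "MyPass"). For the markup shown above, the extracted value should match the csrftoken cookie.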