I am trying to capture the HTML of https://www.target.com.au/ using cURL.
This works fine from the command line:
xxx@VirtualBox:~/workspace/$ /usr/bin/curl -L -v -k -i -H "Accept: text/html" -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36" --max-time 10 https://www.target.com.au/ > target
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 23.0.111.198...
* TCP_NODELAY set
* Connected to www.target.com.au (23.0.111.198) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
.....
This always fails when run from Java:
InputStream is = null;
try {
String command = "/usr/bin/curl -L -v -k -i -H \"Accept: text/html\" -H \"User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36\" --max-time 10 " + url;
System.out.println(command);
Process process = Runtime.getRuntime().exec(command);
is = process.getInputStream();
String body = IOUtils.toString(is, StandardCharsets.UTF_8);
return body;
} catch (IOException e) {
e.printStackTrace();
} finally {
IOUtils.closeQuietly(is);
}
Output:
/usr/bin/curl -L -v -k -i -H "Accept: text/html" -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36" --max-time 10 https://www.target.com.au/
HTTP/2 403
server: AkamaiGHost
mime-version: 1.0
content-type: text/html
content-length: 270
expires: Fri, 05 Jul 2019 GMT
date: Fri, 05 Jul 2019 00:17:29 GMT
set-cookie: akavpau_prod_maintenance_vp=1562286149~id=979d0dae2676e513c633ce4f23c24ce0; Path=/
set-cookie: bm_sz=11D295C99ED7B362AFBB7A506F5BBD77~YAAQXrEHyloISrBrAQAAjzx+vwQpr3VkKRJiHodkXRN0RKsY9mAJuPB0g9bwOPcKkRcYltVyQ/K8f5vygv9S80T59R2NDJoF1Ei/2nfEUUicPAdkhSpnWYdXZiBSv0TTqbXZeEauEzVff4OjwhhvL6sGI43knNUZbliMNZecBDuoXjDUvZD/O/JFAtxWR3lRI+T3; Domain=.target.com.au; Path=/; Expires=Fri, 05 Jul 2019 04:17:29 GMT; Max-Age=14400; HttpOnly
set-cookie: _abck=EA0253F20FADD24AE9333A50CD207788~-1~YAAQXrEHylsISrBrAQAAjzx+vwJv1VRMgxWDzHyBa1150prHaQO88ZGzl8kuNGzw3XRjLqOJaOMbT9mm5eWjT1pzMZq5WgyzPZM1+pc3n0UDCkVSZCZzon4/EXAkpbMNMeQfpHaurjCxf17U7javVptDE44op+nti7YNmdKUemKT/wMAL3RbWuPKwMRsduKFp1qyOQvK7tYOemfHd21YEFz/f1dGM+4SNqxRECJD4U+ErNQYJd93q3Mfca6QgOz1sDhvSGUNghKBCovxdjwLjaW77iZfZMI5owWID57L5Q==~-1~-1~-1; Domain=.target.com.au; Path=/; Expires=Sat, 04 Jul 2020 GMT; Max-Age=31536000; Secure
Access Denied
Access Denied
You don't have permission to access "http://www.target.com.au/" on this server. Reference #18.5eb107ca.1562285849.864a545
The URL is the same in both cases, and both run on the same machine (Ubuntu).
xxx@VirtualBox:~/workspace/$ curl --version
curl 7.58.0 (x86_64-pc-linux-gnu) libcurl/7.58.0 OpenSSL/1.1.1 zlib/1.2.11 libidn2/2.0.4 libpsl/0.19.1 (+libidn2/2.0.4) nghttp2/1.30.0 librtmp/2.3
Release-Date: 2018-01-24
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS IDN IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy PSL
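One difference between the two runs that may be worth ruling out: the Java code above passes the whole command as a single string to Runtime.exec(String). That overload splits the string on whitespace with a StringTokenizer and does not interpret quotes, so the quoted header values do not reach curl as single arguments the way they do from a shell. The sketch below is only an illustration of that point, not a confirmed fix for the 403 (the server may reject the request for other reasons). The class name CurlRunner and its methods naiveSplit, buildCommand and fetch are hypothetical names made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class, not part of the original code.
class CurlRunner {

    // Mimics how Runtime.exec(String) tokenizes a command:
    // plain whitespace splitting, with no special handling of quotes.
    static List<String> naiveSplit(String command) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(command);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // Builds the same curl invocation as one list element per argument,
    // so each header value stays intact and needs no shell-style quoting.
    static List<String> buildCommand(String url) {
        return new ArrayList<>(Arrays.asList(
                "/usr/bin/curl", "-L", "-v", "-k", "-i",
                "-H", "Accept: text/html",
                "-H", "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                        + "AppleWebKit/537.36 (KHTML, like Gecko) "
                        + "Chrome/75.0.3770.100 Safari/537.36",
                "--max-time", "10",
                url));
    }

    // Runs curl and returns its standard output decoded as UTF-8.
    static String fetch(String url) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(buildCommand(url));
        // curl -v writes its trace to stderr; inherit it so the pipe cannot fill up.
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = pb.start();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream is = process.getInputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = is.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
        process.waitFor();
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Calling CurlRunner.naiveSplit on the original command string shows the tokens curl would actually receive from Runtime.exec(String), and CurlRunner.fetch("https://www.target.com.au/") runs the list-based variant for comparison with the command-line result.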