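I found some code online to help me work with HTTP in Java, specifically this example from the Apache HttpCore tutorial site.
Interestingly, when I set the host name to www.google.com, the response is a six-line HTTP 302, meaning the page has moved.
But when I use some other random site, such as www.booya.com, I get the complete HTML page back, as I expected.
What is going on? Does Google have some kind of blocking mechanism for non-browser clients?
Here is the code: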
/*
* ====================================================================
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
* ====================================================================
*
* This software consists of voluntary contributions made by many
* individuals on behalf of the Apache Software Foundation. For more
* information on the Apache Software Foundation, please see
* <http://www.apache.org/>.
*
*/
import java.net.Socket;
import org.apache.http.ConnectionReuseStrategy;
import org.apache.http.HttpHost;
import org.apache.http.HttpResponse;
import org.apache.http.impl.DefaultBHttpClientConnection;
import org.apache.http.impl.DefaultConnectionReuseStrategy;
import org.apache.http.message.BasicHttpRequest;
import org.apache.http.protocol.HttpCoreContext;
import org.apache.http.protocol.HttpProcessor;
import org.apache.http.protocol.HttpProcessorBuilder;
import org.apache.http.protocol.HttpRequestExecutor;
import org.apache.http.protocol.RequestConnControl;
import org.apache.http.protocol.RequestContent;
import org.apache.http.protocol.RequestExpectContinue;
import org.apache.http.protocol.RequestTargetHost;
import org.apache.http.protocol.RequestUserAgent;
import org.apache.http.util.EntityUtils;
/**
 * Elemental example for executing multiple GET requests sequentially.
 */
public class ElementalHttpGet {
    public static void main(String[] args) throws Exception {
        HttpProcessor httpproc = HttpProcessorBuilder.create()
            // Required protocol interceptors
            .add(new RequestContent())
            .add(new RequestTargetHost())
            // Recommended protocol interceptors
            .add(new RequestConnControl())
            .add(new RequestUserAgent("Test/1.1"))
            // Optional protocol interceptors
            .add(new RequestExpectContinue(true)).build();
        HttpRequestExecutor httpexecutor = new HttpRequestExecutor();
        HttpCoreContext coreContext = HttpCoreContext.create();
        HttpHost host = new HttpHost("www.booya.com", 80);
        coreContext.setTargetHost(host);
        DefaultBHttpClientConnection conn = new DefaultBHttpClientConnection(8 * 1024);
        ConnectionReuseStrategy connStrategy = DefaultConnectionReuseStrategy.INSTANCE;
        try {
            String[] targets = {
                "/",
            };
            for (int i = 0; i < targets.length; i++) {
                if (!conn.isOpen()) {
                    Socket socket = new Socket(host.getHostName(), host.getPort());
                    conn.bind(socket);
                }
                BasicHttpRequest request = new BasicHttpRequest("GET", targets[i]);
                System.out.println(">> Request URI: " + request.getRequestLine().getUri());
                httpexecutor.preProcess(request, httpproc, coreContext);
                HttpResponse response = httpexecutor.execute(request, conn, coreContext);
                httpexecutor.postProcess(response, httpproc, coreContext);
                System.out.println("<< Response: " + response.getStatusLine());
                System.out.println(EntityUtils.toString(response.getEntity()));
                System.out.println("==============");
                if (!connStrategy.keepAlive(response, coreContext)) {
                    conn.close();
                } else {
                    System.out.println("Connection kept alive...");
                }
            }
        } finally {
            conn.close();
        }
    }
}
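Answer 0 (score: 1)
When something works against some servers but not others, the cause is probably how those servers are configured.
In this case, Google no longer serves its pages over plain HTTP; it serves them over HTTPS, which uses a different port (443 instead of 80). 302 is a status code (search for "HTTP status codes") that tells the client (a web browser or, in this case, your program) to try a different address.
Open your browser and enter the URL http://www.google.com, and you will see that it redirects you to https://www.google.com (or possibly a regional version).
The important lesson here is what the HTTP status codes mean, at least the most common ones: 200, 302, 401, 404, and so on.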
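As a rough illustration of reading status codes, the sketch below groups a code by its first digit, which is how HTTP organizes them. The class name `StatusClass` and the method `category` are made up for this example and are not part of HttpCore; with the code from the question, the number itself would come from `response.getStatusLine().getStatusCode()`.

```java
class StatusClass {

    // Hypothetical helper: maps a status code to its general class
    // based on the first digit (1xx to 5xx).
    static String category(int status) {
        switch (status / 100) {
            case 1: return "informational";
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            default: return "unknown";
        }
    }

    public static void main(String[] args) {
        // The codes mentioned in the answer above.
        int[] samples = {200, 302, 401, 404};
        for (int status : samples) {
            System.out.println(status + " -> " + category(status));
        }
    }
}
```

A 302 falls in the redirection class, so the six-line response from www.google.com is not an error or a block; it is an instruction to look elsewhere.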
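Answer 1 (score: 0)
From Wikipedia:
The HTTP response status code 302 Found is a common way of performing a redirect.
An HTTP response with this status code will additionally provide a URL in the Location header field. The user agent (e.g., a web browser [or, in this case, your Java program]) is invited by a response with this code to make a second, otherwise identical, request to the new URL specified in the Location field.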
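The "second request" described above can be sketched with the JDK's own `HttpURLConnection` rather than HttpCore, to keep the example self-contained. The class `RedirectSketch` and its methods `nextUrl` and `follow` are hypothetical names chosen for illustration, and the five-hop limit is an arbitrary assumption. With the HttpCore code from the question, the same inputs should be available from `response.getStatusLine().getStatusCode()` and `response.getFirstHeader("Location")`.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

class RedirectSketch {

    // Hypothetical helper: returns the URL to request next,
    // or null when the response is not a redirect.
    static String nextUrl(String requestUrl, int status, String location) {
        boolean redirect = status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
        if (!redirect || location == null) {
            return null;
        }
        // Location may be relative, so resolve it against the requested URL.
        return URI.create(requestUrl).resolve(location).toString();
    }

    // Follows at most maxHops redirects by hand and returns the last URL reached.
    static String follow(String startUrl, int maxHops) throws IOException {
        String url = startUrl;
        for (int hop = 0; hop < maxHops; hop++) {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setInstanceFollowRedirects(false); // handle 3xx responses ourselves
            int status = conn.getResponseCode();
            String next = nextUrl(url, status, conn.getHeaderField("Location"));
            conn.disconnect();
            System.out.println(status + " " + url);
            if (next == null) {
                return url;
            }
            url = next;
        }
        return url;
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // Network access happens only when a URL is passed explicitly.
            System.out.println("Final URL: " + follow(args[0], 5));
        } else {
            // Offline demonstration with sample values.
            System.out.println(
                    nextUrl("http://www.google.com/", 302, "https://www.google.com/"));
        }
    }
}
```

Running it with `http://www.google.com/` as the argument prints each hop and the status it returned, which makes the redirect chain visible. Capping the number of hops guards against redirect loops.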