My Java Socket does not work with Google.com, but works with other websites

Asked: 2014-05-13 17:49:08

Tags: java apache sockets

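I found some code online to help me work with HTTP in Java. Specifically, I found this code on the Apache HttpCore tutorial site.

Interestingly, when I set the host name to www.google.com, the response is a 6-line HTTP 302 saying that the page has moved.

But when I put in another random website, such as www.booya.com, I get a full response containing the entire HTML page, as I expected.

What is going on? Does Google have some kind of blocking mechanism for non-browser clients?

Here is the code: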

/*
 * ====================================================================
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *   http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied.  See the License for the
 * specific language governing permissions and limitations
 * under the License.
 * ====================================================================
 *
 * This software consists of voluntary contributions made by many
 * individuals on behalf of the Apache Software Foundation.  For more
 * information on the Apache Software Foundation, please see
 * <http://www.apache.org/>.
 *
 */



import java.net.Socket;

import org.apache.http.ConnectionReuseStrategy;
import org.apache.http.HttpHost;
import org.apache.http.HttpResponse;
import org.apache.http.impl.DefaultBHttpClientConnection;
import org.apache.http.impl.DefaultConnectionReuseStrategy;
import org.apache.http.message.BasicHttpRequest;
import org.apache.http.protocol.HttpCoreContext;
import org.apache.http.protocol.HttpProcessor;
import org.apache.http.protocol.HttpProcessorBuilder;
import org.apache.http.protocol.HttpRequestExecutor;
import org.apache.http.protocol.RequestConnControl;
import org.apache.http.protocol.RequestContent;
import org.apache.http.protocol.RequestExpectContinue;
import org.apache.http.protocol.RequestTargetHost;
import org.apache.http.protocol.RequestUserAgent;
import org.apache.http.util.EntityUtils;

/**
 * Elemental example for executing multiple GET requests sequentially.
 */
public class ElementalHttpGet {

    public static void main(String[] args) throws Exception {
        HttpProcessor httpproc = HttpProcessorBuilder.create()
            // Required protocol interceptors
            .add(new RequestContent())
            .add(new RequestTargetHost())
            // Recommended protocol interceptors
            .add(new RequestConnControl())
            .add(new RequestUserAgent("Test/1.1"))
            // Optional protocol interceptors
            .add(new RequestExpectContinue(true)).build();

        HttpRequestExecutor httpexecutor = new HttpRequestExecutor();

        HttpCoreContext coreContext = HttpCoreContext.create();
        HttpHost host = new HttpHost("www.booya.com", 80);
        coreContext.setTargetHost(host);

        DefaultBHttpClientConnection conn = new DefaultBHttpClientConnection(8 * 1024);
        ConnectionReuseStrategy connStrategy = DefaultConnectionReuseStrategy.INSTANCE;

        try {

            String[] targets = {
                    "/",
                    };

            for (int i = 0; i < targets.length; i++) {
                if (!conn.isOpen()) {
                    Socket socket = new Socket(host.getHostName(), host.getPort());
                    conn.bind(socket);
                }
                BasicHttpRequest request = new BasicHttpRequest("GET", targets[i]);
                System.out.println(">> Request URI: " + request.getRequestLine().getUri());

                httpexecutor.preProcess(request, httpproc, coreContext);
                HttpResponse response = httpexecutor.execute(request, conn, coreContext);
                httpexecutor.postProcess(response, httpproc, coreContext);

                System.out.println("<< Response: " + response.getStatusLine());
                System.out.println(EntityUtils.toString(response.getEntity()));
                System.out.println("==============");
                if (!connStrategy.keepAlive(response, coreContext)) {
                    conn.close();
                } else {
                    System.out.println("Connection kept alive...");
                }
            }
        } finally {
            conn.close();
        }
    }

}

2 answers:

Answer 0 (score: 1)

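When something works with some servers and not with others, the cause is probably the way those servers are configured.

In this case, Google no longer serves the page over plain HTTP; it serves it over HTTPS, which uses a different port (443 rather than 80). 302 is an HTTP status code (search for "HTTP status codes") that tells the client (a web browser or, in this case, your program) to retry the request at a different location.

Go to your browser and enter the URL http://www.google.com; you will see that it redirects you to https://www.google.com (or possibly a regional version).

The important lesson here is to learn what the HTTP status codes mean (at least the most common ones: 200, 302, 401, 404).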
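As a rough illustration of how those codes are grouped, here is a minimal sketch that maps a status code to its class by its first digit. The class `StatusCodeClasses` and its `describe` method are hypothetical names made up for this example; they are not part of HttpCore.

```java
/** Hypothetical helper (not part of HttpCore): names the class of an HTTP status code. */
final class StatusCodeClasses {

    /** Returns the conventional class name for a status code, based on its first digit. */
    static String describe(int statusCode) {
        switch (statusCode / 100) {
            case 1: return "informational";
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            default: return "unknown";
        }
    }

    public static void main(String[] args) {
        // The codes mentioned in the answer above.
        int[] codes = {200, 302, 401, 404};
        for (int code : codes) {
            System.out.println(code + " -> " + describe(code));
        }
    }
}
```

In the question's program, the code to pass in would come from `response.getStatusLine().getStatusCode()`.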

Answer 1 (score: 0)

From Wikipedia:

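The HTTP response status code 302 Found is a common way of performing a redirection.

An HTTP response with this status code will additionally provide a URL in the Location header field. The user agent (e.g. a web browser [or, in this case, your Java program]) is invited by a response with this code to make a second, otherwise identical, request to the new URL specified in the Location field.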
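The redirect step described above can be sketched in plain Java. This is only one possible approach, using hypothetical names (`RedirectTarget`, `isRedirect`, `next`): it takes the original request URI, the status code, and the value of the Location header, and works out where the second request should go. With the question's HttpCore code, the header value could presumably be read with `response.getFirstHeader("Location")`.

```java
import java.net.URI;
import java.util.Optional;

/** Hypothetical helper: works out where a redirected request should be re-sent. */
final class RedirectTarget {

    /** True for the status codes that conventionally carry a Location header to follow. */
    static boolean isRedirect(int statusCode) {
        return statusCode == 301 || statusCode == 302 || statusCode == 303
                || statusCode == 307 || statusCode == 308;
    }

    /**
     * Resolves the Location header against the original request URI.
     * Returns an empty Optional when the response is not a redirect
     * or carries no usable Location value.
     */
    static Optional<URI> next(URI requestUri, int statusCode, String location) {
        if (!isRedirect(statusCode) || location == null || location.trim().isEmpty()) {
            return Optional.empty();
        }
        // URI.resolve handles both absolute and relative Location values.
        return Optional.of(requestUri.resolve(location.trim()));
    }

    public static void main(String[] args) {
        URI original = URI.create("http://www.google.com/");
        System.out.println(next(original, 302, "https://www.google.com/"));
        System.out.println(next(original, 302, "/search"));
        System.out.println(next(original, 200, null));
    }
}
```

Note that if the resolved target uses the https scheme, the plain `java.net.Socket` on port 80 from the question is not enough for the second request; the connection would need to go through a TLS socket (for example one created by `javax.net.ssl.SSLSocketFactory`) on port 443.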