What is the correct way to parse and send HTTP requests using sockets in Java?

Asked: 2019-04-10 23:13:39

Tags: java sockets http

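I am building a basic local proxy server. The goal is to accept HTTP and HTTPS traffic from my web browser, parse it for information, forward each request to the appropriate host, read the response, and return it to the browser.

I currently have a socket open to the web browser. The HTTP and HTTPS requests I receive from the browser look like this: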

HTTP:

GET http://example.com/ HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1

HTTPS:

CONNECT example.com:443 HTTP/1.1
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0
Proxy-Connection: keep-alive
Connection: keep-alive
Host: example.com:443

I open a socket to the host named in the "Host:" header above with the following code:

public void sendRequest() throws IOException{
        Socket socket = new Socket(host, port);
        //socket.getInputStream.read();
        BufferedWriter out = new BufferedWriter(new OutputStreamWriter(socket.getOutputStream(), "UTF8"));
        BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
        for(int i = 0; i < lines.size(); i++){
            out.write(lines.get(i) + "\r\n");
        }
        out.flush();
        outputReturn(in);
    }

I read the response like this:

public void outputReturn(BufferedReader in){
        try{
            System.out.println("\n * Response");
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
        catch (IOException i){
            System.out.println(i);
        }
    }

The responses look like this:

HTTP:

* Response
HTTP/1.1 200 OK
Content-Encoding: gzip
Accept-Ranges: bytes
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Wed, 10 Apr 2019 22:53:28 GMT
Etag: "1541025663+gzip"
Expires: Wed, 17 Apr 2019 22:53:28 GMT
Last-Modified: Fri, 09 Aug 2013 23:54:35 GMT
Server: ECS (ord/4C92)
Vary: Accept-Encoding
X-Cache: HIT
Content-Length: 606

[606-byte response body, printed as several lines of unreadable binary-looking characters]

HTTPS:

CONNECT getpocket.cdn.mozilla.net:443 HTTP/1.1
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:66.0) Gecko/20100101 Firefox/66.0
Proxy-Connection: keep-alive
Connection: keep-alive
Host: getpocket.cdn.mozilla.net:443


 * Response
java.net.SocketException: Connection reset

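Questions:

Why do I receive something that looks like binary from the HTTP request?

Why do I receive nothing at all from the HTTPS request?

What should I do about it?

Thanks.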

1 Answer:

Answer 0 (score: 2)

For your HTTP request, the Content-Encoding is gzip. The binary-looking output is the gzip-compressed response body.
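To make this concrete, here is a minimal sketch of what a gzip body is and how it could be decoded if the proxy ever needed to inspect it. The class and method names (GzipBodyDemo, gzip, gunzip) are hypothetical and not part of the question's code, and the sketch assumes the body has been read as raw bytes rather than through a BufferedReader, which would corrupt it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class for illustration only.
class GzipBodyDemo {

    // Compress text the way a server does for "Content-Encoding: gzip".
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Decompress a gzip body that was kept as raw bytes.
    // Assumes the decompressed content is UTF-8 text.
    static String gunzip(byte[] body) {
        ByteArrayOutputStream text = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(body))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                text.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(text.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] body = gzip("<html>Example Domain</html>");
        // Every gzip stream starts with the magic bytes 0x1f 0x8b.
        System.out.printf("magic: %02x %02x%n", body[0] & 0xff, body[1] & 0xff);
        System.out.println(gunzip(body));
    }
}
```

Printing the compressed bytes as text, as outputReturn does, is what produces the unreadable characters; the data itself is intact as long as it is handled as bytes.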

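For the HTTPS request, you are not performing an SSL/TLS handshake, so the server drops the connection. Your code sends the plaintext CONNECT request straight to port 443, where the server expects a TLS handshake.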
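The mismatch can be seen in the first bytes on the wire. A TLS connection begins with a handshake record whose first byte is 0x16 and whose version major byte is 0x03, while the forwarded request begins with the ASCII letters of "CONNECT". The sketch below only illustrates that difference; the class name, method name, and sample bytes are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only.
class TlsSniffDemo {

    // A TLS handshake record starts with content type 22 (0x16),
    // followed by a protocol version whose major byte is 3.
    static boolean looksLikeTlsHandshake(byte[] firstBytes) {
        return firstBytes.length >= 2
                && (firstBytes[0] & 0xff) == 0x16
                && (firstBytes[1] & 0xff) == 0x03;
    }

    public static void main(String[] args) {
        // What the question's code sends to port 443.
        byte[] forwarded = "CONNECT example.com:443 HTTP/1.1\r\n"
                .getBytes(StandardCharsets.US_ASCII);
        // Illustrative start of a TLS record header (type, version, length).
        byte[] tlsRecordStart = {0x16, 0x03, 0x01, 0x02, 0x00};

        System.out.println(looksLikeTlsHandshake(forwarded));      // prints false
        System.out.println(looksLikeTlsHandshake(tlsRecordStart)); // prints true
    }
}
```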

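For HTTP, I don't think you need to do anything; the browser should handle the gzip decoding for you. There is no viable way to proxy HTTPS/SSL/TLS using the approach you describe.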
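For the browser to handle the compressed body, the proxy has to pass the response bytes along unchanged instead of reading them line by line as text. A minimal sketch of such a byte-level copy follows; RawRelayDemo and relay are hypothetical names, and the in-memory streams stand in for the server's input stream and the browser's output stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical helper class for illustration only.
class RawRelayDemo {

    // Copy bytes from one stream to another without decoding them as text.
    // Returns the number of bytes copied.
    static long relay(InputStream from, OutputStream to) {
        byte[] chunk = new byte[8192];
        long total = 0;
        try {
            int n;
            while ((n = from.read(chunk)) != -1) {
                to.write(chunk, 0, n);
                total += n;
            }
            to.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for a response body containing non-text bytes.
        byte[] response = {0x1f, (byte) 0x8b, 0x08, 0x00, (byte) 0xff};
        ByteArrayOutputStream toBrowser = new ByteArrayOutputStream();
        long copied = relay(new ByteArrayInputStream(response), toBrowser);
        System.out.println("relayed " + copied + " bytes");
    }
}
```

This loop runs until end of stream. On a keep-alive connection the server does not close the socket after each response, so a real proxy would also need to stop at the length given by Content-Length or at the end of a chunked body.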