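Parsing HTTP requests and responses from a text file in Go

Time: 2015-11-27 19:04:52

Tags: go

I have the following file, which contains an HTTP-pipelined stream of HTTP requests and HTTP responses.

How can I parse this file into my stream variable?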

type Connection struct{
   Request *http.Request
   Response *http.Response
}
stream := make([]Connection, 0)

Raw file:

GET /ubuntu/dists/trusty/InRelease HTTP/1.1
Host: archive.ubuntu.com
Cache-Control: max-age=0
Accept: text/*
User-Agent: Debian APT-HTTP/1.3 (1.0.1ubuntu2)

HTTP/1.1 404 Not Found
Date: Thu, 26 Nov 2015 18:26:36 GMT
Server: Apache/2.2.22 (Ubuntu)
Vary: Accept-Encoding
Content-Length: 311
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /ubuntu/dists/trusty/InRelease was not found on this server.</p>
<hr>
<address>Apache/2.2.22 (Ubuntu) Server at archive.ubuntu.com Port 80</address>
</body></html>
GET /ubuntu/dists/trusty-updates/InRelease HTTP/1.1
Host: archive.ubuntu.com
Cache-Control: max-age=0
Accept: text/*
User-Agent: Debian APT-HTTP/1.3 (1.0.1ubuntu2)

HTTP/1.1 200 OK
Date: Thu, 26 Nov 2015 18:26:37 GMT
Server: Apache/2.2.22 (Ubuntu)
Last-Modified: Thu, 26 Nov 2015 18:03:00 GMT
ETag: "fbb7-5257562a5fd00"
Accept-Ranges: bytes
Content-Length: 64439
Cache-Control: max-age=382, proxy-revalidate
Expires: Thu, 26 Nov 2015 18:33:00 GMT

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

Origin: Ubuntu
Label: Ubuntu
Suite: trusty-updates
Version: 14.04
Codename: trusty
[... truncated by author]

I know there is http.ReadRequest. What about the responses? Any ideas or feedback are appreciated.

1 answer:

Answer 0 (score: 3):

It's actually pretty simple:

package main

import (
    "bufio"
    "bytes"
    "fmt"
    "io"
    "io/ioutil"
    "log"
    "net/http"
    "net/http/httputil"
    "os"
)

type Connection struct {
    Request  *http.Request
    Response *http.Response
}

func ReadHTTPFromFile(r io.Reader) ([]Connection, error) {
    buf := bufio.NewReader(r)
    stream := make([]Connection, 0)

    for {
        req, err := http.ReadRequest(buf)
        if err == io.EOF {
            break
        }
        if err != nil {
            return stream, err
        }

        resp, err := http.ReadResponse(buf, req)
        if err != nil {
            return stream, err
        }

        // Buffer the response body so it stays readable after Close.
        b := new(bytes.Buffer)
        _, err = io.Copy(b, resp.Body)
        resp.Body.Close()
        if err != nil {
            return stream, err
        }
        resp.Body = ioutil.NopCloser(b)

        stream = append(stream, Connection{Request: req, Response: resp})
    }
    return stream, nil
}

func main() {
    f, err := os.Open("/tmp/test.http")
    if err != nil {
        log.Fatalln(err)
    }
    defer f.Close()
    stream, err := ReadHTTPFromFile(f)
    if err != nil {
        log.Fatalln(err)
    }
    for _, c := range stream {
        b, err := httputil.DumpRequest(c.Request, true)
        if err != nil {
            log.Fatalln(err)
        }
        fmt.Println(string(b))
        b, err = httputil.DumpResponse(c.Response, true)
        if err != nil {
            log.Fatalln(err)
        }
        fmt.Println(string(b))
    }
}
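
The loop above can also be tried without a file on disk. The following is a minimal sketch, not part of the original answer: it feeds the same ReadRequest/ReadResponse loop from an in-memory string. The names exchange, parseStream, parseString and the sample data are hypothetical, and io.ReadAll and io.NopCloser assume Go 1.16 or later.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
	"strings"
)

// exchange is a hypothetical name for one request/response pair.
type exchange struct {
	Request  *http.Request
	Response *http.Response
	Body     []byte // buffered copy of the response body
}

// sample is made-up data: two pipelined exchanges whose Content-Length
// values match their bodies exactly.
const sample = "GET /a HTTP/1.1\r\nHost: example.com\r\n\r\n" +
	"HTTP/1.1 404 Not Found\r\nContent-Length: 9\r\n\r\nnot found" +
	"GET /b HTTP/1.1\r\nHost: example.com\r\n\r\n" +
	"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"

// parseStream alternates ReadRequest and ReadResponse on one bufio.Reader
// until a clean EOF is hit at a request boundary.
func parseStream(r io.Reader) ([]exchange, error) {
	buf := bufio.NewReader(r)
	var out []exchange
	for {
		req, err := http.ReadRequest(buf)
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		resp, err := http.ReadResponse(buf, req)
		if err != nil {
			return out, err
		}
		// Draining the body moves the reader to the start of the next request.
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			return out, err
		}
		resp.Body = io.NopCloser(bytes.NewReader(body))
		out = append(out, exchange{Request: req, Response: resp, Body: body})
	}
}

// parseString is a convenience wrapper for in-memory input.
func parseString(s string) ([]exchange, error) {
	return parseStream(strings.NewReader(s))
}

func main() {
	xs, err := parseString(sample)
	if err != nil {
		log.Fatalln(err)
	}
	for _, x := range xs {
		fmt.Printf("%s %s -> %d (%d body bytes)\n",
			x.Request.Method, x.Request.URL.Path, x.Response.StatusCode, len(x.Body))
	}
}
```

Keeping the body as a byte slice next to the response is one possible design choice; it avoids re-reading resp.Body when the bytes are needed more than once.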

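Some notes:

  • There are http.ReadRequest and http.ReadResponse
  • http.ReadRequest and http.ReadResponse can be called over and over on the same bufio.Reader until EOF, and it will "just work"
    • "Just works" depends on the Content-Length header being present and correct, so that reading the body leaves the Reader at the start of the next request/response
    • Read the code to know exactly what works and what doesn't
  • resp.Body must be closed, per the documentation, so we have to copy it into another buffer to keep it
  • Using your sample data (with Content-Length modified to match the truncation), this code outputs the same requests and responses as given
  • httputil.DumpRequest and httputil.DumpResponse do not necessarily dump the HTTP headers in the same order as the input file, so don't expect a diff to be perfect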
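
Since header order in a dump may differ from the input, one option is to compare responses after parsing them instead of diffing the text. The following is a minimal sketch under that assumption, not part of the original answer: sameResponse, parseResponse and the three sample responses are hypothetical, and io.ReadAll assumes Go 1.16 or later.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
	"reflect"
)

// Made-up sample data: respA and respB differ only in header order,
// respC has a different body.
const (
	respA = "HTTP/1.1 200 OK\r\nServer: demo\r\nContent-Length: 5\r\n\r\nhello"
	respB = "HTTP/1.1 200 OK\r\nContent-Length: 5\r\nServer: demo\r\n\r\nhello"
	respC = "HTTP/1.1 200 OK\r\nServer: demo\r\nContent-Length: 5\r\n\r\nworld"
)

// parseResponse parses one raw response and returns it with its full body.
// Passing nil as the request makes ReadResponse assume a GET request.
func parseResponse(raw []byte) (*http.Response, []byte, error) {
	resp, err := http.ReadResponse(bufio.NewReader(bytes.NewReader(raw)), nil)
	if err != nil {
		return nil, nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return resp, body, err
}

// sameResponse reports whether two raw responses have the same status code,
// headers, and body. Headers are compared as maps, so their order is ignored.
func sameResponse(a, b []byte) (bool, error) {
	ra, bodyA, err := parseResponse(a)
	if err != nil {
		return false, err
	}
	rb, bodyB, err := parseResponse(b)
	if err != nil {
		return false, err
	}
	same := ra.StatusCode == rb.StatusCode &&
		reflect.DeepEqual(ra.Header, rb.Header) &&
		bytes.Equal(bodyA, bodyB)
	return same, nil
}

func main() {
	reordered, err := sameResponse([]byte(respA), []byte(respB))
	if err != nil {
		log.Fatalln(err)
	}
	changed, err := sameResponse([]byte(respA), []byte(respC))
	if err != nil {
		log.Fatalln(err)
	}
	fmt.Println("reordered headers equal:", reordered)
	fmt.Println("different body equal:", changed)
}
```

The same helper could be pointed at the output of httputil.DumpResponse and the matching slice of the input file, since both are raw response bytes.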