How do I get the HTTP response body using chromedp?

Time: 2017-08-22 04:13:04

Tags: google-chrome http go browser-automation

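Using github.com/knq/chromedp, a Go package that drives web browsers through the Chrome Debugging Protocol, I can navigate to web pages, update forms, and submit forms, but I need to retrieve an HTTP response body and haven't figured out how to do that yet. I'd like to be able to retrieve the HTTP response body for a JSON response (not HTML).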

From looking at the code, it seems the HTTP response body is in the CachedResponse.Body property:

https://github.com/knq/chromedp/blob/b9e4c14157325be092c1c1137edbd584648d8c72/cdp/cachestorage/types.go#L30

and it should be accessible using the following:

func (p *RequestCachedResponseParams) Do(ctxt context.Context, h cdp.Handler) (response *CachedResponse, err error)

https://github.com/knq/chromedp/blob/b9e4c14157325be092c1c1137edbd584648d8c72/cdp/cachestorage/cachestorage.go#L168

The examples use cdp.Tasks, as in the following simple example.

func googleSearch(q, text string, site, res *string) cdp.Tasks {
    var buf []byte
    sel := fmt.Sprintf(`//a[text()[contains(., '%s')]]`, text)
    return cdp.Tasks{
        cdp.Navigate(`https://www.google.com`),
        cdp.Sleep(2 * time.Second),
        cdp.WaitVisible(`#hplogo`, cdp.ByID),
        cdp.SendKeys(`#lst-ib`, q+"\n", cdp.ByID),
        cdp.WaitVisible(`#res`, cdp.ByID),
        cdp.Text(sel, res),
        cdp.Click(sel),
        cdp.Sleep(2 * time.Second),
        cdp.WaitVisible(`#footer`, cdp.ByQuery),
        cdp.WaitNotVisible(`div.v-middle > div.la-ball-clip-rotate`, cdp.ByQuery),
        cdp.Location(site),
        cdp.Screenshot(`#testimonials`, &buf, cdp.ByID),
        cdp.ActionFunc(func(context.Context, cdptypes.Handler) error {
            return ioutil.WriteFile("testimonials.png", buf, 0644)
        }),
    }
}

https://github.com/knq/chromedp/blob/b9e4c14157325be092c1c1137edbd584648d8c72/examples/simple/main.go

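It seems that CachedResponse.Body can be accessed by calling RequestCachedResponseParams.Do() with a RequestCachedResponseParams.CacheID, but I still need to know the following:

  1. How to call RequestCachedResponseParams.Do() from within cdp.Tasks (it seems cdp.ActionFunc() could be used for this)
  2. How to get the RequestCachedResponseParams.CacheID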

1 answer:

Answer 0: (score: 0)

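If you want to get the response to a request, this is how I managed to do it.

This sample loads http://www.google.com and listens for EventResponseReceived to capture the Response, which contains, for example, the headers.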

package main

import (
    "context"
    "io/ioutil"
    "log"
    "os"
    "time"

    "github.com/chromedp/cdproto/network"
    "github.com/chromedp/chromedp"
)

func main() {
    dir, err := ioutil.TempDir("", "chromedp-example")
    if err != nil {
        panic(err)
    }
    defer os.RemoveAll(dir)

    opts := append(chromedp.DefaultExecAllocatorOptions[:],
        chromedp.DisableGPU,
        chromedp.NoDefaultBrowserCheck,
        chromedp.Flag("headless", false),
        chromedp.Flag("ignore-certificate-errors", true),
        chromedp.Flag("window-size", "50,400"),
        chromedp.UserDataDir(dir),
    )

    allocCtx, cancel := chromedp.NewExecAllocator(context.Background(), opts...)
    defer cancel()

    // also set up a custom logger
    taskCtx, cancel := chromedp.NewContext(allocCtx, chromedp.WithLogf(log.Printf))
    defer cancel()

    // create a timeout
    taskCtx, cancel = context.WithTimeout(taskCtx, 10*time.Second)
    defer cancel()

    // ensure that the browser process is started
    if err := chromedp.Run(taskCtx); err != nil {
        panic(err)
    }

    // listen for network events
    listenForNetworkEvent(taskCtx)

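    // enable network events, then navigate; do not ignore the error
    if err := chromedp.Run(taskCtx,
        network.Enable(),
        chromedp.Navigate(`http://www.google.com`),
        chromedp.WaitVisible(`body`, chromedp.BySearch),
    ); err != nil {
        panic(err)
    }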

}

func listenForNetworkEvent(ctx context.Context) {
    chromedp.ListenTarget(ctx, func(ev interface{}) {
        switch ev := ev.(type) {

        case *network.EventResponseReceived:
            resp := ev.Response
            if len(resp.Headers) != 0 {
                // Headers is a map, so print it with %v rather than %s
                log.Printf("received headers: %v", resp.Headers)

            }

        }
        // handle other network events here as needed
    })
}
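
The listener above only logs headers. To get at the body itself, the Chrome DevTools Protocol offers the Network.getResponseBody command, which takes a request ID and returns the body as a string together with a base64Encoded flag; in cdproto this is exposed, as far as I know, as network.GetResponseBody. The sketch below covers only the bookkeeping around that call, using the standard library: remember which request IDs returned JSON when the response arrives, ask for the body once loading has finished, and decode the result. jsonTracker and decodeBody are hypothetical helper names, and the body string in main is a canned value rather than a real browser response.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strings"
)

// decodeBody turns a Network.getResponseBody result into raw bytes.
// The protocol returns the body as a string plus a base64Encoded flag.
func decodeBody(body string, base64Encoded bool) ([]byte, error) {
	if !base64Encoded {
		return []byte(body), nil
	}
	return base64.StdEncoding.DecodeString(body)
}

// jsonTracker is a hypothetical helper that remembers which request IDs
// produced a JSON response, so their bodies can be fetched later.
type jsonTracker struct {
	pending map[string]string // request ID -> URL
}

func newJSONTracker() *jsonTracker {
	return &jsonTracker{pending: map[string]string{}}
}

// onResponseReceived records the request if its MIME type looks like JSON.
func (t *jsonTracker) onResponseReceived(requestID, url, mimeType string) {
	if strings.Contains(strings.ToLower(mimeType), "json") {
		t.pending[requestID] = url
	}
}

// onLoadingFinished reports whether the body for requestID should be
// fetched, returns its URL, and forgets the request.
func (t *jsonTracker) onLoadingFinished(requestID string) (string, bool) {
	url, ok := t.pending[requestID]
	delete(t.pending, requestID)
	return url, ok
}

func main() {
	t := newJSONTracker()
	t.onResponseReceived("1", "https://example.com/", "text/html")
	t.onResponseReceived("2", "https://example.com/api", "application/json")

	for _, id := range []string{"1", "2"} {
		url, ok := t.onLoadingFinished(id)
		if !ok {
			continue // not a JSON response, skip it
		}
		// A real program would request the body for id from the browser
		// here; this canned value stands in for that result.
		body, err := decodeBody("eyJvayI6dHJ1ZX0=", true)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s -> %s\n", url, body)
	}
}
```

Waiting for the loading-finished event before asking for the body matters because the response-received event fires when the headers arrive, at which point the body may not be fully available yet.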