Unable to get the correct final URL

Time: 2017-11-06 06:34:12

Tags: go httprequest url-redirection

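Using Go's http.Get(), I can get the effective (final) URL after multiple redirects. In a few cases, though, when the chain includes a 303 redirect and the URL contains special characters, Go behaves strangely and I cannot get the actual final URL. Here is an example I am working with: "http://swiggy.com//google.com/%2f..". Opening this URL in a browser redirects to Google, but I cannot get that result with http.Get().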

1 answer:

Answer 0: (score: 2)

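The behavior of the remote server can depend on many factors, such as the User-Agent you send, cookies, your IP address, and so on. Sometimes it also changes because of DDoS protection or a similar mechanism.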
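As a minimal sketch of how one of these factors could be tested, the program below sends the same request with Go's default User-Agent and with a browser-like one, using a client that also keeps cookies. The local httptest server, its redirect rule, the paths /start, /browser and /bot, and the helper names newClient, newTestServer and finalURL are all hypothetical, chosen only so the example runs offline; the rules the real site applies are unknown.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/cookiejar"
	"net/http/httptest"
	"strings"
)

// newClient returns a client that keeps cookies between requests,
// including across redirects, the way a browser would.
func newClient() (*http.Client, error) {
	jar, err := cookiejar.New(nil)
	if err != nil {
		return nil, err
	}
	return &http.Client{Jar: jar}, nil
}

// finalURL follows redirects and returns the URL of the last request.
// An empty userAgent leaves Go's default User-Agent in place.
func finalURL(client *http.Client, rawURL, userAgent string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		return "", err
	}
	if userAgent != "" {
		req.Header.Set("User-Agent", userAgent)
	}
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	return resp.Request.URL.String(), nil
}

// newTestServer is a hypothetical stand-in for a site that redirects
// differently depending on the User-Agent header.
func newTestServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/start", func(w http.ResponseWriter, r *http.Request) {
		if strings.HasPrefix(r.UserAgent(), "Mozilla/") {
			http.Redirect(w, r, "/browser", http.StatusSeeOther)
			return
		}
		http.Redirect(w, r, "/bot", http.StatusSeeOther)
	})
	mux.HandleFunc("/browser", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "browser page")
	})
	mux.HandleFunc("/bot", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "bot page")
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newTestServer()
	defer srv.Close()

	client, err := newClient()
	if err != nil {
		fmt.Println(err)
		return
	}
	// Same start URL, two different User-Agent values.
	for _, ua := range []string{"", "Mozilla/5.0 (X11; Linux x86_64)"} {
		u, err := finalURL(client, srv.URL+"/start", ua)
		if err != nil {
			fmt.Println(err)
			return
		}
		fmt.Printf("User-Agent %q -> %s\n", ua, u)
	}
}
```

If the two final URLs differ against the real site as well, the User-Agent is at least one of the inputs the server looks at; if they match, cookies or the client IP would be the next things to vary.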

You can modify your program to see how it passes through each redirect stage:

Result:

getURL: http://swiggy.com//google.com/%2f..
Redirecting: 301 https://swiggy.com/google.com/%2f..
Redirecting: 301 https://www.swiggy.com/google.com/%2f..
Redirecting: 303 https://www.swiggy.com/google.com/%2f../
finalURL: https://www.swiggy.com/google.com/%2f../
Req Headers: map[Referer:[https://www.swiggy.com/google.com/%2f..]]
Resp Headers: map[Date:[Mon, 06 Nov 2017 12:51:20 GMT] Content-Type:[text/html; charset=utf-8] Content-Security-Policy-Report-Only:[default-src 'self';script-src https://chuknu.sokrati.com/15946/ https://www.google-analytics.com/ https://cdn.inspectlet.com/ https://tracking.sokrati.com/ https://connect.facebook.net/ https://bam.nr-data.net/ https://maps.googleapis.com/ https://js-agent.newrelic.com/ https://www.googletagmanager.com/ https://s3-ap-southeast-1.amazonaws.com/static.swiggy/ https://*.juspay.in https://connect.facebook.net/ https://www.googletagmanager.com/ *.swiggy.in *.swiggy.com https://chat2.hotline.io/ 'self' 'unsafe-inline' 'unsafe-eval' 'nonce-150997268072300';style-src https://fonts.googleapis.com/ https://www.swiggy.com/ https://s3-ap-southeast-1.amazonaws.com/static.swiggy/ https://chat2.hotline.io/ 'self' 'unsafe-inline' 'unsafe-eval';img-src https://res.cloudinary.com/swiggy/ https://www.google-analytics.com/ https://www.google.co.in/ https://www.facebook.com/ https://tracking.sokrati.com/ http://api.swiggy.in/ https://api.swiggy.com https://d3oxf4lkkqx2kx.cloudfront.net/ https://maps.googleapis.com/ https://maps.gstatic.com/ https://csi.gstatic.com/ https://fonts.gstatic.com/ https://stats.g.doubleclick.net/ https://googleads.g.doubleclick.net/ https://www.google.com/ data: 'self'; font-src https://www.swiggy.com/ https://fonts.gstatic.com/ data: 'self';connect-src https://hn.inspectlet.com/ https://www.swiggy.com/ https://www.facebook.com/tr/ https://*.juspay.in/txns https://sentry.swiggyapp.com/ 'self';frame-src https://www.facebook.com/tr/ https://chat2.hotline.io/ https://*.webpush.hotline.io 'self';report-uri /csp/log] Etag:[W/"6f97-"] Vary:[Accept-Encoding] X-Data-Origin:[dweb_cluster/port-dweb-06 naxsi/waf rate-limiter-plain/rate-limiter-plain] X-Xss-Protection:[1; mode=block] Strict-Transport-Security:[max-age=31536000; includeSubdomains; preload] X-Frame-Options:[Deny] Set-Cookie:[__SW=sjfsljfd; Path=/]]

Modified code:

package main

import (
    "fmt"
    "net/http"
)

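// CheckRedirect is called by the client before it follows each redirect.
// r is the upcoming request; r.Response is the redirect response that
// caused it. Returning nil tells the client to follow the redirect.
func CheckRedirect(r *http.Request, via []*http.Request) error {
    fmt.Println("Redirecting:", r.Response.StatusCode, r.URL)
    return nil
}

func main() {
    getURL := "http://swiggy.com//google.com/%2f.."
    fmt.Println("getURL:", getURL)
    client := &http.Client{
        CheckRedirect: CheckRedirect,
    }
    resp, err := client.Get(getURL)
    if err != nil {
        fmt.Println(err)
        return
    }
    defer resp.Body.Close()
    // resp.Request is the last request the client sent, so its URL is
    // the final URL after all redirects.
    finalURL := resp.Request.URL.String()
    fmt.Println("finalURL:", finalURL)
    fmt.Println("Req Headers:", resp.Request.Header)
    fmt.Println("Resp Headers:", resp.Header)
}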
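
To see what the server actually sends at each hop, rather than the URL Go resolves it to, one option is to turn off automatic redirects with http.ErrUseLastResponse and print the raw Location header yourself. The sketch below does this against a local httptest server so that it runs offline; the names hop, traceRedirects and newChainServer, the hop limit of 10, and the paths /a, /b and /c are illustrative assumptions and not part of the original answer.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// hop records one response in a redirect chain.
type hop struct {
	URL      string // URL that was requested
	Status   int    // HTTP status code of the response
	Location string // raw Location header, empty if absent
}

// traceRedirects requests start and follows redirects by hand, one
// response at a time, giving up after maxHops responses.
func traceRedirects(start string, maxHops int) ([]hop, error) {
	client := &http.Client{
		// Returning ErrUseLastResponse makes the client hand back the
		// redirect response itself instead of following it.
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
	var hops []hop
	current := start
	for i := 0; i < maxHops; i++ {
		resp, err := client.Get(current)
		if err != nil {
			return hops, err
		}
		resp.Body.Close()
		loc := resp.Header.Get("Location")
		hops = append(hops, hop{URL: current, Status: resp.StatusCode, Location: loc})
		if resp.StatusCode < 300 || resp.StatusCode > 399 || loc == "" {
			return hops, nil // not a redirect: this is the final response
		}
		// Resolve the Location value against the URL just requested.
		next, err := resp.Request.URL.Parse(loc)
		if err != nil {
			return hops, err
		}
		current = next.String()
	}
	return hops, fmt.Errorf("stopped after %d hops", maxHops)
}

// newChainServer is a hypothetical server: /a -> 301 /b -> 303 /c -> 200.
func newChainServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/a", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Location", "/b")
		w.WriteHeader(http.StatusMovedPermanently)
	})
	mux.HandleFunc("/b", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Location", "/c")
		w.WriteHeader(http.StatusSeeOther)
	})
	mux.HandleFunc("/c", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "done")
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newChainServer()
	defer srv.Close()

	hops, err := traceRedirects(srv.URL+"/a", 10)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, h := range hops {
		fmt.Printf("%d %s Location=%q\n", h.Status, h.URL, h.Location)
	}
}
```

Pointing traceRedirects at the URL from the question instead of the test server would show the exact Location string returned at the 303 step, which can then be compared with what a browser receives for the same request.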