Colly: simple HTTP authentication does not persist

Time: 2019-12-21 15:53:05

Tags: authentication go web-crawler

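I am using Go and gocolly to collect data from a site (https://www.albeco.com.pl/), and I need to be logged in to see the prices. However, the authentication does not persist: the request made after the login POST is still treated as unauthenticated.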

package main

import (
    "fmt"
    "log"

    "github.com/gocolly/colly"
)

func main() {
    // create a new collector
    c := colly.NewCollector()

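    // authenticate with the real user and password; Post form-encodes the
    // values, so a space is sent as "+" (a literal "Log+in" would be sent as "Log%2Bin")
    err := c.Post("https://www.albeco.com.pl/EN-H98.html", map[string]string{
        "login_form": "cpsilva1",
        "haslo_form": "contem1g",
        "loguj":      "Log in",
    })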
    if err != nil {
        log.Fatal(err)
    }

    // print every cell of every row in the search results table
    c.OnHTML(`table[id=searchResults]`, func(e *colly.HTMLElement) {
        e.ForEach("tr", func(_ int, tr *colly.HTMLElement) {
            tr.ForEach("td", func(i int, tds *colly.HTMLElement) {
                fmt.Println(i, tds.Text)
            })
        })
    })

    // attach callbacks after login
    c.OnResponse(func(r *colly.Response) {
        log.Println("response received", r.StatusCode)
    })

    // start scraping; report the error instead of silently ignoring it
    if err := c.Visit("https://www.albeco.com.pl/EN-H160/online-store.html?symbol=*&nazwa=&producent=115&sr_wewn=&sr_zewn=&szerokosc=&nazwa_symbol=on&filtered=off&w_srodku=on&zerowe=on&precyzja=0%2C9"); err != nil {
        log.Fatal(err)
    }
}
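
One thing worth checking in the login request above is the value of the `loguj` field. Assuming colly's `Post` builds an `application/x-www-form-urlencoded` body from the map (which is my understanding of the library), a `+` written in the value is escaped rather than treated as a space. The sketch below uses only the standard library to show the difference; `encodeForm` is a hypothetical helper for illustration, not part of colly.

```go
package main

import (
	"fmt"
	"net/url"
)

// encodeForm is a hypothetical helper that form-encodes a map of fields
// the way an application/x-www-form-urlencoded request body is built.
func encodeForm(fields map[string]string) string {
	form := url.Values{}
	for k, v := range fields {
		form.Set(k, v)
	}
	return form.Encode()
}

func main() {
	// A literal "+" in the value is escaped, so the server sees "Log+in".
	fmt.Println(encodeForm(map[string]string{"loguj": "Log+in"})) // loguj=Log%2Bin

	// A space is encoded as "+", which is what a browser form submit sends.
	fmt.Println(encodeForm(map[string]string{"loguj": "Log in"})) // loguj=Log+in
}
```

Whether the site actually inspects this field is unknown, so this may not be the cause of the failed login, but it is a cheap thing to rule out.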

Table with authentication (the one I need)

Table without authentication
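
At the HTTP level, the difference between the two tables normally comes down to whether the session cookie set by the login response is sent with the next request. The following is a minimal sketch of that mechanism using only the standard library as a stand-in for colly. The test server, its `/login` and `/prices` paths, the `session` cookie, and the helper names `newShop` and `fetchPrices` are all assumptions made for illustration; they do not describe the real site.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/cookiejar"
	"net/http/httptest"
	"net/url"
)

// newShop returns a hypothetical test server: POST /login sets a session
// cookie, and GET /prices shows a price only when that cookie is present.
func newShop() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/login", func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodPost && r.FormValue("login_form") != "" {
			http.SetCookie(w, &http.Cookie{Name: "session", Value: "ok", Path: "/"})
		}
	})
	mux.HandleFunc("/prices", func(w http.ResponseWriter, r *http.Request) {
		if c, err := r.Cookie("session"); err == nil && c.Value == "ok" {
			fmt.Fprint(w, "price: 12.50")
			return
		}
		fmt.Fprint(w, "log in to see prices")
	})
	return httptest.NewServer(mux)
}

// fetchPrices optionally logs in, then requests /prices with the same
// client, so any cookie stored in the jar is sent along automatically.
func fetchPrices(baseURL string, login bool) (string, error) {
	jar, err := cookiejar.New(nil)
	if err != nil {
		return "", err
	}
	client := &http.Client{Jar: jar}
	if login {
		resp, err := client.PostForm(baseURL+"/login", url.Values{"login_form": {"user"}})
		if err != nil {
			return "", err
		}
		resp.Body.Close()
	}
	resp, err := client.Get(baseURL + "/prices")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	srv := newShop()
	defer srv.Close()
	for _, login := range []bool{false, true} {
		body, err := fetchPrices(srv.URL, login)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("login=%v -> %s\n", login, body)
	}
}
```

If the real site behaves like this sketch, an unauthenticated table after the POST would suggest that the login itself was rejected (wrong URL or field values), rather than that the cookie was dropped between requests.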

0 answers:

No answers yet