How to get the value of the href attribute in Go

Time: 2015-08-23 21:02:07

Tags: go

I want to parse anchor links out of HTML content. Here is a sample of my HTML:

    <a class="productnamecolor colors_productname" href="http://www.cccxcxc.com/Nautical-Bubble-Romper-p/s15brpnt03.htm">
    <a class="productnamecolor colors_productname" href="http://www.dewewe.com/Nautical-Bubble-Romper-p/erewrwer.htm">
    <a class="productnamecolor colors_productname" href="http://www.sdsddsd.com/Nautical-Bubble-Romper-p/dsadadasd.htm">

Each anchor has an href attribute, and I want to get its value. But my code gives me this error:

Error: multiple-value s.Attr() in single-value context

package main

import (
    "fmt"
    "log"

    "github.com/PuerkitoBio/goquery"
)

func ExampleScrape() {
    doc, err := goquery.NewDocument("http://www.myurl.com/category-s/1828.htm")
    if err != nil {
        log.Fatal(err)
    }

    /* My sample HTML after the page is fetched:
       <a class="productnamecolor colors_productname" href="http://www.cccxcxc.com/Nautical-Bubble-Romper-p/s15brpnt03.htm">
       <a class="productnamecolor colors_productname" href="http://www.dewewe.com/Nautical-Bubble-Romper-p/erewrwer.htm">
       <a class="productnamecolor colors_productname" href="http://www.sdsddsd.com/Nautical-Bubble-Romper-p/dsadadasd.htm">
    */

    doc.Find("table.v65-productDisplay a.productnamecolor").Each(func(i int, s *goquery.Selection) {
        band := s.Attr("href") // Here I want the value of the href attribute, but this line does not work.
        fmt.Printf(band)
    })
}

func main() {
    ExampleScrape()
}

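3 answers:

Answer 0: (score: 6)

Selection.Attr returns two values: the attribute's value and a boolean indicating whether the attribute exists (if it is false, the value is an empty string).

Go does not let you silently drop one of several return values, so you have to change your code to something like this: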

band, ok := s.Attr("href")
if ok {
    fmt.Println(band)
}
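
The two-value rule described in this answer can be tried without any third-party dependency. The sketch below is only an illustration: `selection` is a hypothetical stand-in for `*goquery.Selection`, backed by a plain map, and `attrOr` is a hypothetical helper. Only the shape of `Attr` (a value plus an existence flag) is taken from the answer. It shows three ways to consume both return values.

```go
package main

import "fmt"

// selection is a hypothetical stand-in for *goquery.Selection so that this
// sketch runs with the standard library only. It is not goquery's type.
type selection struct {
	attrs map[string]string
}

// Attr mirrors the two-value shape described in the answer:
// the attribute value and whether the attribute exists.
func (s selection) Attr(name string) (string, bool) {
	val, ok := s.attrs[name]
	return val, ok
}

// attrOr is a hypothetical helper that returns fallback
// when the attribute is missing.
func attrOr(s selection, name, fallback string) string {
	if val, ok := s.Attr(name); ok {
		return val
	}
	return fallback
}

func main() {
	link := selection{attrs: map[string]string{
		"class": "productnamecolor colors_productname",
		"href":  "http://www.dewewe.com/Nautical-Bubble-Romper-p/erewrwer.htm",
	}}

	// Writing `band := link.Attr("href")` would not compile,
	// because Attr returns two values.

	// Option 1: keep both values and check the flag.
	if href, ok := link.Attr("href"); ok {
		fmt.Println(href)
	}

	// Option 2: discard the flag explicitly with the blank identifier.
	title, _ := link.Attr("title")
	fmt.Printf("%q\n", title)

	// Option 3: fall back to a default when the attribute is absent.
	fmt.Println(attrOr(link, "title", "(no title)"))
}
```

With real goquery selections the same three patterns should apply to `s.Attr("href")`. goquery also offers an `AttrOr` method that plays the role of the `attrOr` helper here.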

Answer 1: (score: 0)

You can also use the golang.org/x/net/html package.

package main

import (
    "fmt"
    "log"
    "strings"

    "golang.org/x/net/html"
)

func main() {
    s := `<a class="productnamecolor colors_productname" href="http://www.cccxcxc.com/Nautical-Bubble-Romper-p/s15brpnt03.htm">
      <a class="productnamecolor colors_productname" href="http://www.dewewe.com/Nautical-Bubble-Romper-p/erewrwer.htm">
    <a class="productnamecolor colors_productname" href="http://www.sdsddsd.com/Nautical-Bubble-Romper-p/dsadadasd.htm">`
    doc, err := html.Parse(strings.NewReader(s))
    if err != nil {
        log.Fatal(err)
    }
    var f func(*html.Node)
    f = func(n *html.Node) {
        if n.Type == html.ElementNode && n.Data == "a" {
            for _, a := range n.Attr {
                if a.Key == "href" {
                    fmt.Println(a.Val)
                    break
                }
            }
        }
        for c := n.FirstChild; c != nil; c = c.NextSibling {
            f(c)
        }
    }
    f(doc)
}
// outputs
// http://www.cccxcxc.com/Nautical-Bubble-Romper-p/s15brpnt03.htm
// http://www.dewewe.com/Nautical-Bubble-Romper-p/erewrwer.htm
// http://www.sdsddsd.com/Nautical-Bubble-Romper-p/dsadadasd.htm

Hope this helps someone.

Answer 2: (score: 0)

I wrote an alternative package that can handle this:

package main

import (
   "github.com/89z/mech"
   "strings"
)

const source = `
<a class="productnamecolor colors_productname" href="http://www.cccxcxc.com/Nautical-Bubble-Romper-p/s15brpnt03.htm">
<a class="productnamecolor colors_productname" href="http://www.dewewe.com/Nautical-Bubble-Romper-p/erewrwer.htm">
<a class="productnamecolor colors_productname" href="http://www.sdsddsd.com/Nautical-Bubble-Romper-p/dsadadasd.htm"> 
`

func main() {
   doc, err := mech.Parse(strings.NewReader(source))
   if err != nil {
      panic(err)
   }
   a := doc.ByTag("a")
   for a.Scan() {
      println(a.Attr("href"))
   }
}

https://pkg.go.dev/github.com/89z/mech