Parsing an RSS feed in Go

Date: 2016-01-24 12:33:36

Tags: xml go rss

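I'm trying to write a podcast downloader in Go. The following code parses an RSS feed, but when the parsed data is printed to standard output, the channel's link is empty. I don't know why. Any suggestions? I'm new to Go.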

package main

import (
    "encoding/xml"
    "fmt"
    "net/http"
)

type Enclosure struct {
    Url    string `xml:"url,attr"`
    Length int64  `xml:"length,attr"`
    Type   string `xml:"type,attr"`
}

type Item struct {
    Title     string    `xml:"title"`
    Link      string    `xml:"link"`
    Desc      string    `xml:"description"`
    Guid      string    `xml:"guid"`
    Enclosure Enclosure `xml:"enclosure"`
    PubDate   string    `xml:"pubDate"`
}

type Channel struct {
    Title string `xml:"title"`
    Link  string `xml:"link"`
    Desc  string `xml:"description"`
    Items []Item `xml:"item"`
}

type Rss struct {
    Channel Channel `xml:"channel"`
}

func main() {
    resp, err := http.Get("http://www.bbc.co.uk/programmes/p02nrvz8/episodes/downloads.rss")
    if err != nil {
        fmt.Printf("Error GET: %v\n", err)
        return
    }
    defer resp.Body.Close()

    rss := Rss{}

    decoder := xml.NewDecoder(resp.Body)
    err = decoder.Decode(&rss)
    if err != nil {
        fmt.Printf("Error Decode: %v\n", err)
        return
    }

    fmt.Printf("Channel title: %v\n", rss.Channel.Title)
    fmt.Printf("Channel link: %v\n", rss.Channel.Link)

    for i, item := range rss.Channel.Items {
        fmt.Printf("%v. item title: %v\n", i, item.Title)
    }
}

2 answers:

Answer 0 (score: 4)

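The XML from the RSS feed has a channel element with two child link elements: 'link' and 'atom:link'. Even though one of them has a namespace prefix, the Go xml unmarshaller sees a conflict: a field tagged `xml:"link"` matches both elements by local name, so the empty content of 'atom:link' overwrites the value read from 'link'. See also the "local name collisions fail" issue on GitHub.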

<?xml version="1.0" encoding="UTF-8"?>
...
   <channel>
      <title>Forum - Sixty Second Idea to Improve the World</title>
      <link>http://www.bbc.co.uk/programmes/p02nrvz8</link>
      ...
      <atom:link href="http://www.bbc.co.uk/..." />

Answer 1 (score: 0)

Alternatively, use a library such as go-rss or a tool such as informado to read various RSS feeds.