Unmarshalling multiple XML items

Date: 2016-01-10 23:26:55

Tags: xml go

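I am trying to unmarshal multiple items contained in nodes that share the same structure, for further processing, but I can't seem to access the data and I don't know why. The XML data is structured as follows (I am trying to access all of the item nodes):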

<?xml version="1.0" encoding="ISO-8859-1" ?> 
<datainfo>
  <origin>NOAA/NOS/CO-OPS</origin>
  <producttype> Annual Tide Prediction </producttype>
  <IntervalType>High/Low Tide Predictions</IntervalType>
  <data>
    <item>
      <date>2015/12/31</date>
      <day>Thu</day>
      <time>03:21 AM</time>
      <predictions_in_ft>5.3</predictions_in_ft>
      <predictions_in_cm>162</predictions_in_cm>
      <highlow>H</highlow>
    </item>
    <item>
      <date>2015/12/31</date>
      <day>Thu</day>
      <time>09:24 AM</time>
      <predictions_in_ft>2.4</predictions_in_ft>
      <predictions_in_cm>73</predictions_in_cm>
      <highlow>L</highlow>
    </item>
  </data>
</datainfo>

My code is:

package main

import (
    "encoding/xml"
    "fmt"
    "io/ioutil"
    "os"
)

// TideData stores a series of tide predictions
type TideData struct {
    Tides []Tide `xml:"data>item"`
}

// Tide stores a single tide prediction
type Tide struct {
    Date         string  `xml:"date"`
    Day          string  `xml:"day"`
    Time         string  `xml:"time"`
    PredictionFt float64 `xml:"predictions_in_ft"`
    PredictionCm float64 `xml:"predictions_in_cm"`
    HighLow      string  `xml:"highlow"`
}

func (t Tide) String() string {
    return t.Date + " " + t.Day + " " + t.Time + " " + t.HighLow
}

func main() {
    xmlFile, err := os.Open("9414275 Annual.xml")
    if err != nil {
        fmt.Println("Error opening file:", err)
        return
    }
    defer xmlFile.Close()

    b, _ := ioutil.ReadAll(xmlFile)

    var tides TideData
    xml.Unmarshal(b, &tides)

    fmt.Println(tides)
    for _, datum := range tides.Tides {
        fmt.Printf("\t%s\n", datum)
    }
}

When I run it the output is empty, which makes me think the data is not being unmarshalled. The output is:

{[]}

1 answer:

Answer 0 (score: 5)

You are ignoring the error returned by xml.Unmarshal. By slightly modifying your program so that it prints that error, we can see what is going on:

xml: encoding "ISO-8859-1" declared but Decoder.CharsetReader is nil
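The modified program itself is not reproduced here, so the following is only a minimal sketch of what such a change might look like. It uses a shortened inline copy of the XML and trimmed versions of the question's types; the helper name decodeTides is hypothetical. The point is simply that the error from xml.Unmarshal is returned and printed rather than discarded.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// sampleXML is a shortened copy of the document from the question,
// including its ISO-8859-1 encoding declaration.
const sampleXML = `<?xml version="1.0" encoding="ISO-8859-1" ?>
<datainfo>
  <data>
    <item>
      <date>2015/12/31</date>
      <time>03:21 AM</time>
      <highlow>H</highlow>
    </item>
  </data>
</datainfo>`

// Tide is a trimmed version of the question's Tide type.
type Tide struct {
	Date    string `xml:"date"`
	Time    string `xml:"time"`
	HighLow string `xml:"highlow"`
}

// TideData stores a series of tide predictions.
type TideData struct {
	Tides []Tide `xml:"data>item"`
}

// decodeTides is a hypothetical helper: it hands the Unmarshal error
// back to the caller instead of silently dropping it.
func decodeTides(b []byte) (TideData, error) {
	var tides TideData
	if err := xml.Unmarshal(b, &tides); err != nil {
		return TideData{}, fmt.Errorf("unmarshalling tide data: %w", err)
	}
	return tides, nil
}

func main() {
	tides, err := decodeTides([]byte(sampleXML))
	if err != nil {
		// With the ISO-8859-1 declaration this branch is taken.
		fmt.Println("Error:", err)
		return
	}
	fmt.Println(len(tides.Tides), "tide predictions decoded")
}
```

The same check applies to ioutil.ReadAll in the original program, whose error is also discarded.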

Poking around in the documentation, we find that by default the package only supports XML encoded in UTF-8:

    // CharsetReader, if non-nil, defines a function to generate
    // charset-conversion readers, converting from the provided
    // non-UTF-8 charset into UTF-8. If CharsetReader is nil or
    // returns an error, parsing stops with an error. One of
    // the CharsetReader's result values must be non-nil.
    CharsetReader func(charset string, input io.Reader) (io.Reader, error)

So it looks like you need to provide your own charset conversion routine. You can inject it by modifying your code:

decoder := xml.NewDecoder(xmlFile)
decoder.CharsetReader = makeCharsetReader
if err := decoder.Decode(&tides); err != nil {
    fmt.Println("Error decoding XML:", err)
    return
}

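(Note that we are now decoding from an io.Reader rather than a byte array, so the ReadAll logic can be removed.) The golang.org/x/text/encoding family of packages may help you implement the makeCharsetReader function. Something like this might work: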

import (
    "fmt"
    "io"

    "golang.org/x/text/encoding/charmap"
)

func makeCharsetReader(charset string, input io.Reader) (io.Reader, error) {
    if charset == "ISO-8859-1" {
        // Windows-1252 is a superset of ISO-8859-1 for printable
        // characters, so it should do here
        return charmap.Windows1252.NewDecoder().Reader(input), nil
    }
    return nil, fmt.Errorf("unknown charset: %s", charset)
}

You should then be able to decode the XML.