How do I parse a GPX file with Haskell's xml-conduit?

Time: 2015-08-08 19:29:54

Tags: xml haskell gpx

I would like to use xml-conduit to parse GPX files. So far I have the following:

{-# LANGUAGE OverloadedStrings #-}

import Control.Applicative
import Data.Text           as T
import Text.XML
import Text.XML.Cursor

data Trkpt = Trkpt {
  trkptLat :: Text,
  trkptLon :: Text,
  trkptEle :: Text,
  trkptTime :: Text
  } deriving (Show)

trkptsFromFile path =
  gpxTrkpts . fromDocument <$> Text.XML.readFile def path

gpxTrkpts =
  child >=> element "{http://www.topografix.com/GPX/1/0}trk" >=>
  child >=> element "{http://www.topografix.com/GPX/1/0}trkseg" >=>
  child >=> element "{http://www.topografix.com/GPX/1/0}trkpt" >=>
  child >=> \e -> do
    let ele  = T.concat $ element "{http://www.topografix.com/GPX/1/0}ele" e >>= descendant >>= content
    let time = T.concat $ element "{http://www.topografix.com/GPX/1/0}time" e >>= descendant >>= content
    let lat  = T.concat $ attribute "lat" e
    let lon  = T.concat $ attribute "lon" e
    return $ Trkpt lat lon ele time

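A sample GPX file is here

Although the data in the original GPX file is all valid, I get odd results: the parsed text is mostly empty, with only a few sporadic real values. When a real value is present, it appears in just one field of the record.

I am fairly sure I am not using the xml-conduit API correctly. What am I doing wrong?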

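2 answers:

Answer 0 (score: 2)

Two issues. First, there is a typo in the namespace: it should be http://www.topografix.com/GPX/1/1, not http://www.topografix.com/GPX/1/0. Second, your final Kleisli arrow (\e -> do -- etc.) operates on the children of the trkpt elements rather than on the trkpt elements themselves. Each such child is a single ele, time, or text node, which would explain why at most one field per record was populated. Here is a gpxTrkpts that should do what you want: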

gpxTrkpts =
  child >=> element "{http://www.topografix.com/GPX/1/1}trk" >=>
  child >=> element "{http://www.topografix.com/GPX/1/1}trkseg" >=>
  child >=> element "{http://www.topografix.com/GPX/1/1}trkpt" >=>
  \e -> do
    let cs = child e
        ele  = T.concat $ cs >>= element "{http://www.topografix.com/GPX/1/1}ele" >>= descendant >>= content
        time = T.concat $ cs >>= element "{http://www.topografix.com/GPX/1/1}time" >>= descendant >>= content
        lat  = T.concat $ attribute "lat" e
        lon  = T.concat $ attribute "lon" e
    return $ Trkpt lat lon ele time
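To try the corrected traversal without a file on disk, here is a minimal self-contained sketch that runs it against an inline document. The `gpx` helper, the `sample` document and its coordinate values are made up for illustration and are not part of the question; the sketch also assumes that `parseText_` from Text.XML is acceptable in place of `readFile`.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import           Data.Text       (Text)
import qualified Data.Text       as T
import qualified Data.Text.Lazy  as TL
import           Text.XML        (Name (..), def, parseText_)
import           Text.XML.Cursor

data Trkpt = Trkpt
  { trkptLat  :: Text
  , trkptLon  :: Text
  , trkptEle  :: Text
  , trkptTime :: Text
  } deriving (Show, Eq)

-- Hypothetical helper: build a Name in the GPX 1.1 namespace.
gpx :: Text -> Name
gpx local = Name local (Just "http://www.topografix.com/GPX/1/1") Nothing

-- Same traversal as the corrected gpxTrkpts: stop at the trkpt element
-- itself, then look at its children for ele and time.
gpxTrkpts :: Cursor -> [Trkpt]
gpxTrkpts =
  child >=> element (gpx "trk") >=>
  child >=> element (gpx "trkseg") >=>
  child >=> element (gpx "trkpt") >=>
  \e ->
    let field n = T.concat (child e >>= element (gpx n) >>= descendant >>= content)
        lat     = T.concat (attribute "lat" e)
        lon     = T.concat (attribute "lon" e)
    in  [Trkpt lat lon (field "ele") (field "time")]

-- Invented sample data, only meant to exercise the traversal.
sample :: TL.Text
sample = TL.concat
  [ "<gpx xmlns='http://www.topografix.com/GPX/1/1' version='1.1'>"
  , "<trk><trkseg>"
  , "<trkpt lat='47.64' lon='-122.32'><ele>4.46</ele><time>2009-10-17T18:37:26Z</time></trkpt>"
  , "<trkpt lat='47.65' lon='-122.33'><ele>4.94</ele><time>2009-10-17T18:37:31Z</time></trkpt>"
  , "</trkseg></trk></gpx>"
  ]

main :: IO ()
main = mapM_ print (gpxTrkpts (fromDocument (parseText_ def sample)))
```

Running this should print two Trkpt records with all four fields filled in. Changing the namespace in `gpx` back to .../GPX/1/0 should give an empty list, since none of the elements match.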

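Answer 1 (score: 2)

@duplode has identified the problem. Here are a few additional comments.

  1. How about using the gpx-conduit package?

  2. Here is some code that can help debug parsing problems: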

    {-# LANGUAGE OverloadedStrings #-}
    module Lib2 where

    import qualified Data.Text           as T
    import Data.Text (Text)
    import Text.XML
    import Text.XML.Cursor
    import qualified Filesystem.Path.CurrentOS as Path
    import Control.Monad

    -- Render a node as a one-line summary for debugging.
    showNode :: Node -> String
    showNode (NodeElement e)     = "NodeElement " ++ T.unpack (nameLocalName $ elementName e)
    showNode (NodeInstruction _) = "NodeInstruction ..."
    showNode (NodeContent t)     = "NodeContent " ++ show t
    showNode (NodeComment _)     = "NodeComment"

    -- Run an axis against sample.xml and print every node it selects.
    -- Path.decodeString comes from system-filepath; on xml-conduit versions
    -- where readFile takes a plain FilePath, pass "sample.xml" directly.
    testParser :: (Cursor -> [Cursor]) -> IO ()
    testParser parser = do
      doc <- Text.XML.readFile def (Path.decodeString "sample.xml")
      let nodes = map node $ parser (fromDocument doc)
      forM_ nodes $ \n -> putStrLn (showNode n)


    Using it in ghci:

    ghci> :set -XOverloadedStrings
    ghci> :l Lib2
    Lib2> testParser child
    NodeContent "\n  "
    NodeElement metadata
    NodeContent "\n  "
    NodeElement trk
    NodeContent "\n  "
    NodeElement extensions
    NodeContent "\n"

    Lib2> testParser $ child >=> element "trk"
    Lib2> testParser $ child >=> laxElement "trk"
    NodeElement trk

    Lib2> testParser $ child >=> laxElement "trk" >=> child >=> laxElement "trkseg"
    NodeElement trkseg
    Lib2> testParser $ child >=> laxElement "trk" >=> child >=> laxElement "trkseg" >=> child >=> laxElement "trkpt"
    NodeElement trkpt
    NodeElement trkpt
    NodeElement trkpt
    NodeElement trkpt
    Lib2>
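
The session above shows element "trk" selecting nothing while laxElement "trk" finds the element. The following sketch isolates that difference on an inline document. The `sample` text and the `countMatches` helper are assumptions introduced here, not part of the answer.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Text.Lazy  as TL
import           Text.XML        (def, parseText_)
import           Text.XML.Cursor

-- Invented one-line GPX skeleton with a default namespace.
sample :: TL.Text
sample = "<gpx xmlns='http://www.topografix.com/GPX/1/1'><trk><trkseg/></trk></gpx>"

-- Hypothetical helper: how many cursors does an axis select from the root?
countMatches :: Axis -> Int
countMatches axis = length (axis (fromDocument (parseText_ def sample)))

main :: IO ()
main = do
  -- Bare name: no namespace, so it does not equal the namespaced trk.
  print (countMatches (child >=> element "trk"))
  -- Clark notation "{namespace}local" gives the fully qualified name.
  print (countMatches (child >=> element "{http://www.topografix.com/GPX/1/1}trk"))
  -- laxElement compares only the local name and ignores the namespace.
  print (countMatches (child >=> laxElement "trk"))
```

This should print 0, 1 and 1. laxElement is convenient for exploring a document in ghci, while the fully qualified name is the stricter choice once the namespace is known.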