I want to use xml-conduit to parse GPX files. So far I have the following:
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative
import Data.Text as T
import Text.XML
import Text.XML.Cursor

data Trkpt = Trkpt {
    trkptLat :: Text,
    trkptLon :: Text,
    trkptEle :: Text,
    trkptTime :: Text
    } deriving (Show)

trkptsFromFile path =
    gpxTrkpts . fromDocument <$> Text.XML.readFile def path

gpxTrkpts =
    child >=> element "{http://www.topografix.com/GPX/1/0}trk" >=>
    child >=> element "{http://www.topografix.com/GPX/1/0}trkseg" >=>
    child >=> element "{http://www.topografix.com/GPX/1/0}trkpt" >=>
    child >=> \e -> do
        let ele = T.concat $ element "{http://www.topografix.com/GPX/1/0}ele" e >>= descendant >>= content
        let time = T.concat $ element "{http://www.topografix.com/GPX/1/0}time" e >>= descendant >>= content
        let lat = T.concat $ attribute "lat" e
        let lon = T.concat $ attribute "lon" e
        return $ Trkpt lat lon ele time
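A sample GPX file is here.
Although all the data in the original GPX file is valid, I get strange results: the parsed text fields are mostly empty, with only a few scattered actual values. When an actual value is present, it appears in only one field of the record.
I'm fairly sure I'm not using the xml-conduit API correctly. What am I doing wrong?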
Answer 0 (score: 2)
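There are two problems. First, there is a typo in the namespace; it should be http://www.topografix.com/GPX/1/1. Second, your final Kleisli arrow (\e -> do -- etc.) operates on the children of each trkpt element rather than on the trkpt element itself. It therefore builds one record per child, and each record can have at most one field filled in: the lat and lon attributes live on trkpt, not on its children, and a given child is either ele or time but never both. Here is a gpxTrkpts that should do what you want: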
gpxTrkpts =
    child >=> element "{http://www.topografix.com/GPX/1/1}trk" >=>
    child >=> element "{http://www.topografix.com/GPX/1/1}trkseg" >=>
    child >=> element "{http://www.topografix.com/GPX/1/1}trkpt" >=>
    \e -> do
        let cs = child e
            ele = T.concat $ cs >>= element "{http://www.topografix.com/GPX/1/1}ele" >>= descendant >>= content
            time = T.concat $ cs >>= element "{http://www.topografix.com/GPX/1/1}time" >>= descendant >>= content
            lat = T.concat $ attribute "lat" e
            lon = T.concat $ attribute "lon" e
        return $ Trkpt lat lon ele time
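To make the second point concrete, here is a minimal sketch that runs both shapes of the final arrow against a small in-memory document. The document `sampleGpx`, the helper `textOf`, and the names `toTrkpts`, `perChild`, and `perTrkpt` are made up for this illustration and are not part of the question or of xml-conduit; the sample is written without whitespace between elements so that every child node is an element.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Control.Monad ((>=>))
import           Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL
import           Text.XML
import           Text.XML.Cursor

data Trkpt = Trkpt
  { trkptLat  :: Text
  , trkptLon  :: Text
  , trkptEle  :: Text
  , trkptTime :: Text
  } deriving (Show, Eq)

-- Hypothetical one-point GPX 1.1 document (an assumption for this sketch).
sampleGpx :: TL.Text
sampleGpx = TL.concat
  [ "<gpx xmlns=\"http://www.topografix.com/GPX/1/1\" version=\"1.1\" creator=\"sketch\">"
  , "<trk><trkseg>"
  , "<trkpt lat=\"45.0\" lon=\"7.0\"><ele>250.5</ele><time>2015-01-01T10:00:00Z</time></trkpt>"
  , "</trkseg></trk></gpx>"
  ]

root :: Cursor
root = fromDocument (parseText_ def sampleGpx)

eleName, timeName :: Name
eleName  = "{http://www.topografix.com/GPX/1/1}ele"
timeName = "{http://www.topografix.com/GPX/1/1}time"

-- Walk from the root element down to each trkpt element.
toTrkpts :: Cursor -> [Cursor]
toTrkpts =
  child >=> element "{http://www.topografix.com/GPX/1/1}trk" >=>
  child >=> element "{http://www.topografix.com/GPX/1/1}trkseg" >=>
  child >=> element "{http://www.topografix.com/GPX/1/1}trkpt"

-- Concatenated text of all elements called `name` among the given cursors.
textOf :: Name -> [Cursor] -> Text
textOf name cs = T.concat (cs >>= element name >>= descendant >>= content)

-- Shape used in the question: the extra `child >=>` means `e` is an ele or
-- time element, so one record is built per child of trkpt.
perChild :: Cursor -> [Trkpt]
perChild = toTrkpts >=> child >=> \e ->
  [ Trkpt (T.concat (attribute "lat" e)) (T.concat (attribute "lon" e))
          (textOf eleName [e]) (textOf timeName [e]) ]

-- Shape used in this answer: `e` is the trkpt itself, and its children are
-- searched for ele and time.
perTrkpt :: Cursor -> [Trkpt]
perTrkpt = toTrkpts >=> \e ->
  [ Trkpt (T.concat (attribute "lat" e)) (T.concat (attribute "lon" e))
          (textOf eleName (child e)) (textOf timeName (child e)) ]
```

In ghci, `perChild root` gives two records for the single track point, one with only the elevation filled in and one with only the time, while `perTrkpt root` gives one record with all four fields. With a real file that has whitespace between elements, the per-child version would also emit an all-empty record for every whitespace text node.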
Answer 1 (score: 2)
@duplode has pinpointed the problem. Here are a few additional remarks.
Here is some code that can help when debugging parsing problems:
Code:
{-# LANGUAGE OverloadedStrings #-}
module Lib2 where

import qualified Data.Text as T
import Data.Text (Text)
import Text.XML
import Text.XML.Cursor
import qualified Filesystem.Path.CurrentOS as Path
import Control.Monad

-- Render a node as a short one-line description.
showNode (NodeElement e) = "NodeElement " ++ T.unpack (nameLocalName $ elementName e)
showNode (NodeInstruction _) = "NodeInstruction ..."
showNode (NodeContent t) = "NodeContent " ++ show t
showNode (NodeComment _) = "NodeComment"

-- Run an axis against sample.xml and print every node it selects.
-- Path.decodeString is needed only when Text.XML.readFile expects a
-- system-filepath path; if your xml-conduit version takes a plain String
-- path, pass "sample.xml" directly and drop the Path import.
testParser parser = do
    doc <- Text.XML.readFile def (Path.decodeString "sample.xml")
    let nodes = map node $ parser (fromDocument doc)
    forM_ nodes $ \n -> putStrLn (showNode n)
Using it in ghci:
ghci> :set -XOverloadedStrings
ghci> :l Lib2
Lib2> testParser child
NodeContent "\n "
NodeElement metadata
NodeContent "\n "
NodeElement trk
NodeContent "\n "
NodeElement extensions
NodeContent "\n"
Lib2> testParser $ child >=> element "trk"
Lib2> testParser $ child >=> laxElement "trk"
NodeElement trk
Lib2> testParser $ child >=> laxElement "trk" >=> child >=> laxElement "trkseg"
NodeElement trkseg
Lib2> testParser $ child >=> laxElement "trk" >=> child >=> laxElement "trkseg" >=> child >=> laxElement "trkpt"
NodeElement trkpt
NodeElement trkpt
NodeElement trkpt
NodeElement trkpt
Lib2>
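The session shows `element "trk"` selecting nothing while `laxElement "trk"` finds the element. As far as I understand the API, `element` compares the full name, namespace included, whereas `laxElement` ignores namespace and case. The sketch below reproduces that difference on an in-memory document, so it does not need sample.xml. The document `sampleGpx` and the helper `matches` are made up for this illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Control.Monad ((>=>))
import qualified Data.Text.Lazy as TL
import           Text.XML
import           Text.XML.Cursor

-- Hypothetical minimal document with the GPX 1.1 default namespace.
sampleGpx :: TL.Text
sampleGpx =
  "<gpx xmlns=\"http://www.topografix.com/GPX/1/1\"><trk><trkseg/></trk></gpx>"

root :: Cursor
root = fromDocument (parseText_ def sampleGpx)

-- Number of cursors an axis selects, starting from the root element.
matches :: Axis -> Int
matches axis = length (axis root)

bareName, qualifiedName, laxName :: Int
-- "trk" as a Name has no namespace, so it does not equal the namespaced element.
bareName      = matches (child >=> element "trk")
-- The {namespace}local form builds the fully qualified Name.
qualifiedName = matches (child >=> element "{http://www.topografix.com/GPX/1/1}trk")
-- laxElement compares local names only.
laxName       = matches (child >=> laxElement "trk")
```

Here `bareName` is 0 while `qualifiedName` and `laxName` are both 1. `laxElement` is convenient for exploring a document in ghci; the fully qualified name is the stricter choice once you know which namespace the file uses.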