Is factoring an arrow out of arrow do-notation a valid transformation?

Asked: 2014-02-24 18:24:17

Tags: haskell arrows hxt

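I'm trying to get my head around HXT, a Haskell library for parsing XML that uses arrows. For my specific use case I'd rather not use deep, because there are cases where <outer_tag><payload_tag>value</payload_tag></outer_tag> is distinct from <outer_tag><inner_tag><payload_tag>value</payload_tag></inner_tag></outer_tag>. However, I ran into some weirdness: something that feels like it should work, but doesn't.

I've managed to come up with a test case based on an example from the documentation: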

{-# LANGUAGE Arrows, NoMonomorphismRestriction #-}
module Main where

import Text.XML.HXT.Core

data Guest = Guest { firstName, lastName :: String }
  deriving (Show, Eq)


getGuest = deep (isElem >>> hasName "guest") >>> 
  proc x -> do
    fname <- getText <<< getChildren <<< deep (hasName "fname") -< x
    lname <- getText <<< getChildren <<< deep (hasName "lname") -< x
    returnA -< Guest { firstName = fname, lastName = lname }

getGuest' = deep (isElem >>> hasName "guest") >>> 
  proc x -> do
    fname <- getText <<< getChildren <<< (hasName "fname") <<< getChildren -< x
    lname <- getText <<< getChildren <<< (hasName "lname") <<< getChildren -< x
    returnA -< Guest { firstName = fname, lastName = lname }

getGuest'' = deep (isElem >>> hasName "guest") >>> getChildren >>>
  proc x -> do
    fname <- getText <<< getChildren <<< (hasName "fname") -< x
    lname <- getText <<< getChildren <<< (hasName "lname") -< x
    returnA -< Guest { firstName = fname, lastName = lname }


driver finalArrow = runX (readDocument [withValidate no] "guestbook.xml" >>> finalArrow)

main = do 
  guests <- driver getGuest
  print "getGuest"
  print guests

  guests' <- driver getGuest'
  print "getGuest'"
  print guests'

  guests'' <- driver getGuest''
  print "getGuest''"
  print guests''

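Between getGuest and getGuest', I expand deep into the correct number of getChildren calls. The resulting function still works. I then factor the getChildren out of the do block, but this causes the resulting function to fail. The output is: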

"getGuest"
[Guest {firstName = "John", lastName = "Steinbeck"},Guest {firstName = "Henry", lastName = "Ford"},Guest {firstName = "Andrew", lastName = "Carnegie"},Guest {firstName = "Anton", lastName = "Chekhov"},Guest {firstName = "George", lastName = "Washington"},Guest {firstName = "William", lastName = "Shakespeare"},Guest {firstName = "Nathaniel", lastName = "Hawthorne"}]
"getGuest'"
[Guest {firstName = "John", lastName = "Steinbeck"},Guest {firstName = "Henry", lastName = "Ford"},Guest {firstName = "Andrew", lastName = "Carnegie"},Guest {firstName = "Anton", lastName = "Chekhov"},Guest {firstName = "George", lastName = "Washington"},Guest {firstName = "William", lastName = "Shakespeare"},Guest {firstName = "Nathaniel", lastName = "Hawthorne"}]
"getGuest''"
[]

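I feel like this should be a valid transformation, but my understanding of arrows is a little shaky. Am I doing something wrong? Is this a bug that I should report?

I'm using HXT version 9.3.1.3 (the latest at the time of writing). ghc --version prints "The Glorious Glasgow Haskell Compilation System, version 7.4.1". I've also tested on a box with ghc 7.6.3 and got the same result.

The XML file has the following repeating structure (the full file can be found here):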

<guestbook>
  <guest>
    <fname>John</fname>
    <lname>Steinbeck</lname>
  </guest>
  <guest>
    <fname>Henry</fname>
    <lname>Ford</lname>
  </guest>
  <guest>
    <fname>Andrew</fname>
    <lname>Carnegie</lname>
  </guest>
</guestbook>

2 answers:

Answer 0 (score: 3)

In getGuest'' you have

... (hasName "fname") -< x
... (hasName "lname") -< x

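That is, you restrict to the case where x is "fname" and x is "lname", which no x satisfies.

Answer 1 (score: 2)

I've managed to work out the specific reason why the construct is interpreted the way it is. The following arrow translation, found here, provides a basis to work from: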
addA :: Arrow a => a b Int -> a b Int -> a b Int
addA f g = proc x -> do
                y <- f -< x
                z <- g -< x
                returnA -< y + z

becomes:

addA :: Arrow a => a b Int -> a b Int -> a b Int
addA f g = arr (\ x -> (x, x)) >>>
           first f >>> arr (\ (y, x) -> (x, y)) >>>
           first g >>> arr (\ (z, y) -> y + z)
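As a quick sanity check on this translation, here is a minimal sketch that runs both forms on plain functions, which are also arrows. The names addProc and addPlain are only illustrative renamings of the two addA definitions above, so that both can live in one file.

```haskell
{-# LANGUAGE Arrows #-}
import Control.Arrow

-- The proc/do form (addA before translation; renamed for illustration).
addProc :: Arrow a => a b Int -> a b Int -> a b Int
addProc f g = proc x -> do
  y <- f -< x
  z <- g -< x
  returnA -< y + z

-- The hand-translated form (addA after translation; renamed for illustration).
addPlain :: Arrow a => a b Int -> a b Int -> a b Int
addPlain f g =
  arr (\x -> (x, x)) >>>
  first f >>> arr (\(y, x) -> (x, y)) >>>
  first g >>> arr (\(z, y) -> y + z)
```

With ordinary functions as the arrow, addProc (+ 1) (* 2) 3 and addPlain (+ 1) (* 2) 3 both evaluate to 10. The two forms agree here because a function always produces exactly one result; the difference discussed in this answer only shows up for arrows that can produce zero or many results.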

By analogy, we can derive:

getGuest''' = preproc >>>
           arr (\ x -> (x, x)) >>>
           first f >>> arr (\ (y, x) -> (x, y)) >>>
           -- after first g, z is the lname produced by g and y is the fname produced by f
           first g >>> arr (\ (z, y) -> Guest {firstName = y, lastName = z})
  where
    preproc = deep (isElem >>> hasName "guest") >>> getChildren
    f = getText <<< getChildren <<< hasName "fname"
    g = getText <<< getChildren <<< hasName "lname"

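In HXT, arrows can be thought of as streams of values flowing through filters. arr (\x->(x,x)) does not "split the stream" as I had hoped. Instead, it creates a stream of tuples that is filtered by f, and the survivors are then filtered by g. Since f and g are mutually exclusive on a single child node, there are no survivors.

The example with getChildren on the inside miraculously works because the stream of tuples contains values from further up in the XML document, such as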

<guest>
    <fname>John</fname>
    <lname>Steinbeck</lname>
</guest>

so f and g are not mutually exclusive there.
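The behaviour can be reproduced without HXT. The sketch below is a simplified model, not HXT's actual implementation: ListA is a hypothetical stand-in for a list arrow (a function returning zero or more results), Node is a toy element type that stores its text directly instead of in a child text node, and children, named, content and both are illustrative helpers that loosely play the roles of getChildren, hasName, getText and the desugared do block.

```haskell
import Prelude hiding (id, (.))
import Control.Category (Category (..))
import Control.Arrow

-- Toy element: tag name, text payload, child elements (assumed shape).
data Node = Node { tag :: String, text :: String, kids :: [Node] }

-- Hypothetical list arrow: each input yields zero or more outputs.
newtype ListA a b = ListA { runListA :: a -> [b] }

instance Category ListA where
  id = ListA (\x -> [x])
  (ListA g) . (ListA f) = ListA (\x -> concatMap g (f x))

instance Arrow ListA where
  arr f = ListA (\x -> [f x])
  -- A tuple survives only if f yields a result for its first component.
  first (ListA f) = ListA (\(x, c) -> [(y, c) | y <- f x])

children :: ListA Node Node
children = ListA kids

named :: String -> ListA Node Node
named n = ListA (\t -> [t | tag t == n])

content :: ListA Node String
content = ListA (\t -> [text t])

-- Same shape as the desugared do block: duplicate the input, then filter by f, then by g.
both :: ListA a b -> ListA a c -> ListA a (b, c)
both f g =
  arr (\x -> (x, x)) >>>
  first f >>> arr (\(y, x) -> (x, y)) >>>
  first g >>> arr (\(z, y) -> (y, z))

guest :: Node
guest = Node "guest" "" [Node "fname" "John" [], Node "lname" "Steinbeck" []]

-- children factored out: every tuple holds one child, which cannot be both fname and lname.
outside :: ListA Node (String, String)
outside = children >>> both (named "fname" >>> content) (named "lname" >>> content)

-- children kept inside: every tuple holds the whole guest, so both lookups succeed.
inside :: ListA Node (String, String)
inside = both (children >>> named "fname" >>> content) (children >>> named "lname" >>> content)
```

In this model, runListA outside guest is [] while runListA inside guest is [("John","Steinbeck")], which matches the difference between getGuest'' and getGuest' in the question.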