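Using the Haskell package hxt still feels a bit strange to me. The arrow notation and the result types in particular seem like magic.
So far I have not managed the following: I want to process an XML file that consists mainly of two parts. The first part holds the definitions of objects, the second holds the usage/purpose of those objects. I want to write some hxt processing that first builds a Haskell data structure from part 1, then one from part 2, and finally combine the two data structures in the actual logic of the program.
In general, processing the file works fine now, thanks to the arrows tutorial. But I now want to do three steps: read the document (lazily), process the resulting structure once with a first processor, and then process the same structure again with a second processor. What I do not want is to call "readDocument" twice, as in the following example.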

import Text.XML.HXT.Core
import Data.Char(toUpper)
import Data.Tree.NTree.TypeDefs

play filename = do
  results <- runX (getAllAddresses filename)
  results2 <- runX (getAllAddressesUsages filename)
  print results
  print results2

getAllAddresses :: FilePath -> IOSArrow XmlTree [(String,NTree XNode)]
getAllAddresses filename =
  readDocument [withValidate no] filename >>>
  getChildren >>>
  isElem >>> hasName "main" >>>
  getChildren >>>
  isElem >>> hasName "part1" >>>
  getChildren >>>
  isElem >>> hasName "address" >>>
  listA(getAddress) -- create a list for each variable, so use listA

getAddress :: IOSArrow XmlTree (String,NTree XNode)
getAddress =
  getChildren >>>
  isElem >>>
  (
    neg ( hasName "location") >>> -- all elements being no "location"
    getName &&& (getChildren) -- get the name and the value for each element
  )
  <+>
  (
    hasName "location" >>> -- work on all nodes within the "location" subcontainer
    getChildren >>>
    isElem >>>
    ( getName &&& (getChildren) ) -- get the name and the value for each element
  )

getAllAddressesUsages :: FilePath -> IOSArrow XmlTree [(String,NTree XNode)]
getAllAddressesUsages filename =
  readDocument [withValidate no] filename >>>
  getChildren >>>
  isElem >>> hasName "main" >>>
  getChildren >>>
  isElem >>> hasName "part2" >>>
  getChildren >>>
  listA(getAddressUsagePurpose2) -- create a list for each variable, so use listA

getAddressUsagePurpose2 :: IOSArrow XmlTree (String,NTree XNode)
getAddressUsagePurpose2 =
  hasName "use_obj-names_for_purpose_2" >>> -- work on all nodes with usage 2
  ( getName &&& (getChildren) ) -- get the name and the value for each element

Sample data:

<main>
  <part1>
    <address>
      <obj-name>one</obj-name>
      <name>peter 1</name>
      <street>streetname 1</street>
      <location>
        <country>Germany</country>
        <state>Baden Wuerttemberg</state>
      </location>
    </address>
    <address>
      <obj-name>two</obj-name>
      <name>peter 2</name>
      <street>streetname 2</street>
      <location>
        <country>Germany</country>
        <state>Nordrhein Westfalen</state>
      </location>
    </address>
  </part1>
  <part2>
    <use_obj-names_for_purpose_1>
      <obj-name>two</obj-name>
    </use_obj-names_for_purpose_1>
    <use_obj-names_for_purpose_2>
      <obj-name>two</obj-name>
    </use_obj-names_for_purpose_2>
  </part2>
</main>

So the formal question is:
What would the monadic code in the function "play" have to look like in order to get something like this:

readXmlDocument :: String -> IOSArrow XmlTree (NTree XNode)
readXmlDocument filename = readDocument [withValidate no] filename

play filename = do
  document <- readXmlDocument filename
  allAddresses <- getAllAddresses document
  allPurposes <- getAllAddressesUsages document
  result <- processLogics allAddresses allPurposes
  print result

How do I get from monads to arrows, back to monads, then to plain data, and back to monads again?
Why am I doing this?
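Answer 0 (score: 1)
One solution to the problem is as follows:
Use the Arrows language extension and a "proc" expression to pass the document that was read into two processor paths within one function. The results are combined in a tuple. This tuple still contains two arrows that have to be run, which is done by two applications of the runX function.
What I still do not know for certain is whether this construct loads the file once or twice when both results are combined in a later computation.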

{-# LANGUAGE Arrows #-}
import Text.XML.HXT.Core
import Data.Char(toUpper)
import Data.Tree.NTree.TypeDefs

play filename = (runX addresses, runX usages)
  where (addresses,usages)=(analyseXml (readXmlDocument filename))

analyseXml :: IOSArrow XmlTree (NTree XNode) -> (IOSArrow XmlTree [(String,NTree XNode)],IOSArrow XmlTree String)
analyseXml = proc document -> do
  allAddresses <- getAllAddresses -< document
  allUsages <- getAllAddressesUsages -< document
  returnA -< (allAddresses,allUsages)

readXmlDocument :: String -> IOSArrow XmlTree (NTree XNode)
readXmlDocument filename = readDocument [withValidate no] filename

getAllAddresses :: IOSArrow XmlTree (NTree XNode) -> IOSArrow XmlTree [(String,NTree XNode)]
getAllAddresses document =
  document >>>
  getChildren >>>
  isElem >>> hasName "main" >>>
  getChildren >>>
  isElem >>> hasName "part1" >>>
  getChildren >>>
  isElem >>> hasName "address" >>>
  listA(getAddress) -- create a list for each variable, so use listA

getAddress :: IOSArrow XmlTree (String,NTree XNode)
getAddress =
  getChildren >>>
  isElem >>>
  (
    neg ( hasName "location") >>> -- all elements being no "location"
    getName &&& (getChildren) -- get the name and the value for each element
  )
  <+>
  (
    hasName "location" >>> -- work on all nodes within the "location" subcontainer
    getChildren >>>
    isElem >>>
    ( getName &&& (getChildren) ) -- get the name and the value for each element
  )

getAllAddressesUsages :: IOSArrow XmlTree (NTree XNode) -> IOSArrow XmlTree String
getAllAddressesUsages document =
  document >>>
  getChildren >>>
  isElem >>> hasName "main" >>>
  getChildren >>>
  isElem >>> hasName "part2" >>>
  getChildren >>>
  isElem >>> hasName "use_obj-names_for_purpose_2" >>>
  getChildren >>>
  isElem >>> hasName "obj-name" >>>
  getChildren >>>
  getText -- yield the text of each obj-name; runX collects the results in a list, so no listA here

It can be run as follows:
*Main> snd ( play "../tmp/haskell/test.xml")
["two"]
*Main> fst ( play "../tmp/haskell/test.xml")
[[("obj-name",NTree (XText "one") []),("name",NTree (XText "peter 1") []),("street",NTree (XText "streetname 1") []),("country",NTree (XText "Germany") []),("state",NTree (XText "Baden Wuerttemberg") [])],[("obj-name",NTree (XText "two") []),("name",NTree (XText "peter 2") []),("street",NTree (XText "streetname 2") []),("country",NTree (XText "Germany") []),("state",NTree (XText "Nordrhein Westfalen") [])]]
*Main>
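
On the once-or-twice question: each runX call executes its whole arrow from the start, and both arrows in the tuple begin with the "readDocument" step, so the two runX calls above would be expected to read the file once each. One possible way around this, shown here only as a minimal sketch, is to fan the single document out to both processors with &&& inside one arrow and call runX only once. The names child, field, addressFields, allAddresses, purpose2Names, analyse and the final filtering step are illustrative assumptions, not part of HXT or of the code above, and the values are reduced to their text with getText to keep the output short.

```haskell
import Text.XML.HXT.Core

-- Hypothetical helper (not an HXT function): child elements with a given name.
child :: String -> IOSArrow XmlTree XmlTree
child name = getChildren >>> isElem >>> hasName name

-- Element name paired with the text of its children.
field :: IOSArrow XmlTree (String, String)
field = getName &&& (getChildren >>> getText)

-- Fields of one <address>, with <location> flattened as in getAddress.
addressFields :: IOSArrow XmlTree (String, String)
addressFields =
  getChildren >>> isElem >>>
  (   (neg (hasName "location") >>> field)
  <+> (hasName "location" >>> getChildren >>> isElem >>> field)
  )

-- One list of fields per <address> in part1.
allAddresses :: IOSArrow XmlTree [(String, String)]
allAddresses =
  child "main" >>> child "part1" >>> child "address" >>> listA addressFields

-- The obj-names listed under use_obj-names_for_purpose_2 in part2.
purpose2Names :: IOSArrow XmlTree String
purpose2Names =
  child "main" >>> child "part2" >>> child "use_obj-names_for_purpose_2" >>>
  child "obj-name" >>> getChildren >>> getText

-- readDocument appears once; &&& feeds its result to both processors.
-- Each side is wrapped in listA so that it yields exactly one value.
analyse :: FilePath -> IOSArrow XmlTree ([[(String, String)]], [String])
analyse filename =
  readDocument [withValidate no] filename >>>
  (listA allAddresses &&& listA purpose2Names)

play :: FilePath -> IO ()
play filename = do
  results <- runX (analyse filename)   -- the only runX, so a single read
  case results of
    [(addresses, usages)] -> do
      -- From here on this is plain Haskell data inside IO.
      -- Illustrative logic: keep the addresses whose obj-name is used for purpose 2.
      let used = [ a | a <- addresses
                     , Just n <- [lookup "obj-name" a]
                     , n `elem` usages ]
      print used
    _ -> putStrLn "unexpected number of results"
```

In this shape runX is the only bridge from the arrow back to IO: everything before it is one arrow, and everything after it works on ordinary lists and tuples, which is roughly the structure the question's do-block was aiming for.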