I need a function that runs on child nodes

Asked: 2013-04-22 21:52:43

Tags: haskell hxt

I'm continuing work on the project I asked about here:

How do I make a do block return early?

Using the monad transformer approach, my function looks like this:

scrapePost :: String -> IO ()                                                   
scrapePost url = liftM (fromMaybe ()) . runMaybeT $ do
  doc <- lift $ fromUrl url
  -- get a bunch of stuff from the page 
  -- send it to the db
  replies <- lift . runX $ doc >>> css ".post.reply"
  -- here is the problem
  mapM_ (parseReply url (fromJust page_id)) replies
  -- here is the problem

parseReply is the function I need, but I can't seem to get it right.

Here is my feeble attempt at starting that function:

parseReply :: String -> String -> XNode -> Maybe ()                                
parseReply url op_id reply = do                                                    
  reply_id <- runX $ reply ! "id"                                                     
  return ()                       

BTW, I'm using HandsomeSoup.

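parseReply should work just like scrapePost: use a set of CSS rules to scrape values out of each reply, drop the replies that don't have all the values, and send the rest to the db.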

I want to use mapM because I'd like to later replace every mapM with liftIO and see the performance difference.

[UPDATE]

So it turns out I didn't need any type acrobatics at all. I just needed a way to turn the reply node into a root node, which I found here.

Since parseReply is only ever used in a MaybeT IO () context, it can have that type directly, and scrapePost can stay unchanged.

parseReply becomes:

toRoot :: ArrowXml a => XmlTree -> a n XmlTree                                     
toRoot node = root [] [constA node]                                                

parseReply :: String -> String -> XmlTree -> MaybeT IO ()                          
parseReply url op_id reply = do                                                    
  let node = toRoot reply                                                        
  reply_id <-  lift . liftM (`atMay` 0) $ runX $ node >>> css "div" ! "id"
  guard (isJust reply_id)                                                        
  return ()
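
As a rough, standalone illustration of the re-rooting step, the sketch below parses an inline HTML string (a made-up stand-in for the real page), selects the reply nodes, and then wraps each one with toRoot before querying it again. It assumes the hxt and HandsomeSoup packages are installed; the sample markup and the ids in it are hypothetical. The wrapping presumably helps because css looks at the descendants of the tree it is given, but that is my assumption, not something the HandsomeSoup docs are quoted on here.

```haskell
import Text.XML.HXT.Core
import Text.HandsomeSoup

-- Wrap a node in a fresh root so selectors can reach the node itself.
toRoot :: ArrowXml a => XmlTree -> a n XmlTree
toRoot node = root [] [constA node]

main :: IO ()
main = do
  -- Hypothetical markup standing in for the scraped page.
  let html = "<div class='post reply' id='r1'><p>first</p></div>"
          ++ "<div class='post reply' id='r2'><p>second</p></div>"
      doc  = parseHtml html
  replies <- runX $ doc >>> css ".post.reply"
  -- Query each reply on its own, after re-rooting it.
  ids <- mapM (\r -> runX $ toRoot r >>> css "div" ! "id") replies
  print ids
```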

1 Answer:

Answer 0 (score: 5)

Let's look at the type of your "monad runner":

liftM (fromMaybe ()) . runMaybeT :: MaybeT IO () -> IO ()

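So every statement in the do block must have type MaybeT IO a for some a, and the last one must have type MaybeT IO (). Since mapM_ :: Monad m => (a -> m b) -> [a] -> m (), we need parseReply url (fromJust page_id) to return MaybeT IO (), not just Maybe ().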
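
To make the typing concrete, here is a minimal sketch that needs only the transformers package. checkReply is a hypothetical stand-in for parseReply, and the list of ids is made up. It also shows a consequence worth keeping in mind: inside a single MaybeT block, the first failing element aborts everything after it.

```haskell
import Control.Monad (guard, liftM)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Maybe (MaybeT, runMaybeT)
import Data.Maybe (fromMaybe)

-- Hypothetical stand-in for parseReply: succeeds only on non-empty ids.
checkReply :: String -> MaybeT IO ()
checkReply rid = do
  guard (not (null rid))            -- an empty id fails the whole computation
  lift (putStrLn ("ok: " ++ rid))

-- The same runner as in the question.
run :: MaybeT IO () -> IO ()
run = liftM (fromMaybe ()) . runMaybeT

main :: IO ()
main = run $ do
  -- mapM_ is used at m = MaybeT IO, so checkReply must return MaybeT IO ().
  mapM_ checkReply ["r1", "", "r3"]
  lift (putStrLn "done")
```

Running this prints only "ok: r1": the empty id makes the guard fail, so neither "r3" nor "done" is reached.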

Fortunately, as one would hope, it's easy to inject a pure Maybe into MaybeT IO:

parseReplyT :: Monad m => String -> String -> XNode -> MaybeT m ()
parseReplyT url op_id = MaybeT . return . parseReply url op_id
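
The same trick works for any pure Maybe value. The sketch below, which again needs only the transformers package, factors it into a helper and uses it with a toy parser; liftMaybe, parseId, and process are names invented for this illustration. It also runs each item in its own runMaybeT, which may be closer to "drop the replies that are missing values" than a single MaybeT block, since a failure then skips one item instead of stopping the traversal.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Maybe (MaybeT (..))

-- Lift a pure Maybe into MaybeT over any monad (same idea as parseReplyT).
liftMaybe :: Monad m => Maybe a -> MaybeT m a
liftMaybe = MaybeT . return

-- Hypothetical pure parser: a reply id is valid when it is non-empty.
parseId :: String -> Maybe String
parseId "" = Nothing
parseId s  = Just s

-- Each item gets its own MaybeT run, so one failure skips only that item.
process :: String -> IO (Maybe ())
process raw = runMaybeT $ do
  rid <- liftMaybe (parseId raw)
  lift (putStrLn ("stored " ++ rid))

main :: IO ()
main = mapM_ process ["r1", "", "r3"]
```

This prints "stored r1" and "stored r3"; the empty entry is skipped silently.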