用于构建测试数据的monad

时间:2015-08-23 09:35:39

标签: haskell monads

好的,所以我正在尝试编写一个用于构建测试数据的monad,但我无法按照我想要的方式工作。它看起来像这样:

runBuildM :: [i] -> BuildM i o x -> [o]
-- Given a list of i, build a list of o.

source :: BuildM i o i
-- Fetch unique i.

yield :: o -> BuildM i o ()
-- Return a new o to the caller.

gather :: BuildM i o x -> BuildM i o o
-- Fetch every possible o from sub-computation.

local :: BuildM i o x -> BuildM i o x
-- Isolate any source invocations from the rest of the code.

换句话说,它是一个供应单子,作家单子和列表单子。我的想法是我可以这样写:

build_tests depth = do
  local $ do
    v <- source
    yield v
    yield (map toLower v)
  yield "[]"
  yield "()"
  when (depth > 2) $ do
    t1 <- gather $ build_tests (depth-1)
    yield $ "(" ++ t1 ++ ")"
    yield $ "[" ++ t1 ++ "]"
    t2 <- gather $ build_tests (depth-1)
    yield $ "(" ++ t1 ++ "," ++ t2 ++ ")"

这个想法是生成所有可能的数据组合。你可以使用列表推导来做到这一点,但结果在语法上很糟糕。这很多更具可读性。不幸的是,它实际上并不是工作 ......

问题似乎归结为local函数运行不正常。 意图适用于子计算中的任何source次调用,不会对其产生任何影响。 (即,从source块之外的local后续调用会再次获得第一个令牌。)但是,local 实际实现的操作是重置所有的下一个标记(即包括子计算的内容)。这显然是不正确的,但我不能因为我的生活而改变我的想法如何让它正常工作。

我在使代码按要求工作时遇到这个问题可能意味着我的monad的实际内部表示只是错误。任何人都可以正确地实施这个吗?

编辑:我应该已经意识到这一点,但我实际上没有指定我想要获得的预期结果。上面的代码应该产生这个:

["A", "a", "[]", "()", "(A)", "(a)", "[A]", "[a]", "(A, B)", "(A, b)", "(a, B)", "(a, b)"]

结果恰好按此顺序显示并非超级关键。我希望单个案例出现在复合案例之前,但我并不太确定化合物出现的顺序。规则是相同的变量在任何单个表达式中都不会出现两次。

如果我们允许深度更深一些,我们还会获得诸如

之类的术语
"((A))", "([A])", "[(A)]", "((A, B), C)", "(A, (B, C))"

等等。

它显然已被打破,但这是我到目前为止所拥有的:

newtype BuildM i o x = BuildM ([i] -> SEQ.Seq ([i], SEQ.Seq o, x))

instance Functor (BuildM i o) where
  fmap uf (BuildM sf) =
    BuildM $ \ is0 -> do
      (is1, os, x) <- sf is0
      return (is1, os, uf x)

instance Applicative (BuildM i o) where
  pure x = BuildM $ \ is0 -> return (is0, SEQ.empty, x)

  BuildM sf1 <*> BuildM sf2 =
    BuildM $ \ is1 -> do
      (is2, os2, f) <- sf1 is1
      (is3, os3, x) <- sf2 is2
      return (is3, os2 >< os3, f x)

instance Monad (BuildM i o) where
  return = pure

  BuildM sf1 >>= uf =
    BuildM $ \ is1 -> do
      (is2, os2, x) <- sf1 is1
      let BuildM sf2 = uf x
      (is3, os3, y) <- sf2 is2
      return (is3, os2 >< os3, y)

runBuildM :: [i] -> BuildM i o x -> [o]
runBuildM is0 (BuildM sf) =
  toList $ do
    (is, os, x) <- sf is0
    os

source :: BuildM i o i
source =
  BuildM $ \ is ->
    if null is
      then error "AHC.Tests.TestBuilder.source: end of input"
      else return (tail is, SEQ.empty, head is)

yield :: o -> BuildM i o ()
yield o = BuildM $ \ is -> return (is, SEQ.singleton o, () )

gather :: BuildM i o x -> BuildM i o' o
gather (BuildM sf1) =
  BuildM $ \ is1 -> do
    (is2, os2, _) <- sf1 is1
    o <- os2
    return (is2, SEQ.empty, o)

local :: BuildM i o x -> BuildM i o ()
local (BuildM sf1) =
  BuildM $ \ is1 ->
    let os = do (is2, os2, x) <- sf1 is1; os2
    in  return (is1, os, () )

3 个答案:

答案 0 :(得分:3)

您正在尝试重新发明pipes。您的sourceyield是管道awaityield。您尝试处理的另外两个问题分别是ReaderTWriterT。如果将整个输入列表放在ReaderT的环境中,则可以运行在列表开头重新开始的local子计算。您可以通过添加WriterT图层来收集输出,从子计算中收集所有结果。

对于gather的漂亮语法,您尝试重新创建ListT

管道,读者和作家

我们将在很短的时间内使用以下所有内容。

import Data.Functor.Identity
import Data.Foldable

import Control.Monad
import Control.Monad.Morph
import Control.Monad.IO.Class
import Control.Monad.Trans.Class
import Control.Monad.Trans.Reader hiding (local)
import Control.Monad.Trans.Writer.Strict

import Pipes.Core
import Pipes ((>->))
import qualified Pipes as P
import qualified Pipes.Prelude as P

import Pipes.Lift (runWriterP, runReaderP)

您的构建器是Pipe i o超过Reader [i],允许您在输入的开头重置。我们将定义它的两个版本,BuildT是monad变换器,BuildM是monad。 BuildM是jsut变换器应用于Identity

type BuildT e i o m r = Pipe i o (ReaderT e m) r
type BuildM e i o   r = BuildT e i o Identity r

local运行一个构建器,为其提供从环境读取的整个输入。我们可能希望为此设置一个不同的名称,以避免与local

定义的ReaderT冲突
local :: (Monad m, Foldable f) => BuildT (f i) i o m () -> Proxy a' a () o (ReaderT (f i) m) ()
local subDef = do
    e <- lift ask
    hoist lift $ runReaderP e $
        P.each e >->
        subDef

为了收集子计算的结果,我们利用管道非常纯净的事实,如果你有自然变换forall x. m x -> n x,你可以换掉底层monad。来自管道的代理具有MFunctor实例,该实例提供函数hoist :: (forall x. m x -> n x) -> Proxy a' a b' b m r -> Proxy a' a b' b n r;它允许我们解除管道下的所有底层monad操作,将管道用于另一个变换器,在本例中为WriterT

collect :: (Monad m) => Proxy a' a () b m r -> Proxy a' a c' c m ([b], r)
collect subDef = do
    (r, w) <- runWriterP $
        hoist lift subDef //> \x -> lift $ tell (++[x])
    return (w [], r)

要运行构建器,我们会为其提供环境中的所有输入,提供初始环境,收集结果并运行整个管道。

runBuildT :: (Monad m) => [i] -> BuildT [i] i o m () -> m [o]
runBuildT e = runEffect . fmap fst . collect . runReaderP e . local

运行monad而不是变换器只是

runBuildM :: [i] -> BuildM [i] i o () -> [o]
runBuildM e = runIdentity . runBuildT e

ListT

本节允许我们在生成所有事物组合时使用do - 符号。这相当于使用管道“for代替每个>>=yield来代替每个return

gather来自子计算的所有结果的语法是重新发明ListTListT m a保存Producer a m ()只返回下游数据。从上游获取数据并向下游返回数据的管道不适合Producer b m ()。这需要一些转换。

我们可以将同时具有上游和下游接口的Proxy转换为仅具有上游接口的另一个代理的下游接口的request。为此,我们将底层monad提升到新的内部上游代理中,然后用从内部上游代理提升的request替换外部下游代理中的所有floatRespond :: (Monad m) => Proxy a' a b' b m r -> Proxy c' c b' b (Proxy a' a d' d m) r floatRespond = (lift . request >\\) . hoist lift

ListT

这些可以转换为gather :: (Monad m) => Proxy a' a () b m r -> P.ListT (Proxy a' a c' c m) b gather = P.Select . floatRespond . (>>= return . const ()) 。我们将丢弃任何返回的数据以获得更多的多态类型。

ListT

使用mplus使用起来有点麻烦;您需要在return之间ListT才能获得两个输出。将代理推送到lift . yield通常很方便,因此您可以return代替ListT。我们将放弃所有lift . yield.结果,依赖于来自just runs a枚举enumerate = P.runListT ListT`的输出,包含所有结果

source

实施例

我们现在有能力编写并运行您的示例。我的意思是yield从源获取一个值,而source = P.await yield = P.yield 返回一个值。如果您不需要一次获得一个值,那么您的问题就会被过度指定,而且这个答案是过度的。

gather

在我们使用enumerate构建列表的示例中,我们使用lift . yield运行代码的这一部分,并使用import Data.Char build_tests :: Monad m => Int -> BuildT [String] String String m () build_tests depth = do local $ do v <- source yield $ v yield $ (map toLower v) yield "[]" yield "()" when (depth > 2) $ enumerate $ do t1 <- gather $ build_tests (depth-1) lift . yield $ "(" ++ t1 ++ ")" lift . yield $ "[" ++ t1 ++ "]" t2 <- gather $ build_tests (depth-1) lift . yield $ "(" ++ t1 ++ "," ++ t2 ++ ")" 生成结果。

["A", "B"]

如果我们使用输入source运行此示例,则永远不会使用“B”输入,因为local仅在main = do putStrLn "Depth 2" print =<< runBuildT ["A", "B"] (build_tests 2) putStrLn "Depth 3" print =<< runBuildT ["A", "B"] (build_tests 3) 内使用过一次。

["A","a","[]","()"]
Depth 3
["A","a","[]","()","(A)","[A]","(A,A)","(A,a)","(A,[])","(A,())","(a)","[a]","(a,A)","(a,a)","(a,[])","(a,())","([])","[[]]","([],A)","([],a)","([],[])","([],())","(())","[()]","((),A)","((),a)","((),[])","((),())"]

深度小于4的输出小到足以在此重复。

source

这可能有点矫枉过正

我怀疑你可能意味着source = gather P.cat yield = P.yield 从源头获取所有内容。

enumerate

如果我们将此用作示例,而不是从源代码中获取单个项目,我们将local阻止第一个return块,并通过ListT build_tests :: Monad m => Int -> BuildT [String] String String m () build_tests depth = do local $ enumerate $ do v <- source lift . yield $ v lift . yield $ (map toLower v) yield "[]" yield "()" when (depth > 2) $ enumerate $ do t1 <- gather $ build_tests (depth-1) lift . yield $ "(" ++ t1 ++ ")" lift . yield $ "[" ++ t1 ++ "]" t2 <- gather $ build_tests (depth-1) lift . yield $ "(" ++ t1 ++ "," ++ t2 ++ ")" 生成结果}。

Depth 2
["A","a","B","b","[]","()"]
Depth 3
["A","a","B","b","[]","()","(A)","[A]","(A,A)","(A,a)","(A,B)","(A,b)","(A,[])","(A,())","(a)","[a]","(a,A)","(a,a)","(a,B)","(a,b)","(a,[])","(a,())","(B)","[B]","(B,A)","(B,a)","(B,B)","(B,b)","(B,[])","(B,())","(b)","[b]","(b,A)","(b,a)","(b,B)","(b,b)","(b,[])","(b,())","([])","[[]]","([],A)","([],a)","([],B)","([],b)","([],[])","([],())","(())","[()]","((),A)","((),a)","((),B)","((),b)","((),[])","((),())"]

当我们运行带有两个源的示例时,这将使用两个源值。

ListT (ReaderT [i] m) o

如果您从未从源获取单个值,则可以使用mplus代替。您可能仍然需要代理以避免弄乱{{1}}。

答案 1 :(得分:2)

如果my other answer过度,则延续monad变换器提供了构造任何MonadPlus值的便捷方法。

延续monad让我们可以轻松捕捉到做出mplus尚未知的余数的想法。

import Control.Monad
import Control.Monad.Trans.Cont

once :: MonadPlus m => m a -> ContT a m ()
once m = ContT $ \k -> m `mplus` k ()

产生一个结果只是返回一次。

yield :: MonadPlus m => a -> ContT a m ()
yield = once . return

我们可以通过在最后粘贴mzero来收集所有结果。

gather :: MonadPlus m => ContT a m r -> m a
gather m = runContT m (const mzero)

您的示例是根据yieldgatheroncelift编写的。

import Data.Char

import Control.Monad.Trans.Class

build_tests :: MonadPlus m => m String -> Int -> ContT String m ()
build_tests source = go
  where
    go depth = do
      once . gather $ do
        v <- lift source
        yield v
        yield (map toLower v)
      yield "[]"
      yield "()"
      when (depth > 2) $ do
        t1 <- lift . gather $ go (depth-1)
        yield $ "(" ++ t1 ++ ")"
        yield $ "[" ++ t1 ++ "]"
        t2 <- lift . gather $ go (depth-1)
        yield $ "(" ++ t1 ++ "," ++ t2 ++ ")"

main = print . gather $ build_tests ["A", "B"] 3

这输出以下内容:

Depth 2
["A","a","B","b","[]","()"]
Depth 3
["A","a","B","b","[]","()","(A)","[A]","(A,A)","(A,a)","(A,B)","(A,b)","(A,[])","(A,())","(a)","[a]","(a,A)","(a,a)","(a,B)","(a,b)","(a,[])","(a,())","(B)","[B]","(B,A)","(B,a)","(B,B)","(B,b)","(B,[])","(B,())","(b)","[b]","(b,A)","(b,a)","(b,B)","(b,b)","(b,[])","(b,())","([])","[[]]","([],A)","([],a)","([],B)","([],b)","([],[])","([],())","(())","[()]","((),A)","((),a)","((),B)","((),b)","((),[])","((),())"]

为了简单起见,我冒昧地摆脱了从环境中读取原始资源的要求。您可以将ReaderT添加到变换器堆栈以将其恢复。我也没有为你选择一个列表变换器,例子是使用普通的列表monad运行。由于它是以MonadPlus编写的,因此它也适用于任何(MonadTrans t, MonadPlus (t m)) => t m

答案 2 :(得分:2)

您正在尝试重新发明pipes和一些nice syntax for building lists。这个问题比你描述它的方式简单得多。字符串的来源可以与构建结构完全分开。

您希望生成从某个源绘制符号的结构。不用担心来源,让我们构建结构。每个结构都是一个Pipe,它将从一些源和yield字符串中抽取来连接在一起构建表达式。

import Data.Char

import Data.Functor.Identity

import Pipes.Core
import Pipes ((>->))
import qualified Pipes as P
import qualified Pipes.Prelude as P

build_structures :: Int -> [Pipe String String Identity ()]
build_structures depth = gather $ do
    yield $ P.take 1
    yield $ P.map (map toLower) >-> P.take 1
    when (depth > 2) $ do
        t1 <- lift $ build_structures (depth - 1)
        yield $ P.yield "(" >> t1 >> P.yield ")"
        yield $ P.yield "[" >> t1 >> P.yield "]"
        t2 <- lift $ build_structures (depth - 1)
        yield $ P.yield "(" >> t1 >> P.yield "," >> t2 >> P.yield ")"

此代码使用延续答案中的ContT yield trick

我们通过输入符号并连接结果来运行其中一个结构。

run :: Pipe String String Identity () -> String
run p = concat . P.toList $ P.each symbols >-> p

-- an infinite source of unique symbols
symbols :: [String]
symbols = drop 1 symbols'
    where
        symbols' = [""] ++ do
            tail <- symbols'
            first <- ['A'..'Z']
            return (first : tail)

示例生成所需的字符串。我将生成两个特殊情况"[]""()",它们不会以递归术语出现,作为练习。

import Data.Functor

main = do
    putStrLn "Depth 2"
    print $ run <$> build_structures 2
    putStrLn "Depth 3"
    print $ run <$> build_structures 3
    putStrLn "Depth 4"
    print $ run <$> build_structures 4

这导致

Depth 2
["A","a"]
Depth 3
["A","a","(A)","[A]","(A,B)","(A,b)","(a)","[a]","(a,B)","(a,b)"]
Depth 4
["A","a","(A)","[A]","(A,B)","(A,b)","(A,(B))","(A,[B])","(A,(B,C))","(A,(B,c))","(A,(b))","(A,[b])","(A,(b,C))","(A,(b,c))","(a)","[a]","(a,B)","(a,b)","(a,(B))","(a,[B])","(a,(B,C))","(a,(B,c))","(a,(b))","(a,[b])",...