使用递归方案进行表达式扩展

时间:2017-03-16 03:33:23

标签: haskell recursion-schemes

我有一个表示算术表达式的数据类型:

data E = Add E E | Mul E E | Var String

我想编写一个扩展函数,它将表达式转换为变量乘积之和(大括号扩展)。当然使用递归方案。

我只能按照“进步和保存”的精神来思考算法。每个步骤的算法构造完全展开的术语,因此无需重新检查。

Mul的处理让我发疯,所以我没有直接使用同形类型的[[String]],而是利用已经实现的concatconcatMap我:

type Poly = [Mono]
type Mono = [String]

mulMonoBy :: Mono -> Poly -> Poly
mulMonoBy x = map (x ++)

mulPoly :: Poly -> Poly -> Poly
mulPoly x = concatMap (flip mulMonoBy x)

那么我只使用cata

expandList :: E -> Poly
expandList = cata $ \case
   Var x -> [[x]]
   Add e1 e2 = e1 ++ e2
   Mul e1 e2 = mulPoly e1 e2

转换回来:

fromPoly :: Poly -> Expr
fromPoly = foldr1 Add . map fromMono where
   fromMono = foldr1 Mul . map Var

是否有明显更好的方法?

更新:很少有混淆。

  1. 该解决方案允许多行变量名称。 Add (Val "foo" (Mul (Val "foo) (Var "bar")))代表foo + foo * bar。我没有用x*y*z或其他东西代表Val "xyz"。请注意,因为没有标量重复变量,例如" foo * foo * quux"是完全允许的。

  2. 通过产品的总和,我的意思是"咖喱" n-ary产品总和。产品总和的简明定义是我想要一个没有任何括号的表达式,所有的parens都由关联性和优先级表示。

  3. 所以(foo * bar + bar) + (foo * bar + bar)不是产品的总和,因为中间+是总和的总和

    (foo * bar + (bar + (foo * bar + bar)))或相应的左关联版本是正确答案,但我们必须保证关联性始终保持正确。因此,右对偶解决方案的正确类型是

    data Poly = Sum Mono Poly
              | Product Mono
    

    与非空列表同构:NonEmpty Poly(注意Sum Mono Poly而不是Sum Poly Poly)。如果我们允许空金额或产品,那么我们只得到我使用的列表表示列表。

    1. 你也不关心表现,乘法似乎只是liftA2 (++)

2 个答案:

答案 0 :(得分:1)

我不是递归方案的专家,但是因为听起来你正在尝试练习它们,所以希望你不会觉得将使用手动递归的解决方案转换为使用递归方案的解决方案过于繁琐。我将首先用混合散文和代码编写它,并在最后再次包含完整代码,以便更简单地复制/粘贴。

使用简单的属性和一些递归代数来做起来并不困难。然而,在我们开始之前,让我们定义一个更好的结果类型,保证我们只能代表产品的总和:

data Poly term = Sum (Poly term) (Poly term)
               | Product (Mono term) 
               deriving Show

data Mono term = Term term
               | MonoMul (Mono term) (Mono term)
               deriving Show

这样我们就不会搞砸并意外地产生错误的结果,如

(Mul (Var "x") (Add (Var "y") (Var "z")))

现在,让我们写下我们的功能。

expand :: E -> Poly String

首先,一个基本案例:扩展Var是微不足道的,因为它已经是产品总和形式。但我们必须将它转换为适合我们的Poly结果类型:

expand (Var x) = Product (Term x)

接下来,请注意扩展添加很容易:只需展开两个子表达式,然后将它们一起添加。

expand (Add x y) = Sum (expand x) (expand y)

乘法怎么样?这有点复杂,因为

Product (expand x) (expand y)

是错误的类型:我们不能将多项式乘以多项式。但我们确实知道如何通过分配规则来进行代数操作,将多项式的乘法转换为单项式乘法的和。在您的问题中,我们需要一个函数mulPoly。但是,让我们假设它存在,并在以后实现它。

expand (Mul x y) = mulPoly (expand x) (expand y)

处理所有情况,所以剩下的就是通过在两个多项式上分布乘法来实现mulPoly。条款。我们一次只分解一个多项式中的一个多项式,并将该项乘以另一个多项式中的每个项,将结果加在一起。

mulPoly :: Poly String -> Poly String -> Poly String
mulPoly (Product x) y = mulMonoBy x y
mulPoly (Sum a b) x = Sum (mulPoly a x) (mulPoly b x)

mulMonoBy :: Mono String -> Poly -> Poly
mulMonoBy x (Product y) = Product $ MonoMul x y
mulMonoBy x (Sum a b) = Sum (mulPoly a x') (mulPoly b x')
  where x' = Product x

最后,我们可以测试它是否按预期工作:

expand (Mul (Add (Var "a") (Var "b")) (Add (Var "y") (Var "z")))
{- results in: Sum (Sum (Product (MonoMul (Term "y") (Term "a"))) 
                        (Product (MonoMul (Term "z") (Term "a")))) 
                   (Sum (Product (MonoMul (Term "y") (Term "b"))) 
                        (Product (MonoMul (Term "z") (Term "b"))))
-}

或者,

(a + b)(y * z) = ay + az + by + bz

我们知道这是正确的。

完整的解决方案,如上所述

data E = Add E E | Mul E E | Var String

data Poly term = Sum (Poly term) (Poly term)
               | Product (Mono term) 
               deriving Show

data Mono term = Term term
               | MonoMul (Mono term) (Mono term)
               deriving Show

expand :: E -> Poly String
expand (Var x) = Product (Term x)
expand (Add x y) = Sum (expand x) (expand y)
expand (Mul x y) = mulPoly (expand x) (expand y)

mulPoly :: Poly String -> Poly String -> Poly String
mulPoly (Product x) y = mulMonoBy x y
mulPoly (Sum a b) x = Sum (mulPoly a x) (mulPoly b x)

mulMonoBy :: Mono String -> Poly String -> Poly String
mulMonoBy x (Product y) = Product $ MonoMul x y
mulMonoBy x (Sum a b) = Sum (mulPoly a x') (mulPoly b x')
  where x' = Product x

main = print $ expand (Mul (Add (Var "a") (Var "b")) (Add (Var "y") (Var "z")))

答案 1 :(得分:1)

这个答案有三个部分。第一部分是我提出两个最喜欢的解决方案的摘要,是最重要的部分。第二部分包含类型和导入,以及对解决方案的扩展评论。第三部分重点介绍重新关联表达式的任务,即答案的原始版本(即第二部分)没有得到应有的重视。

在一天结束时,我最终得到了两个值得讨论的解决方案。第一个是expandDirect(参见第三部分):

expandDirect :: E a -> E a
expandDirect = cata alg
    where
    alg = \case
        Var' s -> Var s
        Add' x y -> apo coalgAdd (Add x y)
        Mul' x y -> (apo coalgAdd' . apo coalgMul) (Mul x y)
    coalgAdd = \case
        Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
        x -> Left <$> project x
    coalgAdd' = \case
        Add (Add x x') y -> Add' (Left x) (Right (Add x' y))
        Add x (Add y y') -> Add' (Left x) (Right (Add y y'))
        x -> Left <$> project x
    coalgMul = \case
        Mul (Add x x') y -> Add' (Right (Mul x y)) (Right (Mul x' y))
        Mul x (Add y y') -> Add' (Right (Mul x y)) (Right (Mul x y'))
        x -> Left <$> project x

有了它,我们从底部重建树(cata)。在每个分支上,如果我们发现某些内容无效,我们会返回并重新编写子树(apo),根据需要重新分配和重新关联,直到所有直接的孩子都被正确安排(apo使得可以在没有把每一个都改写到最底层。)

第二个解决方案expandMeta是第三部分中expandFlat的简化版本。

expandMeta :: E a -> E a
expandMeta = apo coalg . cata alg
    where
    alg = \case
        Var' s -> pure (Var s)
        Add' x y -> x <> y
        Mul' x y -> Mul <$> x <*> y
    coalg = \case
        x :| [] -> Left <$> project x
        x :| (y:ys) -> Add' (Left x) (Right (y :| ys))

expandMeta是一种变质作用;也就是说,一个变形后跟一个变形现象(当我们在这里使用apo时,一个同态也只是一种奇特的变形现象,所以我猜这个命名法仍然适用)。 catamorphism将树更改为非空列表 - 隐式处理Add s的重新关联 - 使用列表applicative来分配乘法(很像你建议)。然后,余代数非常简单地将非空列表转换回具有适当形状的树。

感谢您提出的问题 - 我玩得很开心!预赛:

{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Data.Functor.Foldable
import qualified Data.List.NonEmpty as N
import Data.List.NonEmpty (NonEmpty(..))
import Data.Semigroup
import Data.Foldable (toList)
import Data.List (nub)
import qualified Data.Map as M
import Data.Map (Map, (!))
import Test.QuickCheck

data E a = Var a | Add (E a) (E a) | Mul (E a) (E a)
    deriving (Eq, Show, Functor, Foldable)

data EF a b = Var' a | Add' b b | Mul' b b
    deriving (Eq, Show, Functor)

type instance Base (E a) = EF a

instance Recursive (E a) where
    project = \case
        Var x -> Var' x
        Add x y -> Add' x y
        Mul x y -> Mul' x y

instance Corecursive (E a) where
    embed = \case
        Var' x -> Var x
        Add' x y -> Add x y
        Mul' x y -> Mul x y

首先,我的第一个工作(如果有缺陷)尝试,它使用(非空)列表的应用实例来分发:

expandTooClever :: E a -> E a
expandTooClever = cata $ \case
    Var' s -> Var s
    Add' x y -> Add x y
    Mul' x y -> foldr1 Add (Mul <$> flatten x <*> flatten y)
    where
    flatten :: E a -> NonEmpty (E a)
    flatten = cata $ \case
        Var' s -> pure (Var s)
        Add' x y -> x <> y
        Mul' x y -> pure (foldr1 Mul (x <> y))

expandTooClever有一个相对严重的问题:因为它调用flatten,这是一个完整的折叠,对于两个子树,只要它达到Mul,它就会有{的链的可怕渐近线{1}}。

蛮力,最简单的东西,可能是可行的解决方案,代数以递归方式调用自己:

Mul

需要递归调用,因为分发可能会在expandBrute :: E a -> E a expandBrute = cata alg where alg = \case Var' s -> Var s Add' x y -> Add x y Mul' (Add x x') y -> Add (alg (Mul' x y)) (alg (Mul' x' y)) Mul' x (Add y y') -> Add (alg (Mul' x y)) (alg (Mul' x y')) Mul' x y -> Mul x y 下引入新的Add

Mul稍微更有品味的变体,递归调用被分解为一个单独的函数:

expandBrute

驯服的expandNotSoBrute :: E a -> E a expandNotSoBrute = cata alg where alg = \case Var' s -> Var s Add' x y -> Add x y Mul' x y -> dis x y dis (Add x x') y = Add (dis x y) (dis x' y) dis x (Add y y') = Add (dis x y) (dis x y') dis x y = Mul x y expandNotSoBrute变成了一个同态。这种措辞方式很好地表达了正在发生的事情的大局:如果你只有disVar,你可以在世界上无需小心地自由地重现树;但是,如果您点击Add,则必须返回并重新构建整个子树以执行分发(我想知道是否存在捕获此模式的专门递归方案)。

Mul

expandEvert :: E a -> E a expandEvert = cata alg where alg :: EF a (E a) -> E a alg = \case Var' s -> Var s Add' x y -> Add x y Mul' x y -> apo coalg (x, y) coalg :: (E a, E a) -> EF a (Either (E a) (E a, E a)) coalg (Add x x', y) = Add' (Right (x, y)) (Right (x', y)) coalg (x, Add y y') = Add' (Right (x, y)) (Right (x, y')) coalg (x, y) = Mul' (Left x) (Left y) 是必要的,因为如果没有其他任何内容可以分发,我们希望预测最终结果。 (有一种方法可以使用apo来编写它;但是,这需要浪费地重建ana s的树而不进行更改,这会导致Mul具有相同的渐近问题。)

最后但并非最不重要的是,这个解决方案既成功实现了我expandTooClever的尝试,又解释了amalloy's answerexpandTooClever是一个花园种类的二叉树,叶子上有值。产品由BT表示,而产品的总和是树木。

BT a

expandSOP :: E a -> E a expandSOP = cata algS . fmap (cata algP) . cata algSOP where algSOP :: EF a (BT (BT a)) -> BT (BT a) algSOP = \case Var' s -> pure (pure s) Add' x y -> x <> y Mul' x y -> (<>) <$> x <*> y algP :: BTF a (E a) -> E a algP = \case Leaf' s -> Var s Branch' x y -> Mul x y algS :: BTF (E a) (E a) -> E a algS = \case Leaf' x -> x Branch' x y -> Add x y 及其实例:

BT

总结一下,测试套件:

data BT a = Leaf a | Branch (BT a) (BT a)
    deriving (Eq, Show)

data BTF a b = Leaf' a | Branch' b b
    deriving (Eq, Show, Functor)

type instance Base (BT a) = BTF a

instance Recursive (BT a) where
    project (Leaf s) = Leaf' s
    project (Branch l r) = Branch' l r

instance Corecursive (BT a) where
    embed (Leaf' s) = Leaf s
    embed (Branch' l r) = Branch l r

instance Semigroup (BT a) where
    l <> r = Branch l r

-- Writing this, as opposed to deriving it, for the sake of illustration.
instance Functor BT where
    fmap f = cata $ \case
        Leaf' x -> Leaf (f x)
        Branch' l r -> Branch l r

instance Applicative BT where
    pure x = Leaf x
    u <*> v = ana coalg (u, v)
        where
        coalg = \case
            (Leaf f, Leaf x) -> Leaf' (f x)
            (Leaf f, Branch xl xr) -> Branch' (Leaf f, xl) (Leaf f, xr)
            (Branch fl fr, v) -> Branch' (fl, v) (fr, v)
  

通过产品的总和,我的意思是&#34;咖喱&#34; n-ary产品总和。产品总和的简明定义是我想要一个没有任何括号的表达式,所有的parens都由关联性和优先级表示。

我们可以调整上面的解决方案,以便重新关联总和。最简单的方法是使用newtype TestE = TestE { getTestE :: E Char } deriving (Eq, Show) instance Arbitrary TestE where arbitrary = TestE <$> sized genExpr where genVar = Var <$> choose ('a', 'z') genAdd n = Add <$> genSub n <*> genSub n genMul n = Mul <$> genSub n <*> genSub n genSub n = genExpr (n `div` 2) genExpr = \case 0 -> genVar n -> oneof [genVar, genAdd n, genMul n] data TestRig b = TestRig (Map Char b) (E Char) deriving (Show) instance Arbitrary b => Arbitrary (TestRig b) where arbitrary = do e <- genExpr d <- genDict e return (TestRig d e) where genExpr = getTestE <$> arbitrary genDict x = M.fromList . zip (keys x) <$> (infiniteListOf arbitrary) keys = nub . toList unsafeSubst :: Ord a => Map a b -> E a -> E b unsafeSubst dict = fmap (dict !) eval :: Num a => E a -> a eval = cata $ \case Var' x -> x Add' x y -> x + y Mul' x y -> x * y evalRig :: (E Char -> E Char) -> TestRig Integer -> Integer evalRig f (TestRig d e) = eval (unsafeSubst d (f e)) mkPropEval :: (E Char -> E Char) -> TestRig Integer -> Bool mkPropEval f = (==) <$> evalRig id <*> evalRig f isDistributed :: E a -> Bool isDistributed = para $ \case Add' (_, x) (_, y) -> x && y Mul' (Add _ _, _) _ -> False Mul' _ (Add _ _, _) -> False Mul' (_, x) (_, y) -> x && y _ -> True mkPropDist :: (E Char -> E Char) -> TestE -> Bool mkPropDist f = isDistributed . f . getTestE main = mapM_ test [ ("expandTooClever" , expandTooClever) , ("expandBrute" , expandBrute) , ("expandNotSoBrute", expandNotSoBrute) , ("expandEvert" , expandEvert) , ("expandSOP" , expandSOP) ] where test (header, func) = do putStrLn $ "Testing: " ++ header putStr "Evaluation test: " quickCheck $ mkPropEval func putStr "Distribution test: " quickCheck $ mkPropDist func 替换BT中的外部expandSOP。鉴于正如你所建议的那样乘法NonEmpty,这很有效。

liftA2 (<>)

另一种选择是使用任何其他解决方案,并在单独的步骤中重新关联分布式树中的总和。

expandFlat :: E a -> E a
expandFlat = cata algS . fmap (cata algP) . cata algSOP
    where
    algSOP :: EF a (NonEmpty (BT a)) -> NonEmpty (BT a)
    algSOP = \case
        Var' s -> pure (Leaf s)
        Add' x y -> x <> y
        Mul' x y -> (<>) <$> x <*> y
    algP :: BTF a (E a) -> E a
    algP = \case
        Leaf' s -> Var s
        Branch' x y -> Mul x y
    algS :: NonEmptyF (E a) (E a) -> E a
    algS = \case
        NonEmptyF x Nothing -> x
        NonEmptyF x (Just y) -> Add x y

我们还可以将flattenSum :: E a -> E a flattenSum = cata alg where alg = \case Add' x y -> apo coalg (x, y) x -> embed x coalg = \case (Add x x', y) -> Add' (Left x) (Right (x', y)) (x, y) -> Add' (Left x) (Left y) flattenSum转换为单个函数。注意,当得到分布代数的结果时,和余代数需要一个额外的情况。之所以发生这种情况是因为,当代数从上到下进行时,我们无法确定它生成的子树是否正确关联。

expandEvert

也许有一种更聪明的方式来写-- This is written in a slightly different style than the previous functions. expandDirect :: E a -> E a expandDirect = cata alg where alg = \case Var' s -> Var s Add' x y -> apo coalgAdd (Add x y) Mul' x y -> (apo coalgAdd' . apo coalgMul) (Mul x y) coalgAdd = \case Add (Add x x') y -> Add' (Left x) (Right (Add x' y)) x -> Left <$> project x coalgAdd' = \case Add (Add x x') y -> Add' (Left x) (Right (Add x' y)) Add x (Add y y') -> Add' (Left x) (Right (Add y y')) x -> Left <$> project x coalgMul = \case Mul (Add x x') y -> Add' (Right (Mul x y)) (Right (Mul x' y)) Mul x (Add y y') -> Add' (Right (Mul x y)) (Right (Mul x y')) x -> Left <$> project x ,但我还没想到它。