Nested sequence to branch/tree data structure

Date: 2013-07-26 22:09:54

Tags: haskell clojure f# ocaml

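I'm not sure whether this is an easy problem and I'm just missing something obvious, but I have been banging my head against it for some time. I am trying to express tree divergence using lists. That way I can specify my dataset inline using simple primitives, without worrying about order, and build the tree later from a disparate set of lists.

So I have some lists like this: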

 a = ["foo", "bar", "qux"]
 b = ["foo", "bar", "baz"]
 c = ["qux", "bar", "qux"]

I would like a function that takes a sequence of these lists and expresses a tree like this:

myfunc :: [[a]] -> MyTree a

(root) -> foo -> bar -> [baz, qux]
       -> qux -> bar -> qux

An ideal solution would also handle sequences of different lengths, i.e.:

a = ["foo"; "bar"; "qux"]
b = ["foo"; "bar"; "baz"; "quux"]
== 
(root) -> foo -> bar -> [qux, baz -> quux]

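Is there a textbook example or algorithm that can help me with this? It seems like it could be solved elegantly, but all my stabs at it look absolutely horrid!

Please feel free to post a solution in any functional language; I will translate it as appropriate.

Thanks!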

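4 answers:

Answer 0 (score: 5)

The way I would solve this is to use a Forest to represent your type and then make Forest a Monoid, where mappending two Forests together joins their common ancestors. The rest is just coming up with a suitable Show instance: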

import Data.List (sort, groupBy)
import Data.Ord (comparing)
import Data.Foldable (foldMap)
import Data.Function (on)
import Data.Monoid

data Tree a = Node
    { value :: a
    , children :: Forest a
    } deriving (Eq, Ord)

instance (Show a) => Show (Tree a) where
    show (Node a f@(Forest ts0)) = case ts0 of
        []  -> show a
        [t] -> show a ++ " -> " ++ show t
        _   -> show a ++ " -> " ++ show f

data Forest a = Forest [Tree a] deriving (Eq, Ord)

instance (Show a) => Show (Forest a) where
    show (Forest ts0) = case ts0 of
        []  -> "[]"
        [t] -> show t
        ts  -> show ts

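-- On GHC 8.4+ (base 4.11) Monoid requires a Semigroup instance,
-- so the merge lives in (<>) and mappend defaults to it.
instance (Ord a) => Semigroup (Forest a) where
    Forest tsL <> Forest tsR =
          Forest
        . map (\ts -> Node (value $ head ts) (foldMap children ts))
        . groupBy ((==) `on` value)
        . sort
        $ tsL ++ tsR

instance (Ord a) => Monoid (Forest a) where
    mempty = Forest []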

fromList :: [a] -> Forest a
fromList = foldr cons nil
  where
    cons a as = Forest [Node a as]
    nil = Forest []

Here is some example usage:

>>> let a = fromList ["foo", "bar", "qux"]
>>> let b = fromList ["foo", "bar", "baz", "quux"]
>>> a
"foo" -> "bar" -> "qux"
>>> b
"foo" -> "bar" -> "baz" -> "quux"
>>> a <> b
"foo" -> "bar" -> ["baz" -> "quux","qux"]
>>> a <> a
"foo" -> "bar" -> "qux"

So your myFunc would become (the Ord constraint is needed by the Monoid instance):

myFunc :: (Ord a) => [[a]] -> Forest a
myFunc = foldMap fromList
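
The sort, group and recurse steps inside mappend can also be applied to all paths at once. The following is only a sketch of that idea, not the answer's code: it assumes the Tree and Forest types from Data.Tree in the containers package rather than the types defined above, and pathsToForest is a made-up name.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)
import Data.Tree (Forest, Tree (..))

-- Hypothetical helper: build a forest from all paths in one pass.
-- Empty paths are dropped, the rest are sorted so that paths sharing a
-- first element become adjacent, then each group turns into one node
-- whose children are built from the tails of the group.
pathsToForest :: Ord a => [[a]] -> Forest a
pathsToForest paths =
  [ Node x (pathsToForest (map (drop 1) grp))
  | grp@((x : _) : _) <- groupBy ((==) `on` take 1) (sortOn (take 1) nonEmpty)
  ]
  where
    nonEmpty = filter (not . null) paths
```

Like the Forest monoid above, this representation does not record where a shorter path ends inside a longer one; it only keeps the branching structure.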

Answer 1 (score: 3)

I came up with a solution very similar to Gabriel's, but my data representation uses a Map, so that I can offload most of the work to Data.Map.unionWith.

import Data.Map (Map, empty, singleton, unionWith, assocs)
import Data.Monoid

type Path a = [a]
data Tree a = Tree {leaf :: Bool, childs :: Map a (Tree a)} deriving Show

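The Boolean flag in the tree marks whether this node can be the end of a path. The a values are hidden inside the childs Map. To warm up, let us define how to convert a single path into a tree.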

root :: Tree a
root = Tree True empty

cons :: a -> Tree a -> Tree a
cons node tree = Tree False (singleton node tree)

follow :: Path a -> Tree a
follow = foldr cons root

The follow function is called fromList in Gabriel's code. We can also enumerate all paths contained in a tree.

paths :: Tree a -> [Path a]
paths (Tree leaf childs) =
  (if leaf then [[]] else []) ++
  [ node : path | (node, tree) <- assocs childs, path <- paths tree ]

The question basically asks for the inverse of this paths function. Using unionWith, we can easily define the monoid structure of trees.

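-- On GHC 8.4+ (base 4.11) Monoid requires a Semigroup instance,
-- so the merge lives in (<>) and mappend defaults to it.
instance Ord a => Semigroup (Tree a) where
  Tree leaf1 childs1 <> Tree leaf2 childs2 = Tree leaf childs where
    leaf = leaf1 || leaf2
    childs = unionWith (<>) childs1 childs2

instance Ord a => Monoid (Tree a) where
  mempty = Tree False empty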

Now, to convert a list of paths into a tree, we just use mconcat and follow.

unpaths :: Ord a => [Path a] -> Tree a
unpaths = mconcat . map follow

Here is a test case using the paths from the question.

a, b, c, d :: Path String

a = ["foo", "bar", "qux"]
b = ["foo", "bar", "baz"]
c = ["qux", "bar", "qux"]
d = ["foo", "bar", "baz", "quux"]

-- test is True
test = (paths . unpaths) [a, b, c, d] == [b, d, a, c]

We get back the same paths that were stored in the tree, but as an ordered list.
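
For comparison, the same kind of structure can be built by inserting one path at a time instead of merging whole trees. This is a minimal sketch under that assumption, not part of the answer above; Trie, isEnd, kids, insertPath, fromPaths and toPaths are names invented for the illustration. It also shows why the end-of-path flag matters: a path that is a prefix of another one survives the round trip.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- Hypothetical trie: isEnd plays the role of the Boolean flag above.
data Trie a = Trie { isEnd :: Bool, kids :: Map a (Trie a) }
  deriving (Eq, Show)

emptyTrie :: Trie a
emptyTrie = Trie False Map.empty

-- Walk down the path, creating missing children, and mark the last node.
insertPath :: Ord a => [a] -> Trie a -> Trie a
insertPath [] t = t { isEnd = True }
insertPath (x : xs) (Trie e m) = Trie e (Map.insert x (insertPath xs child) m)
  where
    child = Map.findWithDefault emptyTrie x m

fromPaths :: Ord a => [[a]] -> Trie a
fromPaths = foldr insertPath emptyTrie

-- Enumerate the stored paths in key order.
toPaths :: Trie a -> [[a]]
toPaths (Trie e m) =
  [ [] | e ] ++ [ x : p | (x, t) <- Map.toList m, p <- toPaths t ]
```

With the paths ["foo"] and ["foo", "bar"], toPaths (fromPaths ...) returns both, whereas a tree without the flag could only report the longer one.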

Answer 2 (score: 1)

type TreeNode<'T> = 
  | Node of 'T * Tree<'T>
and Tree<'T> = TreeNode<'T> list

module Tree =
  let rec ofList = function
    | [] -> []
    | x::xs -> [Node(x, ofList xs)]

  let rec merge xs tree =
    match (tree, xs) with
    | _, [] -> tree
    | [], _ -> ofList xs
    | nodes, x::xs ->
      let matching, nonMatching = nodes |> List.partition (fun (Node(y, _)) -> y = x)
      match matching with
      | [Node(_, subtree)] -> Node(x, merge xs subtree) :: nonMatching
      | _ -> Node(x, ofList xs)::nodes

Tree.ofList ["foo"; "bar"; "qux"]
|> Tree.merge ["foo"; "bar"; "baz"]
|> Tree.merge ["qux"; "bar"; "qux"]

> val it : TreeNode<string> list =
  [Node ("qux",[Node ("bar",[Node ("qux",[])])]);
   Node ("foo",[Node ("bar",[Node ("baz",[]); Node ("qux",[])])])]

Answer 3 (score: 0)

A Clojure version using hash maps:

(defn merge-to-tree
  [& vecs]
  (let [layer (group-by first vecs)]
    (into {} (map (fn [[k v]]
                    (when k
                      [k (apply merge-to-tree (map rest v))]))
                  layer))))

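Here I use group-by to find where multiple vector elements should be represented by a single item in the output structure. (into {} (map (fn [[k v]] ...) m)) is the standard idiom for destructuring hash entries, doing something with them, and then rebuilding a hash from the results. The recursive call on the values, (apply merge-to-tree (map rest v)), constructs the branches below the current layer of the tree (map rest is needed because group-by keeps the full input vectors, and their first element has already been used as the lookup key).

I welcome other suggestions/improvements. Example usage: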

user> (merge-to-tree ["foo" "bar" "qux"])
{"foo" {"bar" {"qux" {}}}}

user> (merge-to-tree ["foo" "bar" "qux"] ["foo" "bar" "baz"] ["qux" "bar" "qux"])
{"foo" {"bar" {"qux" {}, "baz" {}}}, "qux" {"bar" {"qux" {}}}}

user> (merge-to-tree ["foo" "bar" "qux"] ["foo" "bar" "baz" "quux"])
{"foo" {"bar" {"qux" {}, "baz" {"quux" {}}}}}
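
For readers following the Haskell answers, the group-by recursion described above could be sketched roughly as follows. This is an assumed translation rather than code from the thread: Nested and mergeToTree are hypothetical names, and Map.fromListWith stands in for Clojure's group-by.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- Hypothetical nested-map tree, mirroring the Clojure output shape.
newtype Nested a = Nested (Map a (Nested a))
  deriving (Eq, Show)

-- Group the non-empty paths by their first element, then recurse on the
-- collected tails of each group. Empty paths are skipped by the pattern.
mergeToTree :: Ord a => [[a]] -> Nested a
mergeToTree paths = Nested (Map.map mergeToTree grouped)
  where
    grouped = Map.fromListWith (++) [ (x, [xs]) | x : xs <- paths ]
```

As with the Clojure version, a leaf is just an empty map, so a path that is a prefix of a longer path is not distinguishable in the result.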