A function to generate the unique combinations of a list in Haskell

Date: 2018-10-02 05:26:31

Tags: haskell combinations

Is there a Haskell function that generates all unique combinations of a given length from a list?

source = [1,2,3]

uniqueCombos 2 source == [[1,2],[1,3],[2,3]]

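I tried looking in Hoogle but could not find a function that does this specifically. Permutations do not give the desired result.

Has anybody used a similar function before?

5 answers:

Answer 0 (score: 6)

I don't know of a predefined function either, but it is easy to write your own: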

-- Every set contains a unique empty subset.
subsets 0 _ = [[]]

-- Empty sets don't have any (non-empty) subsets.
subsets _ [] = []

-- Otherwise we're dealing with non-empty subsets of a non-empty set.
-- If the first element of the set is x, we can get subsets of size n by either:
--   - getting subsets of size n-1 of the remaining set xs and adding x to each of them
--     (those are all subsets containing x), or
--   - getting subsets of size n of the remaining set xs
--     (those are all subsets not containing x)
subsets n (x : xs) = map (x :) (subsets (n - 1) xs) ++ subsets n xs

Answer 1 (score: 4)

Using Data.List:

import Data.List
combinations k ns = filter ((k==).length) $ subsequences ns

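Reference: 99 Haskell Problems

The reference contains many interesting solutions; I just picked a concise one.

Answer 2 (score: 2)

There is no such operation in the library, but you can easily implement it yourself.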
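As a quick sanity check, the filter-based definition can be compared against the recursive `subsets` from the previous answer. The sketch below is only illustrative: `agree` is a hypothetical helper name, and the comparison sorts both results because the two definitions can list the same combinations in a different order. Note also that `subsequences` enumerates all 2^n subsequences before filtering, so this approach is best kept to short lists.

```haskell
import Data.List (sort, subsequences)

-- Recursive definition, as in the previous answer.
subsets :: Int -> [a] -> [[a]]
subsets 0 _        = [[]]
subsets _ []       = []
subsets n (x : xs) = map (x :) (subsets (n - 1) xs) ++ subsets n xs

-- Filter-based definition from this answer.
combinations :: Int -> [a] -> [[a]]
combinations k ns = filter ((k ==) . length) (subsequences ns)

-- Hypothetical helper: do both definitions yield the same combinations,
-- ignoring the order in which they are listed?
agree :: Ord a => Int -> [a] -> Bool
agree k xs = sort (subsets k xs) == sort (combinations k xs)
```

For the question's example both definitions give `[[1,2],[1,3],[2,3]]`. On `[1,2,3,4]` with k = 2, the filter-based version lists `[2,3]` before `[1,4]`, while the recursive one lists `[1,4]` first.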

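The code that originally accompanied this answer is not reproduced here. One common way to write such a function picks each element in turn and recurses on the elements that follow it. This is a sketch only, not necessarily what the answerer posted, and `combos` is a name chosen for illustration:

```haskell
import Data.List (tails)

-- Hypothetical name: all k-element combinations of xs, preserving list order.
combos :: Int -> [a] -> [[a]]
combos 0 _  = [[]]                    -- exactly one way to choose nothing
combos k xs =
  [ y : rest
  | y : ys <- tails xs                -- choose y, keep only the elements after it
  , rest   <- combos (k - 1) ys       -- choose the remaining k-1 from those
  ]
```

With this definition, `combos 2 [1,2,3]` evaluates to `[[1,2],[1,3],[2,3]]`, and asking for more elements than the list holds yields `[]`.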

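Answer 3 (score: 1)

It is not clear to me how concerned you are about performance.

For what it's worth, back in 2014 somebody posted a performance contest of various Haskell combination-generating algorithms.

For combinations of 13 out of 26 items, execution times varied from 3 to 167 seconds! The fastest entry was provided by Bergi. Here is the non-obvious (at least to me) source code: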

subsequencesOfSize :: Int -> [a] -> [[a]]
subsequencesOfSize n xs = let l = length xs
                          in if (n > l) then []
                             else subsequencesBySize xs !! (l-n)
 where
   subsequencesBySize [] = [[[]]]
   subsequencesBySize (x:xs) = let next = subsequencesBySize xs
                               in zipWith (++)
                                    ([]:next)
                                    ( map (map (x:)) next ++ [[]] )                 
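To see why this works, it can help to lift the local helper to the top level and inspect what it returns. In the sketch below the helper is unchanged; the only additions are the type signature on the helper and a guard for negative `n`, which the original does not have. For a list of length l, the helper returns l+1 buckets, and bucket i holds the subsequences of length l-i, so the wrapper simply indexes at l-n.

```haskell
-- Local helper from the answer, lifted to the top level for inspection.
-- Bucket i of the result holds all subsequences of length (length xs - i).
subsequencesBySize :: [a] -> [[[a]]]
subsequencesBySize []       = [[[]]]
subsequencesBySize (x : xs) =
  let next = subsequencesBySize xs
  in  zipWith (++)
        ([] : next)                      -- subsequences that skip x
        (map (map (x :)) next ++ [[]])   -- subsequences that start with x

subsequencesOfSize :: Int -> [a] -> [[a]]
subsequencesOfSize n xs
  | n < 0 || n > l = []                  -- the n < 0 guard is an addition in this sketch
  | otherwise      = subsequencesBySize xs !! (l - n)
  where
    l = length xs
```

For example, `subsequencesBySize [1,2,3]` is `[[[1,2,3]],[[2,3],[1,3],[1,2]],[[3],[2],[1]],[[]]]`, so `subsequencesOfSize 2 [1,2,3]` picks bucket 1 and returns `[[2,3],[1,3],[1,2]]`. The combinations come out in a different order than with the recursive `subsets`.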

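More recently, the question was revisited, in the specific case of picking a few elements out of a large list (5 out of 100). In that case you cannot use something like subsequences [1 .. 100], because that refers to a list of length 2^100 ≈ 1.27 * 10^30. I submitted a state machine based algorithm, which is not as idiomatic Haskell as I would like, but is reasonably efficient in that situation, at around 30 clock cycles per output item.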
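To put numbers on that, the short sketch below computes both sizes with exact `Integer` arithmetic. The names `choose`, `subsequenceCount` and `fiveOutOfHundred` are hypothetical helpers introduced only for this illustration.

```haskell
-- Hypothetical helper: binomial coefficient "n choose k" in exact arithmetic.
choose :: Integer -> Integer -> Integer
choose n k
  | k < 0 || k > n = 0
  | otherwise      = product [n - k + 1 .. n] `div` product [1 .. k]

-- Number of subsequences of a 100-element list:
-- 2^100 = 1267650600228229401496703205376
subsequenceCount :: Integer
subsequenceCount = 2 ^ (100 :: Int)

-- Number of 5-element combinations out of 100: 75287520
fiveOutOfHundred :: Integer
fiveOutOfHundred = choose 100 5
```

Filtering `subsequences [1 .. 100]` would therefore walk through about 10^30 candidates to keep roughly 7.5 * 10^7 of them, which is why a direct generator is needed for this case.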

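Side note: generating combinations using multisets?

There is also a Math.Combinatorics.Multiset package. Here is the documentation. I have only tested it briefly, but it can be used to generate combinations.

For example, the set of all combinations of 3 elements out of 8 corresponds to the "permutations" of a multiset with two elements (present and absent) that have multiplicities 3 and (8-3) = 5 respectively.

Let us use that idea to generate all combinations of 3 elements out of 8. There are (8 * 7 * 6) / (3 * 2 * 1) = 336 / 6 = 56 of them.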

*L M Mb T MS> import qualified Math.Combinatorics.Multiset as MS
*Math.Combinatorics.Multiset L M Mb T MS> pms = MS.permutations
*Math.Combinatorics.Multiset L M Mb T MS> :set prompt "λ> "
λ> 
λ> pms38 = pms $ MS.fromCounts [(True, 3), (False,5)]
λ> 
λ> length pms38
56
λ>
λ> take 3 pms38
[[True,True,True,False,False,False,False,False],[True,True,False,False,False,False,False,True],[True,True,False,False,False,False,True,False]]
λ> 
λ> str = "ABCDEFGH"
λ> combis38 = L.map fn pms38 where fn mask = L.map fst $ L.filter snd (zip str mask)
λ> 
λ> sort combis38
["ABC","ABD","ABE","ABF","ABG","ABH","ACD","ACE","ACF","ACG","ACH","ADE","ADF","ADG","ADH","AEF","AEG","AEH","AFG","AFH","AGH","BCD","BCE","BCF","BCG","BCH","BDE","BDF","BDG","BDH","BEF","BEG","BEH","BFG","BFH","BGH","CDE","CDF","CDG","CDH","CEF","CEG","CEH","CFG","CFH","CGH","DEF","DEG","DEH","DFG","DFH","DGH","EFG","EFH","EGH","FGH"]
λ> length combis38
56
λ>

So at least functionally, the idea of using multisets to generate combinations works.
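The same mask idea can also be sketched without the package, using only the Prelude. This is an illustration under assumptions, not the package's implementation: `masks`, `applyMask` and `combosViaMasks` are hypothetical names, and `masks` simply enumerates every distinct arrangement of a given number of `True` and `False` values.

```haskell
-- Hypothetical helper: every distinct arrangement of t Trues and f Falses.
-- Assumes both counts are non-negative.
masks :: Int -> Int -> [[Bool]]
masks 0 f = [replicate f False]
masks t 0 = [replicate t True]
masks t f = map (True :) (masks (t - 1) f) ++ map (False :) (masks t (f - 1))

-- Keep the elements whose mask position is True.
applyMask :: [a] -> [Bool] -> [a]
applyMask xs mask = [x | (x, True) <- zip xs mask]

-- All k-element combinations of xs, one per mask.
combosViaMasks :: Int -> [a] -> [[a]]
combosViaMasks k xs
  | k < 0 || k > n = []
  | otherwise      = map (applyMask xs) (masks k (n - k))
  where
    n = length xs
```

Here `combosViaMasks 3 "ABCDEFGH"` produces 56 strings, matching the count from the session above, and `combosViaMasks 2 [1,2,3]` gives `[[1,2],[1,3],[2,3]]`.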

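Answer 4 (score: 1)

@melpomene's answer is very simple. It is probably what you will see in many places on the internet where a combinationsOf function is needed.

However, hidden behind the double recursion are a great many needless recursive calls. Avoiding them yields much more efficient code: if the list is shorter than k, there is no need to make the call at all.

I would propose a double termination check.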

-- Assumes 1 <= k <= length xs; other inputs are not handled by runner.
combinationsOf :: Int -> [a] -> [[a]]
combinationsOf k xs = runner n k xs
                      where
                      n = length xs
                      runner :: Int -> Int -> [a] -> [[a]]
                      runner n' k' xs'@(y:ys) = if k' < n'      -- k' < length of the list
                                                then if k' == 1
                                                     then map pure xs'
                                                     else map (y:) (runner (n'-1) (k'-1) ys) ++ runner (n'-1) k' ys
                                                else pure xs'   -- k' == length of the list.

λ> length $ subsets 10 [0..19] -- taken from https://stackoverflow.com/a/52602906/4543207
184756
(1.32 secs, 615,926,240 bytes)

λ> length $ combinationsOf 10 [0..19]
184756
(0.45 secs, 326,960,528 bytes)

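So the code above, although optimized as far as it can be, is still inefficient because of the double recursion inside it. As a rule of thumb, in any algorithm, double recursion is best avoided, or used only after very careful inspection.

The following algorithm, on the other hand, is very efficient in both speed and memory consumption.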

combinationsOf :: Int -> [a] -> [[a]]
combinationsOf 1 as        = map pure as
combinationsOf k as@(x:xs) = run (l-1) (k-1) as $ combinationsOf (k-1) xs
                             where
                             l = length as

                             run :: Int -> Int -> [a] -> [[a]] -> [[a]]
                             run n k ys cs | n == k    = map (ys ++) cs
                                           | otherwise = map (q:) cs ++ run (n-1) k qs (drop dc cs)
                                           where
                                           (q:qs) = take (n-k+1) ys
                                           dc     = product [(n-k+1)..(n-1)] `div` product [1..(k-1)]

λ> length $ combinationsOf 10 [0..19]
184756
(0.09 secs, 51,126,448 bytes)