将列表和矢量一起压缩而不重复遍历

时间:2012-02-10 15:14:43

标签: list haskell vector

说我有一个列表和一个向量,我想将它们压缩在一起。一个简单的解决方案是将矢量转换为列表,并将两个列表压缩在一起。但是,这需要两次遍历向量(以及内存分配将其转换为列表) - 一个将向量转换为列表,另一个将其与另一个列表一起压缩。

有没有办法在一次遍历中将两者拼接在一起?我想这需要某种状态保持拉链(我猜它会保持矢量索引的状态,因为它可以在O(1)时间内索引)。

的伪代码:

let l1 = [1..10] :: [CInt]
let v1 = Data.Vector.Storable.fromList l1

map (\(x,y) -> x + y) (zipListVector l1 v1) -- zipListVector function is what I am after

2 个答案:

答案 0 :(得分:7)

Fusion确实在这里开始。

鉴于以下计划:

import Data.Vector
import Prelude hiding (zip)

zipMe :: [a] -> Vector b -> Vector (a, b)
zipMe xs ys = zip (fromList xs) ys

生成以下Core:

Foo.$wzipMe
  :: forall a_auS b_auT.
[a_auS]
-> GHC.Prim.Int#
-> GHC.Prim.Int#
-> GHC.Prim.Array# b_auT
-> Data.Vector.Vector (a_auS, b_auT)
[GblId,
Arity=4,
Str=DmdType LLLL,
Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=4, Value=True,
    ConLike=True, Cheap=True, Expandable=True,
    Guidance=IF_ARGS [0 0 0 0] 302 0}]
Foo.$wzipMe =
  \ (@ a_auS)
(@ b_auT)
(w_sW7 :: [a_auS])
(ww_sWa :: GHC.Prim.Int#)
(ww1_sWb :: GHC.Prim.Int#)
(ww2_sWc :: GHC.Prim.Array# b_auT) ->
GHC.ST.runSTRep
  @ (Data.Vector.Vector (a_auS, b_auT))
  (\ (@ s_aBU) (s_aBV :: GHC.Prim.State# s_aBU) ->
    case GHC.Prim.newArray#
        @ (a_auS, b_auT)
        @ (Control.Monad.Primitive.PrimState (GHC.ST.ST s_aBU))
        ww1_sWb
        (Data.Vector.Mutable.uninitialised @ (a_auS, b_auT))
        (s_aBV
        `cast` (GHC.Prim.State#
              (Sym (Control.Monad.Primitive.TFCo:R:PrimStateST <s_aBU>))
            :: GHC.Prim.State# (Control.Monad.Primitive.R:PrimStateST s_aBU)
              ~
            GHC.Prim.State#
              (Control.Monad.Primitive.PrimState (GHC.ST.ST s_aBU))))
    of _ { (# s'#_aSF, arr#_aSG #) ->
    letrec {
      $s$wa_sX0 [Occ=LoopBreaker]
    :: GHC.Prim.Int#
        -> [a_auS]
        -> GHC.Prim.Int#
        -> GHC.Prim.State# (Control.Monad.Primitive.R:PrimStateST s_aBU)
        -> (# GHC.Prim.State# s_aBU, GHC.Types.Int #)
      [LclId, Arity=4, Str=DmdType LLLL]
      $s$wa_sX0 =
    \ (sc_sWB :: GHC.Prim.Int#)
      (sc1_sWC :: [a_auS])
      (sc2_sWD :: GHC.Prim.Int#)
      (sc3_sWE
          :: GHC.Prim.State#
          (Control.Monad.Primitive.R:PrimStateST s_aBU)) ->
      case sc1_sWC of _ {
        [] -> (# sc3_sWE, GHC.Types.I# sc_sWB #);
        : x_aGx xs1_aGy ->
          case GHC.Prim.indexArray#
              @ b_auT ww2_sWc (GHC.Prim.+# ww_sWa sc2_sWD)
          of _ { (# x1_sWp #) ->
          case GHC.Prim.>=# sc2_sWD ww1_sWb of _ {
        GHC.Types.False ->
          $s$wa_sX0
            (GHC.Prim.+# sc_sWB 1)
            xs1_aGy
            (GHC.Prim.+# sc2_sWD 1)
            ((GHC.Prim.writeArray#
            @ (Control.Monad.Primitive.PrimState (GHC.ST.ST s_aBU))
            @ (a_auS, b_auT)
            arr#_aSG
            sc_sWB
            (x_aGx, x1_sWp)
            (sc3_sWE
              `cast` (GHC.Prim.State#
                    (Sym (Control.Monad.Primitive.TFCo:R:PrimStateST <s_aBU>))
                  :: GHC.Prim.State#
                      (Control.Monad.Primitive.R:PrimStateST s_aBU)
                      ~
                    GHC.Prim.State#
                      (Control.Monad.Primitive.PrimState (GHC.ST.ST s_aBU)))))
              `cast` (GHC.Prim.State#
                (Control.Monad.Primitive.TFCo:R:PrimStateST <s_aBU>)
                  :: GHC.Prim.State#
                  (Control.Monad.Primitive.PrimState (GHC.ST.ST s_aBU))
                  ~
                GHC.Prim.State#
                  (Control.Monad.Primitive.R:PrimStateST s_aBU)));
        GHC.Types.True -> (# sc3_sWE, GHC.Types.I# sc_sWB #)
          }
          }
      }; } in
    case $s$wa_sX0
        0
        w_sW7
        0
        (s'#_aSF
        `cast` (GHC.Prim.State#
              (Control.Monad.Primitive.TFCo:R:PrimStateST <s_aBU>)
            :: GHC.Prim.State#
              (Control.Monad.Primitive.PrimState (GHC.ST.ST s_aBU))
              ~
            GHC.Prim.State# (Control.Monad.Primitive.R:PrimStateST s_aBU)))
    of _ { (# new_s1_aDv, r1_aDw #) ->
    case r1_aDw of _ { GHC.Types.I# tpl1_aU1 ->
    case GHC.Prim.unsafeFreezeArray#
        @ (Control.Monad.Primitive.PrimState (GHC.ST.ST s_aBU))
        @ (a_auS, b_auT)
        arr#_aSG
        (new_s1_aDv
        `cast` (GHC.Prim.State#
              (Sym (Control.Monad.Primitive.TFCo:R:PrimStateST <s_aBU>))
            :: GHC.Prim.State# (Control.Monad.Primitive.R:PrimStateST s_aBU)
              ~
            GHC.Prim.State#
              (Control.Monad.Primitive.PrimState (GHC.ST.ST s_aBU))))
    of _ { (# s'#1_aV8, arr'#_aV9 #) ->
    (# s'#1_aV8
    `cast` (GHC.Prim.State#
          (Control.Monad.Primitive.TFCo:R:PrimStateST <s_aBU>)
        :: GHC.Prim.State#
            (Control.Monad.Primitive.PrimState (GHC.ST.ST s_aBU))
            ~
          GHC.Prim.State# (Control.Monad.Primitive.R:PrimStateST s_aBU)),
    Data.Vector.Vector @ (a_auS, b_auT) 0 tpl1_aU1 arr'#_aV9 #)
    }
    }
    }
    })

只进行一次分配,并融合这些循环。 (我相信它利用了这样一个事实,即压缩矢量的长度最多是初始Vector的长度,并且最初分配一个较大的矢量。)

答案 1 :(得分:1)

GHC似乎确实在将数据转换为列表时巧妙地优化了遍历向量。给@Louis Wasserman的回答,并在这里添加清除ghc-core - 我的代码与Louis的不同 - 我正在压缩到列表而不是向量(更方便,因为列表很小,并且不会经常生成):

首先是代码:

module Foo where

import Data.Vector
import Prelude

zipMe :: [a] -> Vector b -> [(a,b)]
zipMe xs ys = Prelude.zip xs (toList ys)

如何获得ghc-core(我使用7.4.1):ghc -O -fforce-recomp Foo.hs -ddump-simpl -dsuppress-all

清理下面的ghc核心输出:

--| this is called by zipMeHelper at loop termination
zipMe1 = \ _ -> []

--| Helper function called by zipMe - ys is converted into vector representation (start end array) - that is what I think. Correct me if I got the representation wrong
zipMeHelper =
  \ xs start end array ->
    letrec {
      go =
        \ index ->
          case >=# index end of _ { --| is index >= end? (>=#) is prim version of >=
            False -> --| note vector is being traversed here only once - vector element vecElem is: array at (start + index)
              case indexArray# array (+# start index) of _ { (# vecElem #) ->
              let { --| call a recursive function go2 if end of vector is not reached
                go2 = go (+# index 1) } in
              \ list -> --| take the list element and combine with vecElem
                case list of _ {
                  [] -> [];
                  : x1 xs1 -> : (x1, vecElem) (go2 xs1)
                }
              };
            True -> zipMe1 |-- if here, end of index was reached - terminate with []
          }; } in
    go 0 xs

|-- zipMe function from Foo.hs
zipMe =
  \ xs ys ->
    case ys of _ { Vector start end array ->
    zipMeHelper xs start end array
    }