Question

How can I improve the following rolling sum implementation?

import Control.Monad.State

type Buffer  = State BufferState (Maybe Double)
type BufferState = ( [Double] , Int, Int )

-- circular buffer    
buff :: Double -> Buffer 
buff newVal = do
  ( list, ptr, len) <- get
  -- if the list is not full yet just accumulate the new value
  if length list < len
    then do
      put ( newVal : list , ptr, len)
      return Nothing
    else do
      let nptr = (ptr - 1) `mod` len
          (as,(v:bs)) = splitAt ptr list
          nlist = as ++ (newVal : bs)
      put (nlist, nptr, len)
      return $ Just v

-- create initial state for circular buffer
initBuff l = ( [] , l-1 , l)

-- use the circular buffer to calculate a rolling sum
rollSum :: Double -> State (Double,BufferState) (Maybe Double)
rollSum newVal = do
  (acc,bState) <- get
  let (lv , bState' )  = runState (buff newVal) bState
      acc' = acc + newVal
  -- subtract the old value if the circular buffer is full
  case lv of
    Just x  -> put ( acc' - x , bState') >> (return $ Just (acc' - x))
    Nothing -> put ( acc' , bState')     >> return Nothing

test :: (Double,BufferState) -> [Double] -> [Maybe Double] -> [Maybe Double]
test state [] acc = acc
test state (x:xs) acc = 
  let (a,s) = runState (rollSum x) state
  in test s xs (a:acc)

main :: IO()
main = print $ test (0,initBuff 3) [1,1,1,2,2,0] []

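Buffer uses the State monad to implement a circular buffer. rollSum uses the State monad again to keep track of the rolling sum value and the state of the circular buffer.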

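How could I make this more elegant?
I'd like to implement other functions such as a rolling average or a difference; what can I do to make that easy?

Thanks!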

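Edit

I forgot to mention that I'm using a circular buffer because I intend to use this code online and process updates as they arrive, hence the need to keep state. Something like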

newRollingSum = update rollingSum newValue

Answer 1

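I haven't managed to decipher all of your code, but here's the plan I would follow to solve this problem. First, an English description of the plan:

We need windows of length n into the list, starting from each index.
1. Make arbitrary-length windows.
2. Truncate long windows to n.
3. Drop the last n-1 windows, which will be too short.
For each window, add up the entries.

This was my first idea; for windows of length 3 it is a fine approach, because step 2 is cheap on such short lists. For longer windows you may want an alternative approach, which I'll discuss below; but this approach has the benefit that it generalizes smoothly to functions other than sum. The code might look like this: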

import Data.List

rollingSums n xs
    = map sum                              -- add up the entries
    . zipWith (flip const) (drop (n-1) xs) -- drop the last n-1
    . map (take n)                         -- truncate long windows
    . tails                                -- make arbitrarily long windows
    $ xs
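
Only the final map sum step depends on the aggregate, so the same pipeline can take the aggregate as a parameter, which is one way to get the rolling average or spread asked about in the question. A minimal sketch, assuming n >= 1; the names rollingWith, rollingMean, and rollingRange are hypothetical and chosen here for illustration:

```haskell
import Data.List (tails)

-- Hypothetical generalization of rollingSums: apply any aggregate f
-- to every full window of length n.
rollingWith :: Int -> ([a] -> b) -> [a] -> [b]
rollingWith n f xs
    = map f                                -- aggregate each window
    . zipWith (flip const) (drop (n-1) xs) -- drop the last n-1 (too short)
    . map (take n)                         -- truncate long windows
    . tails                                -- make arbitrarily long windows
    $ xs

-- Rolling average over windows of length n (assumes n >= 1).
rollingMean :: Int -> [Double] -> [Double]
rollingMean n = rollingWith n (\w -> sum w / fromIntegral n)

-- Rolling spread (max minus min); every window is non-empty when n >= 1.
rollingRange :: Int -> [Double] -> [Double]
rollingRange n = rollingWith n (\w -> maximum w - minimum w)
```

With this, rollingWith n sum behaves like rollingSums n, and other window statistics only need a different function argument.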

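If you're familiar with the "equational reasoning" approach to optimization, you might spot a first place where we can improve the performance of this function: by swapping the first map and the zipWith, we get a function with the same behavior but with a map f . map g subterm, which can be replaced by map (f . g) for slightly less allocation.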
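
The rewrite described above can be sketched as follows. The name rollingSumsFused is hypothetical; the only changes from rollingSums are the reordering and the use of the map f . map g = map (f . g) law:

```haskell
import Data.List (tails)

-- rollingSums after moving the zipWith below both maps and then
-- fusing map sum . map (take n) into a single map.
rollingSumsFused :: Num a => Int -> [a] -> [a]
rollingSumsFused n xs
    = map (sum . take n)                   -- truncate and add in one pass
    . zipWith (flip const) (drop (n-1) xs) -- keep only the full windows
    . tails                                -- make arbitrarily long windows
    $ xs
```

The swap is valid because zipWith (flip const) only decides how many elements to keep, so it does not matter whether the windows are truncated before or after it.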

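Unfortunately, for large n, this adds up n numbers in the inner loop; we would rather just add the value at the "front" of the window and subtract the one at the "back". So we need to get trickier. Here's a new idea: we'll traverse the list twice in parallel, n positions apart. Then we'll use a simple function that computes rolling sums (with unbounded window length) over the prefixes of a list, namely scanl (+), to turn this traversal into the actual sums we're interested in.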

rollingSumsEfficient n xs = scanl (+) firstSum deltas where
    firstSum = sum (take n xs)
    deltas   = zipWith (-) (drop n xs) xs -- front - back

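There is one twist: scanl never returns an empty list. So if it matters that you handle short lists correctly, you'll want another equation that checks for them. Don't use length, since that forces the entire input list into memory before the computation starts, which is a potentially fatal performance mistake. Instead, add a line like this above the previous definition: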

rollingSumsEfficient n xs | null (drop (n-1) xs) = []
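
For reference, the guard and the main equation can also be combined into one definition. This is a sketch under the assumption that n >= 1; the name rollingSumsGuarded is hypothetical and used only to avoid clashing with the definition above:

```haskell
-- Same idea as rollingSumsEfficient, with the short-list check folded in.
rollingSumsGuarded :: Num a => Int -> [a] -> [a]
rollingSumsGuarded n xs
    | null (drop (n-1) xs) = []                       -- fewer than n elements
    | otherwise            = scanl (+) firstSum deltas
  where
    firstSum = sum (take n xs)            -- sum of the first full window
    deltas   = zipWith (-) (drop n xs) xs -- front - back for each later window
```

The drop (n-1) test inspects at most n cells of the list, so the function stays lazy and still works on infinite input.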

We can try both of these out in ghci. You'll notice that they do not behave exactly the same as yours:

*Main> rollingSums 3 [10^n | n <- [0..5]]
[111,1110,11100,111000]
*Main> rollingSumsEfficient 3 [10^n | n <- [0..5]]
[111,1110,11100,111000]

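On the other hand, these implementations are much more concise, and they are fully lazy in the sense that they work on infinite lists: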

*Main> take 5 . rollingSums 10 $ [1..]
[55,65,75,85,95]
*Main> take 5 . rollingSumsEfficient 10 $ [1..]
[55,65,75,85,95]
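
For the online use mentioned in the question's edit, the same add-at-the-front, subtract-at-the-back idea can be carried around as an explicit state value instead of a State monad. A minimal sketch, assuming Data.Sequence from the containers package as the queue; Rolling, emptyRolling, and update are hypothetical names, and unlike the question's code this version reports a sum as soon as the first window is full:

```haskell
import qualified Data.Sequence as Seq
import Data.Sequence (Seq, ViewL(..), viewl, (|>))

-- Hypothetical state for an online rolling sum.
data Rolling = Rolling
    { capacity :: Int        -- window length
    , window   :: Seq Double -- values currently in the window, oldest first
    , total    :: Double     -- sum of the values in the window
    }

emptyRolling :: Int -> Rolling
emptyRolling n = Rolling n Seq.empty 0

-- Feed one value; the result is Nothing until the window is full.
update :: Rolling -> Double -> (Rolling, Maybe Double)
update (Rolling n w acc) x
    | Seq.length w < n =
        let w'   = w |> x
            acc' = acc + x
        in  (Rolling n w' acc', if Seq.length w' == n then Just acc' else Nothing)
    | otherwise =
        case viewl w of
          old :< rest -> let acc' = acc + x - old -- add front, subtract back
                         in  (Rolling n (rest |> x) acc', Just acc')
          EmptyL      -> (Rolling n w acc, Nothing) -- only reachable when n <= 0
```

Since update has the shape state -> input -> (state, output), a whole list can be replayed through it with mapAccumL from Data.List, for example snd (mapAccumL update (emptyRolling 3) [1,1,1,2,2,0]).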
