Question

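I once wrote a packet capture program in Haskell and used lazy IO to capture all the TCP packets. The problem was that packets sometimes arrive out of order, so I had to insert them all into a list until I got a FIN flag, to be sure I had every packet I needed before doing anything with them. If I was sniffing something really big, like a video, I had to hold all of it in memory. Doing it any other way would have required some difficult imperative code.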

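So later I learned about iteratees, and I decided to implement my own. The way it works is that there is an enumeratee. You give it the number of packets you want it to hold. As it pulls in packets it sorts them, and once it reaches the number you specified it starts flushing, but it leaves some in there so that new chunks are sorted into that list before more packets are flushed. The idea is that the chunks are almost in order before they hit this enumeratee, and it will fix most of the small ordering problems. When it gets an EOF, it should send all of the remaining packets back out.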
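The buffering scheme described above can be modeled on plain lists, without the enumerator package. This is only a sketch under stated assumptions: `reorderWindow` is a hypothetical name, it works element by element rather than on chunks, and it only fixes items that are displaced by at most `n` positions.

```haskell
import Data.List (insert)

-- | Pure list model of the bounded reorder buffer (hypothetical helper,
-- not part of the enumerator package). At most n items are held in a
-- sorted buffer; once the buffer would exceed n, the smallest item is emitted.
reorderWindow :: Ord a => Int -> [a] -> [a]
reorderWindow n = go []
  where
    -- End of input: flush whatever is still buffered (already sorted).
    go buf []     = buf
    go buf (x:xs) =
      case insert x buf of
        -- Buffer was already full, so release its smallest element.
        (y:ys) | length buf >= n -> y : go ys xs
        -- Still filling the buffer: emit nothing yet.
        buf'                     -> go buf' xs

main :: IO ()
main = do
  let input = [2, 1, 3, 5, 4, 6, 8, 7, 10, 9] :: [Int]
  print (reorderWindow 3 input)  -- prints [1,2,3,4,5,6,7,8,9,10]
  print (reorderWindow 0 input)  -- prints [2,1,3,5,4,6,8,7,10,9]
```

With a window of 0 nothing is buffered, so the input passes through unchanged. The first equation of `go` is the flush-on-EOF step.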

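So it almost works. I realize some of this could be replaced by standard enumerator functions, but I wanted to write them myself to better understand how it works. Here is some code: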

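readLines just gets lines from a file one at a time and feeds them along. printLines just prints each chunk. numbers.txt is a newline-delimited list of numbers that are slightly out of order: some numbers are a few places before or after where they should be. reorder is a function that holds n numbers, sorts new ones into its accumulator list, and then pushes out all but the last n of those numbers.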

import Prelude as P
import Data.Enumerator as E
import Data.Enumerator.List as EL
import Data.List (sort, insert)
import IO
import Control.Monad.Trans (lift)
import Control.Monad (liftM)

import Control.Exception as Exc
import Debug.Trace

test = run_ (readLines "numbers.txt" $$ EL.map (read ::String -> Int) =$ reorder 10 =$ printLines)

reorder :: (Show a, Ord a) => (Monad m) => Int -> Enumeratee a a m b
reorder n step = reorder' [] n step
  where
    reorder' acc n (Continue k) =
      let
        len = P.length
        loop buf n' (Chunks xs)
          | (n' - len xs >= 0) = continue (loop (foldr insert buf xs) (n' - len xs))
          | otherwise  =
              let allchunx = foldr insert buf xs
                  (excess,store)= P.splitAt (negate (n' - len xs)) allchunx
              in k (Chunks excess) >>== reorder' store 0
        loop buf n' (EOF) = k (Chunks (trace ("buf:" ++ show buf) buf)) >>== undefined
      in continue (loop acc n)

printLines :: (Show a) => Iteratee a IO ()
printLines = continue loop
  where
    loop (Chunks []) = printLines
    loop (Chunks (x:xs)) = do
      lift $ print x
      printLines
    loop (EOF) = yield () EOF

readLines :: FilePath -> Enumerator String IO ()
readLines filename s = do
  h <- tryIO $ openFile filename ReadMode
  Iteratee (Exc.finally (runIteratee $ checkContinue0 (blah h) s) (hClose h))
  where
    blah h loop k = do
      x <- lift $ myGetLine h
      case x of
        Nothing -> continue k
        Just line -> k (Chunks [line]) >>== loop
    myGetLine h = Exc.catch (liftM Just (hGetLine h)) checkError
    checkError :: IOException -> IO (Maybe String)
    checkError e = return Nothing

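My problem is the undefined in reorder. reorder has 10 items stuck in it, and then it receives an EOF from up the stack. So it does k (Chunks those 10 items), and then there is an undefined, because I don't know what to put there to make it work.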

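The last 10 items get dropped from the program's output. You can see from the trace that the variable buf holds all of the remaining items. I tried yielding, but I'm not sure what to yield, or whether I should yield at all. I don't know what to put there to make this work.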

Edit: It turns out reorder was fixed by changing the undefined part of loop to:

loop buf n' EOF = k (Chunks buf) >>== (\s -> yield s EOF)

I'm almost certain I had that at one point, but I didn't get the right output, so I assumed it was wrong.

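The problem was in printLines. Because reorder was sending single-element chunks until it reached the end, I never noticed that printLines was throwing away everything except the first element of each chunk it received. In my head I thought the rest of the chunk would carry over or something, which was stupid.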

Anyway, I changed printLines to:

printLines :: (Show a) => Iteratee a IO ()
printLines = continue loop
  where
    loop (Chunks []) = printLines
    loop (Chunks xs) = do
      lift $ mapM_ print xs
      printLines
    loop (EOF) = yield () EOF

Now it works. Thanks a lot, I was afraid I wouldn't get an answer.

Answer 1

How about

loop buf n' (EOF) = k (Chunks buf) >>== (\s -> yield s EOF)

(idea taken from EB.isolate).
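To see why yielding the inner step after the final flush is the right shape, here is a pure, library-free model of the protocol. It is a sketch only: `Stream`, `Step`, `collect`, `reorderStep`, `feed` and `finish` are illustrative definitions written for this example (they borrow names from the enumerator package but are not its API), and the monad and error cases are left out.

```haskell
import Data.List (insert)

data Stream a = Chunks [a] | EOF

-- Pure stand-in for an iteratee step: either waiting for input or done.
data Step a b = Continue (Stream a -> Step a b) | Yield b

-- Inner consumer: collect every element it is given until EOF.
collect :: Step a [a]
collect = go []
  where
    go acc = Continue $ \s -> case s of
      Chunks xs -> go (acc ++ xs)
      EOF       -> Yield acc

-- Enumeratee-like wrapper: keeps up to n items sorted and passes the
-- rest to the inner step. Its own result is the inner step.
reorderStep :: Ord a => Int -> Step a b -> Step a (Step a b)
reorderStep n = go []
  where
    go _   inner@(Yield _)    = Yield inner
    go buf inner@(Continue k) = Continue $ \s -> case s of
      Chunks xs ->
        let sorted          = foldr insert buf xs
            (excess, store) = splitAt (length sorted - n) sorted
        in go store (if null excess then inner else k (Chunks excess))
      -- On EOF: flush the buffer into the inner step, then yield that step.
      EOF -> Yield (k (Chunks buf))

-- Feed one stream element to a step; finished steps ignore further input.
feed :: Step a b -> Stream a -> Step a b
feed (Continue k) s = k s
feed done         _ = done

-- Send EOF and extract the result, if the step finishes.
finish :: Step a b -> Maybe b
finish s = case feed s EOF of
  Yield b    -> Just b
  Continue _ -> Nothing

main :: IO ()
main = do
  let chunks = map Chunks ([[3, 1], [2], [5, 4]] :: [[Int]])
      outer  = foldl feed (reorderStep 2 collect) chunks
  print (finish outer >>= finish)  -- prints Just [1,2,3,4,5]
```

The outer `finish` triggers the flush, and the second `finish` delivers EOF to the inner consumer. Leaving the EOF branch unfinished is what loses the last buffered items.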

Depending on exactly what you're doing, your printLines may need fixing as well; its Chunks (x:xs) case throws away xs. Something like

loop (Chunks (x:xs)) = do
  lift $ print x
  loop (Chunks xs)

may (or may not) be what you intended.
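The effect of dropping xs is easy to reproduce on plain lists. In this small sketch each inner list stands for one Chunks value; `firstOnly` and `everyElem` are hypothetical names, not library functions. It also shows why the bug can stay hidden while every chunk holds a single element.

```haskell
-- Models a consumer that handles only the head of each chunk.
firstOnly :: [[a]] -> [a]
firstOnly chunks = [x | (x:_) <- chunks]

-- Models a consumer that handles every element of each chunk.
everyElem :: [[a]] -> [a]
everyElem = concat

main :: IO ()
main = do
  -- Single-element chunks, followed by one larger final flush.
  let chunks = [[1], [2], [3, 4, 5]] :: [[Int]]
  print (firstOnly chunks)  -- prints [1,2,3]
  print (everyElem chunks)  -- prints [1,2,3,4,5]
```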
