Skipping the first line in pipes-attoparsec

Time: 2014-07-10 10:16:06

Tags: haskell attoparsec haskell-pipes

My type:

data Test = Test {
 a :: Int,
 b :: Int
} deriving (Show)

My parser:

testParser :: Parser Test
testParser = do
  a <- decimal
  tab
  b <- decimal
  return $ Test a b

tab = char '\t'

Now, to skip the first line, I do something like this:

import qualified System.IO as IO    

parser :: Parser Test
parser = manyTill anyChar endOfLine *> testParser

main = IO.withFile testFile IO.ReadMode $ \testHandle -> runEffect $
         for (parsed (parser <* endOfLine) (fromHandle testHandle)) (lift . print)

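But the parser function above skips every alternate line (which is obvious, since it discards one line before each Test it parses). How do I skip only the first line in a way that works with the Pipes ecosystem (the Producer should produce single Test values)? Here is one obvious solution that I don't want, because it returns the entire [Test] instead of a single value (the code below only works if I modify testParser to read the newline):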

tests :: Parser [Test]
tests = manyTill anyChar endOfLine *>
        many1 testParser
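To make the alternate-line skipping concrete, here is a minimal sketch that reproduces it with attoparsec alone, without pipes. It assumes a ByteString input; the names sample and demo and the sample data are hypothetical, and many' stands in for the repeated application that parsed performs on a stream.

```haskell
import Data.Attoparsec.ByteString.Char8
import qualified Data.ByteString.Char8 as B

data Test = Test { a :: Int, b :: Int } deriving (Show)

testParser :: Parser Test
testParser = do
  x <- decimal
  _ <- char '\t'
  y <- decimal
  return (Test x y)

-- The parser from the question: discard one line, then parse a Test.
parser :: Parser Test
parser = manyTill anyChar endOfLine *> testParser

-- Hypothetical sample input: one header line followed by three records.
sample :: B.ByteString
sample = B.pack "header\n1\t2\n3\t4\n5\t6\n"

-- Applying the parser repeatedly discards a line before every record,
-- not only before the first one.
demo :: Either String [Test]
demo = parseOnly (many' (parser <* endOfLine)) sample

main :: IO ()
main = print demo
-- prints: Right [Test {a = 1, b = 2},Test {a = 5, b = 6}]
```

The record 3, 4 is consumed as if it were a header, which is why the line skipping has to happen once, outside the parser that is run repeatedly.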

Any ideas on how to tackle this problem?

2 Answers:

Answer 0: (score: 5)

If the first line doesn't contain a valid Test, you can use Either () Test to handle it:

parserEither :: Parser (Either () Test)
parserEither = Right <$> testParser <* endOfLine 
           <|> Left <$> (manyTill anyChar endOfLine *> pure ())

After this, you can use the functions provided by Pipes.Prelude to drop the first result (and, additionally, all non-parseable lines):

producer p = parsed parserEither p 
         >-> P.drop 1 
         >-> P.filter (either (const False) (const True))
         >-> P.map    (\(Right x) -> x)

main = IO.withFile testFile IO.ReadMode $ \testHandle -> runEffect $
         for (producer (fromHandle testHandle)) (lift . print)
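As a possible variation on the filter and map stages above, the sketch below uses P.concat from Pipes.Prelude, which flattens any Foldable element flowing downstream. Because Either e is Foldable, a Left contributes nothing and a Right contributes its value, so the partial lambda is not needed. The names rightsOnly and kept and the sample list are hypothetical.

```haskell
import Pipes
import qualified Pipes.Prelude as P

-- Either e is Foldable: a Left folds to nothing, a Right x folds to x,
-- so P.concat discards the Lefts and unwraps the Rights in one stage.
rightsOnly :: Monad m => Pipe (Either e a) a m r
rightsOnly = P.concat

-- Hypothetical stand-in for the output of `parsed parserEither`.
kept :: [Int]
kept = P.toList (each [Left (), Right 1, Left (), Right 2] >-> rightsOnly)

main :: IO ()
main = print kept
-- prints: [1,2]
```

Under these assumptions, rightsOnly could take the place of the P.filter and P.map stages in producer, with P.drop 1 left as it is.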

Answer 1: (score: 5)

You can drop the first line efficiently, in constant space, like this:

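import Data.ByteString (ByteString)
import Lens.Family (over)
import Pipes (Producer)
import Pipes.Group (drops)
import Pipes.ByteString (lines)
import Prelude hiding (lines)

-- View the byte stream as a stream of lines, drop the first one,
-- and reassemble the rest into a byte stream.
dropLine :: Monad m => Producer ByteString m r -> Producer ByteString m r
dropLine = over lines (drops 1)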

Then you can apply dropLine to your Producer before parsing it, like this:

main = IO.withFile testFile IO.ReadMode $ \testHandle -> runEffect $
    -- The header line is already gone, so parse with testParser directly.
    let p = dropLine (fromHandle testHandle)
    in  for (parsed (testParser <* endOfLine) p) (lift . print)