megaparsec报告解析错误的位置不正确

时间:2018-02-10 19:58:50

标签: parsing haskell megaparsec

对于这个项目,我分两个阶段进行解析。第一阶段处理包括/ ifdef / define指令,并将输入组合成[Span]项,这些项在原始输入中定义它们的起点/终点以及正文。然后,第二阶段将此流解析为我的AST以供后续处理。

AST的每个元素都带有它的源位置,解析后捕获的任何语义错误都会打印正确的错误位置,而不管包含深度。这个部分至关重要,因为它出现在有问题的阶段之后。

问题在第二阶段从包含文件中发出解析错误,它报告了输入中顶级规则位置的虚假错误。初始文件中的解析错误正常。任何指令的存在都会将初始文件划分为多个块,因此它不是一个单块'与多个块相对应'问题。

鉴于AST正确获取位置这一事实我很难知道Megaparsec在遇到解析错误时如何报告错误信息。

我包含了我的流实例和(set | get)(位置|输入)代码,因为它们看起来像是相关的位。我觉得必须要进行一些无法管理的工作,或者我的Stream实例因某些原因无效。

data Span = Span
  { spanStart :: SourcePos
  , spanEnd   :: SourcePos
  , spanBody  :: T.Text
  } deriving (Eq, Ord, Show)

instance Stream [Span] where
  type Token  [Span] = Span
  type Tokens [Span] = [Span]
  tokenToChunk  Proxy = pure
  tokensToChunk Proxy = id
  chunkToTokens Proxy = id
  chunkLength   Proxy = foldl1 (+) . map (T.length . spanBody)
  chunkEmpty    Proxy = all ((== 0) . T.length . spanBody)

  positionAt1 Proxy pos (Span start _ _) = trace ("pos1" ++ show start) start
  positionAtN Proxy pos [] = pos
  positionAtN Proxy _ (Span start _ _:_) = trace ("posN" ++ show start) start

  advance1 Proxy _ _ (Span _ end _) = end
  advanceN Proxy _ pos [] = pos
  advanceN Proxy _ _ ts = let Span _ end _ = last ts in end

  take1_ []     = Nothing
  take1_ s      = case takeN_ 1 s of
                    Nothing -> Nothing
                    Just (sp, s') -> Just (head sp, s')

  takeN_ _ [] = Nothing
  takeN_ n s@(t:ts)
    | s == [] = Nothing
    | n <= 0 = Just ([t {spanEnd = spanStart t, spanBody = ""}], s)
    | n <  (T.length . spanBody) t = let (l, r) = T.splitAt n (spanBody t)
                                         sL = spanStart t
                                         eL = foldl (defaultAdvance1 (mkPos 3)) sL (T.unpack (T.tail l))
                                         sR = defaultAdvance1 (mkPos 3) eL (T.last l)
                                         eR = spanEnd t
                                         l' = [Span sL eL l]
                                         r' = (Span sR eR r):ts
                                     in Just (trace (show n) l', r')
    | n == (T.length . spanBody) t = Just ([t], ts)
    | otherwise = case takeN_ (n - T.length (spanBody t)) ts of
                     Nothing -> Just ([t], [])
                     Just (t', ts') -> Just (t:t', ts')


  takeWhile_ p s = fromJust $ takeN_ (go 0 s) s
    where go n s = case take1_ s of
                      Nothing -> n
                      Just (c, s') -> if p c
                                      then go (n + 1) s'
                                      else n

查找包含并交换到它:

"include" -> do
     file <- between dquote dquote (many (alphaNumChar <|> char '.' <|> char '/' <|> char '_'))
     s    <- liftIO (Data.Text.IO.readFile file)
     p    <- getPosition
     i    <- getInput
     pushPosition p
     stack %= (:) (p, i)
     setPosition (initialPos file)
     setInput s

如果我们到达输入弹出堆栈的末尾并继续:

parseStream' :: StreamParser [Span]
parseStream' = concat <$> many p
   where p = do
          b <- tick <|> block
          end <- option False (True <$ hidden eof)
          h <- use stack
          when (end && (h /= [])) $ do
            popPosition
            setInput (h ^?! ix 0 . _2)
            stack %= tail
          return b

0 个答案:

没有答案