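Using Parsec3

Time: 2016-05-23 07:53:10

Tags: string haskell text parsec

I see that Parsec3 handles Text (not only String) input, so I want to convert an old String parser to produce Text output. The other libraries I use also work with Text, so this would reduce the number of conversions needed.

Now, the parsec3 library does seem to do what it says (handle both Text and String input); this example is from GHCi: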

Text.Parsec.Text Text.Parsec Data.Text> parseTest (many1 $  char 's') (pack "sss")
"sss"
Text.Parsec.Text Text.Parsec Data.Text> parseTest (many1 $  char 's') "sss"
"sss"

So both Text (the first case) and String (the second case) work as input.
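The types behind this session can be sketched as follows. This is a minimal illustration, not code from the original post: the names `overText`, `overString` and `overTextToText` are made up for the example. `char` yields a `Char` and `many1` collects results into a list, so the result is a `String` whichever stream type is consumed; only the input type changes.

```haskell
import Data.Text (Text)
import qualified Data.Text as T
import Text.Parsec (ParseError, char, many1, parse)
import Text.Parsec.Text ()  -- imported only to be sure the Text stream support is in scope

-- The combinator applied to Text input: the result is still a String.
overText :: Text -> Either ParseError String
overText = parse (many1 (char 's')) "<text>"

-- The same combinator applied to String input: same result type.
overString :: String -> Either ParseError String
overString = parse (many1 (char 's')) "<string>"

-- Getting Text out is a separate, explicit step on the parser's result.
overTextToText :: Text -> Either ParseError Text
overTextToText = parse (T.pack <$> many1 (char 's')) "<text>"
```

In other words, the stream type says what the parser reads, while each combinator's own type says what it returns.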

Now, in my real, converted parser (sorry, I had to stitch together some distant parts of the code here to make a complete example):

{-# LANGUAGE OverloadedStrings #-}
data UmeQueryPart = MidQuery Text Text MatchType

data MatchType = Strict | Fuzzy deriving Show

funcMT :: Text -> MatchType
funcMT mt = case mt of
        "~" -> Fuzzy
        _ -> Strict

midOfQuery :: Parser UmeQueryPart
midOfQuery = do
    spaces
    string "MidOf"
    spaces
    char '('
    spaces
    clabeltype <- many1 alphaNum
    spaces
    sep <- try( char ',') <|> char '~'
    spaces
    plabeltype <- many1 alphaNum
    spaces
    char ')'
    spaces
    return $ MidQuery (pack plabeltype) (pack clabeltype) (funcMT sep)

For the funcMT call, I find myself getting a lot of errors like this:

UmeQueryParser.hs:456:96:
    Couldn't match type ‘[Char]’ with ‘Text’
    Expected type: Text
      Actual type: String
    In the first argument of ‘funcMT’, namely ‘sep’
    In the fifth argument of ‘ midOfQuery’, namely ‘(funcMT sep)’

And if I do not explicitly pack the captured text in the code sample above, then:

UmeQueryParser.hs:288:26:
    Couldn't match expected type ‘Text’ with actual type ‘[Char]’
    In the first argument of ‘ midOfQuery’, namely ‘(plabeltype)’
    In the second argument of ‘($)’, namely
      ‘StartQuery (plabeltype) (clabeltype) (funcMT sep)’ 

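So it seems I need to explicitly convert the captured strings to Text in the output.

So why do I need to go through the step of converting from String or Char to Text in order to do a Text -> Text parse?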

1 answer:

Answer 0 (score: 1)

You can build your own Text parsers; it is as simple as:

midOfQuery :: Parser UmeQueryPart
midOfQuery = do
    spaces
    lexeme $ string "MidOf"
    lexeme $ char '('
    clabeltype <- lexeme alphaNums
    sep <- lexeme $ try (char ',') <|> char '~'
    plabeltype <- lexeme alphaNums
    lexeme $ char ')'
    -- sep is a Char here, so pack it before calling funcMT :: Text -> MatchType
    return $ MidQuery plabeltype clabeltype (funcMT (pack [sep]))
  where
    alphaNums = pack <$> many1 alphaNum
    lexeme p = p <* spaces
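The same idea can be pulled out into reusable top-level helpers. The following is only a sketch under assumptions: `asText`, `identifier`, `keyword` and `runOn` are hypothetical names introduced for illustration and do not come from Parsec or from the original code.

```haskell
import Data.Text (Text, pack)
import Text.Parsec (ParseError, alphaNum, many1, parse, string)
import Text.Parsec.Text (Parser)

-- Hypothetical helper: lift any String-returning parser to a Text-returning one.
asText :: Parser String -> Parser Text
asText = fmap pack

-- One or more alphanumeric characters, returned as Text.
identifier :: Parser Text
identifier = asText (many1 alphaNum)

-- A fixed keyword, returned as Text.
keyword :: String -> Parser Text
keyword = asText . string

-- Convenience runner for quick experiments: packs a String into Text, then parses it.
runOn :: Parser a -> String -> Either ParseError a
runOn p = parse p "<input>" . pack
```

For example, `runOn identifier "abc123 rest"` yields `Right "abc123"`. The conversion then lives in one place, and the rest of the grammar can work with Text-returning parsers only.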

Or, slightly more compact (but, I think, more readable):

midOfQuery :: Parser UmeQueryPart
midOfQuery =
    spaces *> lexeme (string "MidOf") *>
    parens (toQuery <$> lexeme alphaNums <*> lexeme matchType <*> lexeme alphaNums)
  where
    lexeme :: Parser a -> Parser a
    lexeme p = p <* spaces

    alphaNums = pack <$> many1 alphaNum

    parens = between (lexeme $ char '(') (lexeme $ char ')')

    matchType = Fuzzy <$ char '~' <|>
                Strict <$ char ','

    toQuery cLabelType sep pLabelType = MidQuery pLabelType cLabelType sep
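To try the compact version end to end, it can be wrapped in a small self-contained module. This is a sketch under assumptions: the imports, the `Show`/`Eq` instances and the `parseMidOf` runner are additions for illustration, while the data types are taken from the question.

```haskell
import Data.Text (Text, pack)
import Text.Parsec
  (ParseError, alphaNum, between, char, many1, parse, spaces, string, (<|>))
import Text.Parsec.Text (Parser)

data MatchType = Strict | Fuzzy deriving (Show, Eq)

data UmeQueryPart = MidQuery Text Text MatchType deriving (Show, Eq)

midOfQuery :: Parser UmeQueryPart
midOfQuery =
    spaces *> lexeme (string "MidOf") *>
    parens (toQuery <$> lexeme alphaNums <*> lexeme matchType <*> lexeme alphaNums)
  where
    -- Run a parser, then skip trailing whitespace.
    lexeme :: Parser a -> Parser a
    lexeme p = p <* spaces

    -- Capture a run of alphanumerics and pack it into Text once.
    alphaNums = pack <$> many1 alphaNum

    parens = between (lexeme $ char '(') (lexeme $ char ')')

    -- The separator decides the match type directly, so funcMT is not needed.
    matchType = Fuzzy <$ char '~' <|>
                Strict <$ char ','

    -- The parent label is stored first, as in the original parser.
    toQuery cLabelType sep pLabelType = MidQuery pLabelType cLabelType sep

-- Hypothetical runner: packs a String into Text and runs midOfQuery on it.
parseMidOf :: String -> Either ParseError UmeQueryPart
parseMidOf = parse midOfQuery "<query>" . pack
```

With these definitions, `parseMidOf "MidOf(child ~ parent)"` gives `Right (MidQuery "parent" "child" Fuzzy)`, and using `,` as the separator gives `Strict` instead.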