我正在尝试解析文件[1]但是下面的代码无法正常工作。
import Control.Applicative hiding ( many , ( <|> ) )
import Text.Parsec
import Text.Parsec.String (Parser)
import qualified Text.Parsec.Token as T
import Text.Parsec.Language (emptyDef)
--Scheme Code;ISIN Div Payout/ ISIN Growth;ISIN Div Reinvestment;Scheme Name;Net Asset Value;Repurchase Price;Sale Price;Date
--120523;INF846K01ET8;INF846K01EU6;Axis Triple Advantage Fund - Direct Plan - Dividend Option;13.3660;13.2323;13.3660;19-Jun-2015
data MFund = MFund Integer String String String Double Double Double String deriving (Show) --String String String deriving (Show)
lexer :: T.TokenParser st
lexer = T.makeTokenParser emptyDef
natural :: Parser Integer
natural = T.natural lexer
float :: Parser Double
float = T.float lexer
eol :: Parser String
eol = try (string "\r\n")
<|> try (string "\n")
<|> try (string "\r")
parseFund :: Parsec String () MFund
parseFund = MFund <$> natural
<*> (char ';' *> (many1 alphaNum <|> string "-"))
<*> (char ';' *> (many1 alphaNum <|> string "-"))
<*> (char ';' *> manyTill anyChar (char ';'))
<*> float
<*> (char ';' *> float)
<*> (char ';' *> float)
<*> (char ';' *> many1 (alphaNum <|> char '-'))
parseBlockFund :: Parsec String () [MFund]
parseBlockFund = manyTill anyChar eol *> space *> eol *> endBy parseFund eol
parseMutual :: Parsec String () [MFund]
parseMutual = concat <$> (manyTill anyChar eol *> space *> eol *>
sepBy parseBlockFund (space *> eol))
{- each scheme is seperated by space and end of line-}
parseSchemeBlock :: Parsec String () [MFund]
parseSchemeBlock = concat <$> (sepBy parseMutual (space *> eol))
parseFile :: Parsec String () [MFund]
parseFile =
string "Scheme Code;ISIN Div Payout/ ISIN Growth;ISIN Div Reinvestment;Scheme Name;Net Asset Value;Repurchase Price;Sale Price;Date" *>
eol *> space *> eol *> parseSchemeBlock <* eof
main :: IO ()
main = do
input <- readFile "data.txt"
print input
case parse parseFile "" input of
Left err -> print err
Right val -> print val
我开始从parseFile函数解析文件,该函数消耗第一行,行尾,空格和行尾。现在我解析每个由空格和行尾分隔的方案块。在每个方案块中,我们有许多相互的块,它们也被空格和行尾分开,所以我首先使用字符串直到行尾,然后是空格和行尾,并调用parseBlockFund。
我怀疑的问题是当我调用parseMutual函数时,当它到达方案块的末尾时,它会占用空间和行尾,而这应该被parseSchemeBlock函数占用。我试过尝试功能,但它没有工作,所以我不确定我是否正确思考。有人可以告诉我这段代码有什么问题。
当我删除eof时,一个有趣的事情是,这段代码能够解析至少一个方案块[2]。
[1] http://portal.amfiindia.com/spages/NAV0.txt
[2] https://github.com/mukeshtiwari/Puzzles/blob/master/Assigment/Api.hs