How to parse Yahoo historical CSV with Attoparsec

Time: 2014-04-22 08:07:55

Tags: haskell attoparsec

I am a Haskell beginner. How can I use attoparsec to parse the file into an array of open prices, an array of high prices, and so on?

module CsvParser (
      Quote (..)
    , csvFile
    , quote
    ) where
import System.IO
import Data.Attoparsec.Text
import Data.Attoparsec.Combinator
import Data.Text (Text, unpack)
import Data.Time
import System.Locale
import Data.Maybe

data Quote = Quote {
        qTime       :: LocalTime,
        qAsk        :: Double,
        qBid        :: Double,
        qAskVolume  :: Double,
        qBidVolume  :: Double
    } deriving (Show, Eq)

csvFile :: Parser [Quote]
csvFile = do
    q <- many1 quote
    endOfInput
    return q

quote   :: Parser Quote
quote   = do
    time        <- qtime
    qcomma
    ask         <- double
    qcomma
    bid         <- double
    qcomma
    askVolume   <- double
    qcomma
    bidVolume   <- double
    endOfLine
    return $ Quote time ask bid askVolume bidVolume 

qcomma  :: Parser ()
qcomma  = do 
    char ','
    return ()

qtime   :: Parser LocalTime
qtime   = do
    tstring     <- takeTill (\x -> x == ',')
    let time    = parseTime defaultTimeLocale "%d.%m.%Y %H:%M:%S%Q" (unpack tstring)
    return $ fromMaybe (LocalTime (fromGregorian 0001 01 01) (TimeOfDay 00 00 00 )) time

--testString :: Text
--testString = "01.10.2012 00:00:00.741,1.28082,1.28077,1500000.00,1500000.00\n" 

quoteParser = parseOnly quote

main = do  
    handle <- openFile "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode  
    contents <- hGetContents handle  
    let allLines = lines contents
    map (\line -> quoteParser line) allLines
    --putStr contents  
    hClose handle

Error message:

testhaskell.hs:89:5:
    Couldn't match type `[]' with `IO'
    Expected type: IO (Either String Quote)
      Actual type: [Either String Quote]
    In the return type of a call of `map'
    In a stmt of a 'do' block:
      map (\ line -> quoteParser line) allLines
    In the expression:
      do { handle <- openFile
                       "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode;

           contents <- hGetContents handle;
           let allLines = lines contents;
           map (\ line -> quoteParser line) allLines;
           .... }

testhaskell.hs:89:37:
    Couldn't match type `[Char]' with `Text'
    Expected type: [Text]
      Actual type: [String]
    In the second argument of `map', namely `allLines'
    In a stmt of a 'do' block:
      map (\ line -> quoteParser line) allLines
    In the expression:
      do { handle <- openFile
                       "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode;

           contents <- hGetContents handle;
           let allLines = lines contents;
           map (\ line -> quoteParser line) allLines;
           .... }

2 answers:

Answer 0 (score: 2)

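This error has nothing to do with parsec or attoparsec. The line the error message points to is not an IO action, so using it as a statement in an IO do block produces a type error: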

main = do  
    handle <- openFile "C:\\Users\\ivan\\Downloads\\0005.HK.csv" ReadMode  
    contents <- hGetContents handle  
    let allLines = lines contents
    map (\line -> quoteParser line) allLines   -- <== This is not an IO action
    --putStr contents  
    hClose handle

You are discarding the result of the map call. You should bind it to a variable with let, just as you do with the result of lines.
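A minimal sketch of that fix, assuming attoparsec and text are installed. The `pair` parser is a hypothetical stand-in for the question's `quote` parser, `parseLines` is a helper name made up for illustration, and "quotes.csv" is a placeholder path. The file is read as Text so that it matches what `parseOnly` expects.

```haskell
module Main where

import Data.Attoparsec.Text (Parser, parseOnly, double, char)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- Hypothetical stand-in for the question's `quote` parser:
-- two comma-separated doubles.
pair :: Parser (Double, Double)
pair = do
    a <- double
    _ <- char ','
    b <- double
    return (a, b)

-- Pure step: parse every line, keeping each success or failure.
parseLines :: T.Text -> [Either String (Double, Double)]
parseLines = map (parseOnly pair) . T.lines

main :: IO ()
main = do
    contents <- TIO.readFile "quotes.csv"  -- placeholder path (assumption)
    let results = parseLines contents       -- bound with let, not run as an IO action
    mapM_ print results                     -- mapM_ is what turns the list into IO
```

Keeping the parsing in a pure function and doing only the reading and printing in `main` means the `map` never has to appear as a statement in the do block.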

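The second error occurs because you are passing a String where a Text is expected: quoteParser takes Text, but lines contents yields [String]. These are different types, even though both represent ordered collections of characters (they also have different internal representations). You can convert between the two types with pack and unpack: http://hackage.haskell.org/package/text/docs/Data-Text.html#g:5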
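As a small illustration of the conversion, here is a sketch that uses only `Data.Text` from the text package. The names `lineAsString`, `lineAsText` and `roundTrip` are made up for this example; the sample line is the test string from the question.

```haskell
module Main where

import qualified Data.Text as T

-- A String, as produced by Prelude functions such as `lines` or `hGetContents`.
lineAsString :: String
lineAsString = "01.10.2012 00:00:00.741,1.28082,1.28077,1500000.00,1500000.00"

-- pack :: String -> Text
lineAsText :: T.Text
lineAsText = T.pack lineAsString

-- unpack :: Text -> String
roundTrip :: String
roundTrip = T.unpack lineAsText

main :: IO ()
main = do
    print (T.length lineAsText)  -- length of the Text value
    putStrLn roundTrip           -- the same characters, back as a String
```

In the question's code this would mean either mapping `T.pack` over `allLines` before parsing, or reading the file with `Data.Text.IO` so that no conversion is needed.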

Also, you should always give main an explicit type signature, main :: IO (). If you don't, it can sometimes lead to subtle problems.

As others have said, though, you should use a CSV parser package.

Answer 1 (score: 0)

You can use the attoparsec-csv package, or look at its source code to see how to write one yourself.

The code would look something like this:

import qualified Data.Text as T      -- for the T.Text type used in mkQuote
import qualified Data.Text.IO as T   -- for T.readFile
import Text.ParseCSV

main = do
  txt <- T.readFile "file.csv"
  case parseCSV txt of
    Left  err -> error err
    Right csv -> mapM_ (print . mkQuote) csv

mkQuote :: [T.Text] -> Quote
mkQuote = error "Not implemented yet"