如何设置 cassava 以忽略缺失的列/字段并使用默认值填充相应的数据类型?考虑这个例子:
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE OverloadedStrings #-}
import Data.ByteString.Lazy.Char8
import Data.Csv
import Data.Vector
import GHC.Generics
data Foo = Foo {
a :: String
, b :: Int
} deriving (Eq, Show, Generic)
instance FromNamedRecord Foo
decodeAndPrint :: ByteString -> IO ()
decodeAndPrint csv = do
print $ (decodeByName csv :: Either String (Header, Vector Foo))
main :: IO ()
main = do
decodeAndPrint "a,b,ignore\nhu,1,pu" -- [1]
decodeAndPrint "ignore,b,a\npu,1,hu" -- [2]
decodeAndPrint "ignore,b\npu,1" -- [3]
[1]
和 [2]
工作得很好,但 [3]
失败了
Left "parse error (Failed reading: conversion error: no field named \"a\") at \"\""
我怎样才能让 decodeAndPrint
能够处理这个不完整的输入?
我当然可以操作输入的字节串,但也许有更优雅的解决方案。
一个解决方案,感谢以下 Daniel Wagner 的意见:
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE OverloadedStrings #-}
import Control.Applicative
import Data.ByteString.Lazy.Char8
import Data.Csv
import Data.Vector
import GHC.Generics
data Foo = Foo {
a :: Maybe String
, b :: Maybe Int
} deriving (Eq, Show, Generic)
instance FromNamedRecord Foo where
parseNamedRecord rec = pure Foo
<*> ((Just <$> Data.Csv.lookup rec "a") <|> pure Nothing)
<*> ((Just <$> Data.Csv.lookup rec "b") <|> pure Nothing)
decodeAndPrint :: ByteString -> IO ()
decodeAndPrint csv = do
print $ (decodeByName csv :: Either String (Header, Vector Foo))
main :: IO ()
main = do
decodeAndPrint "a,b,ignore\nhu,1,pu" -- [1]
decodeAndPrint "ignore,b,a\npu,1,hu" -- [2]
decodeAndPrint "ignore,b\npu,1" -- [3]
答案 0 :(得分:2)
(警告:完全未经测试!代码仅供思想传播,不适合任何用途等)
Parser
要求的 FromNamedRecord
类型是 Alternative
,因此只需使用 (<|>)
设置默认值即可。
instance FromNamedRecord Foo where
parseNamedRecord rec = pure Foo
<*> (lookup rec "a" <|> pure "missing")
<*> (lookup rec "b" <|> pure 0)
如果您想稍后知道该字段是否存在,请使您的字段足够丰富以记录该字段:
data RichFoo = RichFoo
{ a :: Maybe String
, b :: Maybe Int
}
instance FromNamedRecord Foo where
parseNamedRecord rec = pure RichFoo
<*> ((Just <$> lookup rec "a") <|> pure Nothing)
<*> ((Just <$> lookup rec "b") <|> pure Nothing)