Question

If my file (CSV) looks like this:

John,12323,New York, 2233

I read the file with:

contents <- readFile "data.csv"

The result is a String, which I split with splitOn:

["John","12323","New York","2233"]

How can I keep only the numbers from this list?

filter (=~ "regex") resultList

I have tried using filter, but it does not work.

This is what I want to achieve:

[12323,2233]

Answer 1

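import Data.Char (isDigit)

-- True only for non-empty strings made up entirely of digits.
-- (all isDigit "" is True, so the empty string is excluded to keep read from failing.)
isInteger :: String -> Bool
isInteger s = not (null s) && all isDigit s

-- Keep the fields that look like integers and convert them.
onlyIntegers :: [String] -> [Integer]
onlyIntegers = map read . filter isInteger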
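A usage sketch on the row from the question. It assumes fields may carry stray spaces (the sample line has one before 2233), so a hypothetical helper trim, which is not part of the answer above, strips them before the digit test. The empty-string guard is there because all isDigit "" is True.

```haskell
import Data.Char (isDigit, isSpace)

-- Hypothetical helper: strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhile isSpace . reverse . dropWhile isSpace . reverse

-- A field counts as an integer only if it is non-empty and all digits.
isInteger :: String -> Bool
isInteger s = not (null s) && all isDigit s

-- Trim each field, keep the integer-looking ones, and convert them.
onlyIntegers :: [String] -> [Integer]
onlyIntegers = map read . filter isInteger . map trim

example :: [Integer]
example = onlyIntegers ["John", "12323", "New York", " 2233"]
-- evaluates to [12323,2233]
```

Note that this digit-only test rejects negative numbers such as "-5"; whether that is desirable depends on the data.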

Answer 2

Use a CSV parsing library such as Cassava: http://hackage.haskell.org/package/cassava

Among other things, it has built-in decoding of integers with error handling.

If you want examples, I have a 7,000-word post here about CSV parsing.

Answer 3

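You could use a regular expression, but that would be error-prone and slow: you would essentially have to parse each number twice. Instead, you can use a relatively simple solution that combines built-in functions: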

import Text.Read (readMaybe)
import Data.Maybe (catMaybes)

extractInts :: [String] -> [Int]
extractInts = catMaybes . map readMaybe
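As a quick check, here is the same idea applied to the row from the question. This sketch swaps in mapMaybe from Data.Maybe, which behaves like catMaybes . map readMaybe; the name example is only for illustration. readMaybe returns Nothing for any field that does not parse completely as an Int, so the text fields are dropped.

```haskell
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

-- Keep only the fields that parse completely as an Int.
extractInts :: [String] -> [Int]
extractInts = mapMaybe readMaybe

example :: [Int]
example = extractInts ["John", "12323", "New York", " 2233"]
-- evaluates to [12323,2233]
```

Unlike a digit-only filter, this also accepts negative numbers and tolerates surrounding whitespace, since it follows the rules of the Read instance for Int.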

A better solution is to use a CSV parsing library such as Cassava, where you can write a data structure like this:

import Control.Monad (mzero)
import Data.Csv (FromRecord (..), (.!))

data MyRecord = MyRecord
    { name :: String
    , zipCode :: Int
    , city :: String
    , anotherField :: Int
    } deriving (Eq, Show)

instance FromRecord MyRecord where
    parseRecord v
        | length v == 4
            =   MyRecord
            <$> v .! 0
            <*> v .! 1
            <*> v .! 2
            <*> v .! 3
        | otherwise = mzero

You can then use Cassava's decode function, which parses the file more efficiently than splitOn.
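A minimal sketch of how decode could be wired up for this record, assuming the cassava, bytestring, and vector packages are installed and that the file has no header row. The names decodeRecords, integersOf, printIntegers, and sample are hypothetical and chosen only for illustration.

```haskell
import Control.Monad (mzero)
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BL8
import Data.Csv (FromRecord (..), HasHeader (NoHeader), decode, (.!))
import qualified Data.Vector as V

data MyRecord = MyRecord
    { name :: String
    , zipCode :: Int
    , city :: String
    , anotherField :: Int
    } deriving (Eq, Show)

instance FromRecord MyRecord where
    parseRecord v
        | length v == 4 = MyRecord <$> v .! 0 <*> v .! 1 <*> v .! 2 <*> v .! 3
        | otherwise     = mzero

-- Decode a whole CSV document that has no header row.
decodeRecords :: BL.ByteString -> Either String (V.Vector MyRecord)
decodeRecords = decode NoHeader

-- Hypothetical helper: collect the two integer columns of every row.
integersOf :: V.Vector MyRecord -> [Int]
integersOf = concatMap (\r -> [zipCode r, anotherField r]) . V.toList

-- Read a CSV file and print its integers, or report the parse error.
printIntegers :: FilePath -> IO ()
printIntegers path = do
    csvData <- BL.readFile path
    case decodeRecords csvData of
        Left err   -> putStrLn ("could not parse CSV: " ++ err)
        Right rows -> print (integersOf rows)

-- In-memory sample row, so the decoder can be tried without a file.
sample :: BL.ByteString
sample = BL8.pack "John,12323,New York,2233\n"
-- fmap integersOf (decodeRecords sample) evaluates to Right [12323,2233]
```

Calling printIntegers "data.csv" would handle the file from the question. Because decode returns an Either, a malformed row surfaces as an error message instead of a runtime crash from read.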

Extracting integers from a file
