Question

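I have a CSV file containing logged data that I want to process in Haskell. The data in the CSV file is in hexadecimal format. When I read it into Haskell, I get strings such as "0xFF5FFFC8EC5FFEDF", each of which represents 8 bytes of data.

To process the data, I would like to convert each string into a data type that lets me do some bit twiddling (bitwise AND, OR, and XOR). Then, when I am done, I want to convert the final result back to hexadecimal so that I can write it to a file.

Is this easy to do in Haskell? Which modules should I look at?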

Answer 1

You can use read to parse integers or floating-point numbers. It is in the Prelude, so you can use it without importing any other module.

Try:

a = "0xFF5FFFC8EC5FFEDF"
b = read a::Double

(This gives b = 1.8401707840883393e19.)
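
Note that a Double cannot represent every 64-bit value exactly and has no Bits instance, so it does not support the bitwise operations the question asks for. The following is a minimal sketch of the same read-based idea, assuming each value fits in 8 bytes so that Word64 can be used. The names parseHex64, formatHex64, and example are hypothetical helpers introduced here for illustration, and the mask constants are arbitrary.

```haskell
import Data.Bits (xor, (.&.), (.|.))
import Data.Char (toUpper)
import Data.Word (Word64)
import Numeric (showHex)
import Text.Read (readMaybe)

-- Parse a string such as "0xFF5FFFC8EC5FFEDF" into a Word64.
-- The Read instance accepts Haskell's "0x" literal syntax (and also plain
-- decimal input); values wider than 64 bits wrap around instead of failing.
parseHex64 :: String -> Maybe Word64
parseHex64 = readMaybe

-- Render a Word64 as "0x" followed by 16 upper-case hex digits.
formatHex64 :: Word64 -> String
formatHex64 w = "0x" ++ replicate (16 - length digits) '0' ++ digits
  where
    digits = map toUpper (showHex w "")

-- Illustration only: mask, set, and toggle some bits, then format the result.
example :: Maybe String
example = do
  a <- parseHex64 "0xFF5FFFC8EC5FFEDF"
  let masked  = a .&. 0x00000000FFFFFFFF       -- keep the low 4 bytes
      set     = masked .|. 0x8000000000000000  -- set the top bit
      toggled = set `xor` 0x0000000000000001   -- flip the lowest bit
  pure (formatHex64 toggled)
```

Returning Maybe keeps a malformed field in the CSV from crashing the whole program, which a bare read would do.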

Also, for parsing the CSV you can write the functions yourself. I wrote a simple CSV parser a week ago.

module CSVUtils
    ( parseCSV, showCSV
    , readCSV , writeCSV
    , colFields
    , Separator, Document
    , CSV      , Entry
    , Field
    )
where

import Data.List (intercalate, nub)
-- splitOn is provided by the split package (Data.List.Split)
import Data.List.Split (splitOn)
{-
A simple utility for working with CSV (comma-separated value) files. These
are simple textual files where fields are delimited with a character (usually a comma
or a semicolon). It is required that the CSV document is well-formed, i.e., that 
it contains an equal number of fields per row.
-}
type Separator = String
type Document = String
type CSV = [Entry]
type Entry = [Field]
type Field = String

doc = "John;Doe;15\nTom;Sawyer;12\nAnnie;Blake;20"
brokenDoc = "One;Two\nThree;Four;Five"
{-
(a) Takes a separator and a string representing a CSV document and returns a 
CSV representation of the document. 
-}
-- Note: the homework text says Separator is a Char, but the type here is String,
-- so the first guard only checks the separator's first character.
parseCSV :: Separator -> Document -> CSV
parseCSV sep doc
    | head sep `notElem` doc        = error $ "The character '" ++ sep ++ "' does not occur in the text"
    | length (nub fieldCounts) /= 1 = error "The CSV file is not well-formed"
    | otherwise                     = rows
  where
    -- Compare the number of fields per row, not the number of characters per line
    rows        = map (splitOn sep) (lines doc)
    fieldCounts = map length rows
{-
(b) Takes a separator and a CSV representation of
a document and creates a CSV string from it.
-}
showCSV :: Separator -> CSV -> Document
-- intercalate "\n" avoids calling init on an empty string when the CSV has no rows
showCSV sep = intercalate "\n" . map (intercalate sep)
{-
(c) Takes a CSV document and a field number
and returns a list of fields in that column.
-}
colFields :: Int -> CSV -> [Field]
colFields n csv =
    [ if length entry > n
        then entry !! n
        else error $ "There is no column " ++ show n ++ " in the CSV document"
    | entry <- csv ]
{-
(d) Takes a file path and a separator and returns the CSV representation of the file.
-}
readCSV :: Separator -> FilePath -> IO CSV
readCSV sep path = do
    file <- readFile path
    return $ parseCSV sep file

{-
(e) Takes a separator, a file path, and a CSV document and writes the document into a file.
The return type of writeCSV is a special case of IO: we need to wrap an impure
action, but do not actually have to return anything when writing. Thus, we
introduce (), or the unit type, which holds no information (consider it a
0-tuple).
-}
writeCSV :: Separator -> FilePath -> CSV -> IO ()
writeCSV sep path csv = writeFile path (showCSV sep csv)

Answer 2

I'm going to assume that your binary data can be of arbitrary length. If your binary data fits in, for example, an Int64, things can be simplified.

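I would recommend the following libraries and modules:

cassava for CSV parsing
bytestring for your string type
base16-bytestring to convert to and from hex strings
Data.Bits for bitwise operations on bytes, chars, ints, etc.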

For an example of how to perform bitwise operations on ByteStrings, check out the end of this tutorial from the School of Haskell:

https://www.fpcomplete.com/school/to-infinity-and-beyond/pick-of-the-week/bytestring-bits-and-pieces
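
As a rough illustration of the idea (not taken from the linked tutorial), byte-wise operations on ByteStrings can be sketched with only bytestring and Data.Bits. The names zipBytes, andBytes, orBytes, and xorBytes are hypothetical helpers introduced here; converting the hex text to and from raw bytes is left to base16-bytestring and is not shown.

```haskell
import Data.Bits (xor, (.&.), (.|.))
import qualified Data.ByteString as B
import Data.Word (Word8)

-- Combine two ByteStrings byte by byte.
-- Like zipWith on lists, the result is as long as the shorter input.
zipBytes :: (Word8 -> Word8 -> Word8) -> B.ByteString -> B.ByteString -> B.ByteString
zipBytes f a b = B.pack (B.zipWith f a b)

-- Bitwise AND, OR, and XOR lifted to whole ByteStrings.
andBytes, orBytes, xorBytes :: B.ByteString -> B.ByteString -> B.ByteString
andBytes = zipBytes (.&.)
orBytes  = zipBytes (.|.)
xorBytes = zipBytes xor
```

Because the result is truncated to the shorter input, callers that need strict 8-byte records may want to check the lengths first.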

For examples of how to use cassava, check out the examples directory of the source repository:

https://github.com/tibbe/cassava/tree/master/examples

Manipulating hexadecimal data in Haskell