Question

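I'm wondering what the best way is to get a tuple out of data read from input in Haskell. I often run into this problem in competitive programming, when the input consists of several lines of space-separated integers. Here is an example: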

To read a line of integers, I use the following function:

readInts :: IO [Int]
readInts = fmap (map read . words) getLine

Then I convert these lists into tuples of the appropriate size:

readInts :: IO (Int, Int, Int, Int)
readInts = fmap ((\l -> (l !! 0, l !! 1, l !! 2, l !! 3)) . map read . words) getLine

This approach doesn't seem very idiomatic to me.

The following syntax is more readable~~, but only works for 2-tuples~~:

readInts :: IO (Int, Int)
readInts = fmap ((\[x, y] -> (x, y)) . map read . words) getLine

(Edit: as noted in the comments, the solution above works for n-tuples in general.)

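Is there an idiomatic way to initialize a tuple from a list of integers in Haskell without using !!? Alternatively, is there a different way to handle this kind of input?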

Answer 1

How about this?

import Data.List (intercalate)

readInts :: IO (<any tuple you like>)
readInts = read . ("(" ++) . (++ ")") . intercalate "," . words <$> getLine
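To see why this works, it can help to split the pipeline into steps. The following is only a sketch: `toTupleSyntax` and `readTriple` are hypothetical helper names that are not part of the answer above, and the result type is pinned to a triple of `Int` purely for illustration.

```haskell
import Data.List (intercalate)

-- Hypothetical helper: rewrite "1 2 3" into the tuple literal "(1,2,3)".
toTupleSyntax :: String -> String
toTupleSyntax s = "(" ++ intercalate "," (words s) ++ ")"

-- Hypothetical helper: let the standard Read instance for tuples do the parsing.
-- Like read itself, this throws an exception on malformed input.
readTriple :: String -> (Int, Int, Int)
readTriple = read . toTupleSyntax
```

With these definitions, `toTupleSyntax "1 2 3"` is `"(1,2,3)"` and `readTriple "1 2 3"` is `(1,2,3)`. Because the work is done by `read`, the target type has to be known from a signature or from context.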

Answer 2

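Given that the context is "competitive programming" (a concept I'm only vaguely aware of), I'm not sure the following offers a particularly competitive alternative, but IMHO it's idiomatic to use one of the several available parser combinator libraries.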

The base package ships with a module called Text.ParserCombinators.ReadP. Here is how you could use it to parse the input file from the linked post:

module Q57693986 where

import Text.ParserCombinators.ReadP

parseNumber :: ReadP Integer
parseNumber = read <$> munch1 (`elem` ['0'..'9'])

parseTriple :: ReadP (Integer, Integer, Integer)
parseTriple =
  (,,) <$> parseNumber <*> (char ' ' *> parseNumber) <*> (char ' ' *> parseNumber)

parseLine :: ReadS (Integer, Integer, Integer)
parseLine = readP_to_S (parseTriple <* eof)

parseInput :: String -> [(Integer, Integer, Integer)]
parseInput = concatMap (fmap fst . filter (null . snd)) . fmap parseLine . lines

You can use parseInput on this input file:

1 3 10
2 5 8
10 11 0
0 0 0

Here is a GHCi session that parses the file:

*Q57693986> parseInput <$> readFile "57693986.txt"
[(1,3,10),(2,5,8),(10,11,0),(0,0,0)]

Each call to parseLine produces a list of tuples that match the parser. For example:

*Q57693986> parseLine "11 32 923"
[((11,32,923),"")]

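The second element of the tuple is whatever String is still waiting to be parsed. In the example above, parseLine has fully consumed the line, which is what I'd expect for well-formed input, so the remaining String is empty.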

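If the parser can consume the input in more than one way, it returns a list of alternatives. Again, in the example above there is only one suggested alternative, because the line has been fully consumed.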
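As a small illustration of what such a list of alternatives looks like, here is a sketch with a deliberately ambiguous parser. The names `digitsAmbiguous`, `alternatives`, and `complete` are hypothetical and exist only for this example.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

-- Hypothetical parser, for illustration only: many1 may stop after any
-- non-empty run of digits, so it is deliberately ambiguous.
digitsAmbiguous :: ReadP String
digitsAmbiguous = many1 (satisfy isDigit)

-- Every way of consuming a prefix of the input shows up as one alternative.
alternatives :: [(String, String)]
alternatives = readP_to_S digitsAmbiguous "12"

-- Requiring eof leaves only the parse that consumes the whole input.
complete :: [(String, String)]
complete = readP_to_S (digitsAmbiguous <* eof) "12"
```

Here `alternatives` should contain both `("1","2")` and `("12","")`, while `complete` contains only `("12","")`. By contrast, munch1 always takes the longest run of matching characters, which is one reason the number parser above yields a single result.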

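The parseInput function discards every tuple whose input has not been fully consumed, and then keeps only the first element of each remaining tuple.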
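One consequence worth seeing in a concrete case is that a line which doesn't match is dropped silently instead of causing an error. The sketch below restates the parser in compact form so that it compiles on its own; `number`, `triple`, `parseInputSketch`, and `sample` are hypothetical names, and the sample input is made up for the example.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

-- Same shape as the number parser above, restated so the snippet is standalone.
number :: ReadP Integer
number = read <$> munch1 isDigit

-- Three numbers separated by exactly one space each.
triple :: ReadP (Integer, Integer, Integer)
triple = (,,) <$> number <*> (char ' ' *> number) <*> (char ' ' *> number)

-- Keep only complete parses; a line that doesn't match contributes nothing.
parseInputSketch :: String -> [(Integer, Integer, Integer)]
parseInputSketch = concatMap (map fst . readP_to_S (triple <* eof)) . lines

-- Made-up input with one malformed line in the middle.
sample :: [(Integer, Integer, Integer)]
sample = parseInputSketch "1 3 10\nnot numbers\n2 5 8\n"
```

With this input, `sample` is `[(1,3,10),(2,5,8)]`: the middle line produces no parse and simply disappears. Since the line parser already ends with eof, every surviving result has an empty remainder, so this sketch skips the explicit filter. Whether silently dropping bad lines is acceptable depends on how much you trust the input.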

This approach has often worked for me on puzzles such as Advent of Code, where the input files tend to be well-formed.

Answer 3

Here is a way to build a parser that works generically for tuples of any reasonable size. It requires the generics-sop library.

{-# LANGUAGE DeriveGeneric, DeriveAnyClass, 
             FlexibleContexts, TypeFamilies, TypeApplications #-}

import GHC.Generics
import Generics.SOP
import Generics.SOP (hsequence, hcpure,Proxy,to,SOP(SOP),NS(Z),IsProductType,All)
import Data.Char
import Text.ParserCombinators.ReadP
import Text.ParserCombinators.ReadPrec
import Text.Read

componentP :: Read a => ReadP a
componentP = munch isSpace *> readPrec_to_P readPrec 1

productP :: (IsProductType a xs, All Read xs) => ReadP a
productP = 
    let parserOutside = hsequence (hcpure (Proxy @Read) componentP)
     in Generics.SOP.to . SOP . Z <$> parserOutside

For example:

*Main> productP @(Int,Int,Int) `readP_to_S` " 1 2 3 "
[((1,2,3)," ")]

It allows components of different types, as long as they all have a Read instance.

It also parses records that have a Generics.SOP.Generic instance:

data Stuff = Stuff { x :: Int, y :: Bool } 
             deriving (Show,GHC.Generics.Generic,Generics.SOP.Generic)

For example:

*Main> productP @Stuff `readP_to_S` " 1 True"
[(Stuff {x = 1, y = True},"")]

Tuple initialization from IO data in Haskell
