Question

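After seeing an earlier post about indexing a text file, I became curious, since I am learning Haskell (which is not going all that well), whether the same goal can be achieved easily in Haskell: How do I create a Java string from the contents of a file?

The goal of the exercise is to read a file, sort the words in it alphabetically, and display the line numbers on which each word appears.

Can a function in Haskell be built in a similar way?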

Answer 1

Here is a function that does the heavy lifting you need:

import Data.List

sortWords :: String -> [(Int,String)]
sortWords contents = sortBy (\(_,w1) (_,w2) -> compare w1 w2) ys
  where xs = zip [1..] (lines contents)
        ys = concatMap (\(n,line) -> zip (repeat n) (words line)) xs

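The only thing left to do is write some simple IO code to read the file and print the results nicely. If you run this function on the following input: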

Here are some test
lines to see if this works...
What: about? punctuation!

you will get this:

(1,"Here")
(3,"What:")
(3,"about?")
(1,"are")
(2,"if")
(2,"lines")
(3,"punctuation!")
(2,"see")
(1,"some")
(1,"test")
(2,"this")
(2,"to")
(2,"works...")
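
The remaining IO code mentioned above could look roughly like the following. This is only a minimal sketch, not part of the original answer: the helper `formatEntry`, the output layout, and the choice to take the file path from the command line are all assumptions made for illustration. `sortWords` is repeated so the example is self-contained, with `comparing snd` standing in for the explicit comparison lambda.

```haskell
import Data.List (sortBy)
import Data.Ord (comparing)
import System.Environment (getArgs)

-- Pair every word with its line number, then sort by word.
-- sortBy is stable, so equal words stay in line order.
sortWords :: String -> [(Int, String)]
sortWords contents = sortBy (comparing snd) ys
  where
    xs = zip [1 ..] (lines contents)
    ys = concatMap (\(n, line) -> zip (repeat n) (words line)) xs

-- Hypothetical helper: render one entry as "word (line N)".
formatEntry :: (Int, String) -> String
formatEntry (n, w) = w ++ " (line " ++ show n ++ ")"

-- Assumption: the file to index is given as the only command-line argument.
main :: IO ()
main = do
  args <- getArgs
  case args of
    [path] -> readFile path >>= mapM_ (putStrLn . formatEntry) . sortWords
    _      -> putStrLn "usage: indexer FILE"
```

Note that `words` splits on whitespace only, so punctuation stays attached to the words, exactly as in the output above.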

Answer 2

Here is a Haskell solution:

import Data.List (sortBy)
import Data.Function (on)

numberWordsSorted :: String -> [(String,Int)]
numberWordsSorted  = sortBy (compare `on` fst)
                   . concat 
                   . zipWith (\n xs -> [(x,n)|x<- xs]) [1..]   -- add line numbers
                   . map words
                   . lines

main = fmap numberWordsSorted (readFile "example.txt") >>= mapM_ print

If you run it on an example.txt with the following contents:

the quick brown fox
jumps over the lazy dog

you get:

("brown",1)
("dog",2)
("fox",1)
("jumps",2)
("lazy",2)
("over",2)
("quick",1)
("the",1)
("the",2)

If you would rather see the [1,2] than two separate lines of output, you should look at Data.Map instead of a list of pairs.
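
One way the Data.Map suggestion might be realized is sketched below. It is not the answerer's code: the name `indexWords` and the output layout are assumptions for illustration, and the sketch relies on `Data.Map.Strict` from the containers package that ships with GHC.

```haskell
import qualified Data.Map.Strict as Map

-- Map each word to the list of line numbers it occurs on, in ascending order.
-- fromListWith calls the combining function as (f new old), so flip (++)
-- appends each new line number after the ones already collected.
-- A word that occurs twice on one line gets that line number twice.
indexWords :: String -> Map.Map String [Int]
indexWords contents =
  Map.fromListWith (flip (++))
    [ (w, [n]) | (n, line) <- zip [1 ..] (lines contents), w <- words line ]

main :: IO ()
main = do
  contents <- readFile "example.txt"
  -- toAscList yields the entries sorted by word.
  mapM_ (\(w, ns) -> putStrLn (w ++ " " ++ show ns))
        (Map.toAscList (indexWords contents))
```

For the example.txt shown above, the two entries for "the" collapse into a single line reading `the [1,2]`.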

Complete text file indexer in Haskell
