Display stemmed words and their stems using Haskell

Time: 2014-01-09 14:39:36

Tags: haskell functional-programming

Hi, I am new to Haskell and functional programming.

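I want to find the stemmed words in a string and display each such word together with the word that remains after the stem is removed.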

E.g. if the string is: "he is a good fisher man. he is fishing and catched two fish"
the output should be: [(fisher,fish), (fishing,fish), (catched,catch)]

I tried this:

import Data.List (isSuffixOf)

hasEnding endings w = any (`isSuffixOf` w) endings
wordsWithEndings endings ws = filter (hasEnding endings) ws
wordsEndingEdOrIng ws = wordsWithEndings ["ed","ing","er"] . words $ ws


stemming :: String -> String
stemming []        = []
stemming (x:"ing") = [x]
stemming (x:"ed")  = [x] 
stemming (x:"er")  = [x]
stemming (x:xs)    = x : stemming xs

removestemmings :: String -> String
removestemmings = unwords . map stemming . words


findwords = wordsEndingEdOrIng . removestemmings

This does not work: the result is [].

Can anyone help me with this?

1 Answer:

Answer 0 (score: 1)

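Your findwords function does exactly what you told it to do. It first removes the stems (the endings "ed", "ing" and "er") from every word, and then filters out every word that does not have one. After the first step that is all of them, so the result is [].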
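
To see why the result is empty, the two steps can be run separately. The sketch below reuses the question's stemming function unchanged; stemmedSentence and survivors are names introduced here only for illustration and stand in for removestemmings and wordsEndingEdOrIng.

```haskell
import Data.List (isSuffixOf)

-- The question's stemming function, unchanged.
stemming :: String -> String
stemming []        = []
stemming (x:"ing") = [x]
stemming (x:"ed")  = [x]
stemming (x:"er")  = [x]
stemming (x:xs)    = x : stemming xs

-- Step 1 (illustrative name): strip the endings from every word.
stemmedSentence :: String -> String
stemmedSentence = unwords . map stemming . words

-- Step 2 (illustrative name): keep only the words that still end
-- in one of the endings.
survivors :: String -> [String]
survivors = filter hasEnding . words
  where hasEnding w = any (`isSuffixOf` w) ["ed", "ing", "er"]
```

In GHCi, stemmedSentence "he is fishing and catched two fish" gives "he is fish and catch two fish", and applying survivors to that string gives [], because no word in it ends in "ed", "ing" or "er" any more.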

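What you want to do instead is remove the stems from every word, zip the result with the original word list, and then keep only the pairs whose original word had a stem: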

import Data.List (isSuffixOf)

-- Reuses `stemming` from the question.

-- Operate on a single word only: does it end in one of the endings?
hasStem :: String -> Bool
hasStem w = any (`isSuffixOf` w) ["ed", "ing", "er"]

-- Let this function work on a list of words instead
removeStemmings :: [String] -> [String]
removeStemmings = map stemming

-- findWords now takes a sentence, splits it into words, removes the stemmings,
-- zips the result with the original word list, and keeps the pairs whose
-- original word had a stem
findWords :: String -> [(String, String)]
findWords sentence = filter (hasStem . fst) . zip ws $ removeStemmings ws
    where ws = words sentence

> findWords "he is a good fisher man. he is fishing and catched two fish"
[("fisher","fish"),("fishing","fish"),("catched","catch")]
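
The same pairing can also be done in one pass, without zipping and filtering. This is only a sketch of an alternative, not part of the answer above: dropEnding, stemPair and findWordsOnePass are hypothetical names, and the suffix removal is built from Data.List.stripPrefix on reversed strings rather than from the question's stemming function.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (listToMaybe, mapMaybe)

-- Endings treated as removable; the same three used in the question.
endings :: [String]
endings = ["ed", "ing", "er"]

-- Hypothetical helper: remove an ending if the word has it,
-- implemented with stripPrefix on the reversed strings.
dropEnding :: String -> String -> Maybe String
dropEnding ending w = reverse <$> stripPrefix (reverse ending) (reverse w)

-- Pair a word with its stem, or Nothing when no ending matches.
-- The first matching ending in the list wins.
stemPair :: String -> Maybe (String, String)
stemPair w = listToMaybe [ (w, stem) | e <- endings, Just stem <- [dropEnding e w] ]

-- Hypothetical one-pass variant of findWords.
findWordsOnePass :: String -> [(String, String)]
findWordsOnePass = mapMaybe stemPair . words
```

On the example sentence this returns the same three pairs as findWords. One difference to be aware of: a word that consists only of an ending, such as "er", yields an empty stem here, whereas stemming leaves such a word unchanged.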