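Hi, I am new to Haskell and functional programming.
I want to find the words in a string that carry a stemming suffix (such as "ing", "ed" or "er") and display each such word together with the word that is left after removing the suffix.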
e.g. if the string is: "he is a good fisher man. he is fishing and catched two fish"
the output should be: [(fisher,fish), (fishing,fish), (catched,catch)]
This is what I tried:
hasEnding endings w = any (`isSuffixOf` w) endings
wordsWithEndings endings ws = filter (hasEnding endings) ws
wordsEndingEdOrIng ws = wordsWithEndings ["ed","ing","er"] . words $ ws
stemming :: String -> String
stemming [] = []
stemming (x:"ing") = [x]
stemming (x:"ed") = [x]
stemming (x:"er") = [x]
stemming (x:xs) = x : stemming xs
removestemmings :: String -> String
removestemmings = unwords . map stemming . words
findwords = wordsEndingEdOrIng .removestemmings
This does not work; the result is [].
Can anyone help me do this?
Answer 0 (score: 1)
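Your findwords function is doing exactly what you tell it to do. First it removes the stemming from each word, and only then does it filter out every word that has no stemming. At that point no word has one any more, so the filter drops all of them.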
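To make the empty result concrete, here is a minimal sketch that runs the two stages of the original pipeline separately. The definitions are the asker's, lightly condensed; the names sample, afterStep1 and afterStep2 are hypothetical and exist only for this illustration.

```haskell
import Data.List (isSuffixOf)

-- The asker's stemming function, unchanged: it cuts the word off
-- right before a trailing "ing", "ed" or "er".
stemming :: String -> String
stemming [] = []
stemming (x:"ing") = [x]
stemming (x:"ed") = [x]
stemming (x:"er") = [x]
stemming (x:xs) = x : stemming xs

-- Stage 1 of the original pipeline: strip the suffix from every word.
removestemmings :: String -> String
removestemmings = unwords . map stemming . words

-- Stage 2 of the original pipeline: keep only words that end in a suffix.
wordsEndingEdOrIng :: String -> [String]
wordsEndingEdOrIng = filter hasEnding . words
  where hasEnding w = any (`isSuffixOf` w) ["ed", "ing", "er"]

-- Hypothetical sample input, taken from the example sentence.
sample :: String
sample = "he is a good fisher man. he is fishing and catched two fish"

-- After stage 1 no word ends in a suffix any more ...
afterStep1 :: String
afterStep1 = removestemmings sample

-- ... so stage 2 has nothing left to keep.
afterStep2 :: [String]
afterStep2 = wordsEndingEdOrIng afterStep1
```

Here afterStep1 evaluates to "he is a good fish man. he is fish and catch two fish", and afterStep2 is the empty list, which matches the [] reported in the question.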
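What you want to do instead is remove all the stemmings, zip that list with the list of original words, and then filter the zipped list, keeping the pairs whose original word had a stemming: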
-- Needs isSuffixOf from Data.List, and reuses stemming from the question.
-- Operate on a single word only.
hasStem :: String -> Bool
hasStem w = or $ zipWith isSuffixOf ["ed", "ing", "er"] $ repeat w
-- Let this function work on a list of words instead
removeStemmings :: [String] -> [String]
removeStemmings = map stemming
-- findWords now takes a sentence, splits it into words, removes the stemmings,
-- zips the result with the original word list, and keeps only the pairs
-- whose original word had a stemming
findWords :: String -> [(String, String)]
findWords sentence = filter (hasStem . fst) . zip ws $ removeStemmings ws
  where ws = words sentence
> findWords "he is a good fisher man. he is fishing and catched two fish"
[("fisher","fish"),("fishing","fish"),("catched","catch")]
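For convenience, the pieces above can be assembled into one file that loads on its own. This is only a sketch under a few assumptions: stemming is copied from the question, hasStem is written with any instead of zipWith and repeat (it should behave the same), and the names suffixes and sample are hypothetical additions.

```haskell
import Data.List (isSuffixOf)

-- Suffixes treated as stemmings, as listed in the question.
suffixes :: [String]
suffixes = ["ed", "ing", "er"]

-- The asker's stemming function, unchanged: it cuts the word off
-- right before a trailing "ing", "ed" or "er".
stemming :: String -> String
stemming [] = []
stemming (x:"ing") = [x]
stemming (x:"ed") = [x]
stemming (x:"er") = [x]
stemming (x:xs) = x : stemming xs

-- True when the word ends in one of the suffixes.
hasStem :: String -> Bool
hasStem w = any (`isSuffixOf` w) suffixes

-- Pair every word with its stemmed form, then keep only the pairs
-- whose original word ended in a suffix.
findWords :: String -> [(String, String)]
findWords sentence = filter (hasStem . fst) (zip ws (map stemming ws))
  where ws = words sentence

-- Hypothetical sample input, the sentence used in the answer.
sample :: String
sample = "he is a good fisher man. he is fishing and catched two fish"
```

Evaluating findWords sample in GHCi should give the same three pairs as shown above. Note that the check looks only at the ending of a word, so an unrelated word such as "other" would be reported as ("other","oth"), and an irregular form such as "caught" would not be found at all.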