Removing stemming words using Haskell

Date: 2013-12-30 16:11:34

Tags: haskell functional-programming

I am new to Haskell and functional programming.

My goal is to remove stemming suffixes from the words of a given string.

e.g. input is:  "he is fishing and catched two fish"
     output is: "he is fish and catch two fish"

I tried to do this with the following code. It removes only "ed" but not "ing".

import Data.List (isSuffixOf)

removeStemming :: String -> String
removeStemming xs
  | "ing" `isSuffixOf` xs = take (length xs - 3) xs
  | "ed"  `isSuffixOf` xs = take (length xs - 2) xs
  | otherwise             = xs

Can anyone help me fix this error? Please.

3 Answers:

Answer 0: (score: 3)

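The problem is that you are applying removeStemming to the whole string, when you want to apply it to each word. You can do

> unwords $ map removeStemming $ words "he is fishing and catched two fish"
"he is fish and catch two fish"

The words function splits a string on whitespace and returns a list of all the words; unwords does the opposite (note: unwords . words is not equivalent to id in general). You can map your removeStemming function as-is over the output of words text, then rejoin the result with unwords.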
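Here is a minimal self-contained sketch of that approach, assuming the asker's removeStemming is kept unchanged. The name stemSentence is hypothetical and only used for illustration.

```haskell
import Data.List (isSuffixOf)

-- The asker's function, unchanged: strips one trailing "ing" or "ed"
-- from the string it is given.
removeStemming :: String -> String
removeStemming xs
  | "ing" `isSuffixOf` xs = take (length xs - 3) xs
  | "ed"  `isSuffixOf` xs = take (length xs - 2) xs
  | otherwise             = xs

-- Hypothetical helper: split into words, stem each word, rejoin.
stemSentence :: String -> String
stemSentence = unwords . map removeStemming . words
```

With these definitions, stemSentence "he is fishing and catched two fish" evaluates to "he is fish and catch two fish". The caveat about id shows up with irregular spacing: unwords (words "two  spaces ") evaluates to "two spaces", so repeated or trailing whitespace is not preserved.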

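Answer 1: (score: 2)

As I already said in my original answer, it is easier if you handle only one word at a time:

  Solve it for one word at a time, which makes things easier: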

removeStemming :: String -> String
removeStemming []        = []
removeStemming (x:"ing") = [x]
removeStemming (x:"ed")  = [x] --new, since ed wasn't part of the last question
removeStemming (x:xs)    = x : removeStemming xs

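If you look at the definition of removeStemming, you will notice that it only removes the last stemming suffix. So removeStemming works on a single word only.

If you want to apply it to several words, you need to apply it to each word: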

removeAllStemmings :: String -> String
removeAllStemmings = unwords . map removeStemming . words

After that, you can use removeAllStemmings "he is fishing and catched two fish".

Answer 2: (score: 1)
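To see what the pattern-matching version does on individual words, here is a small sketch that repeats the two definitions from this answer and evaluates them on a few inputs. The examples list and its word choices are my own, added only for illustration.

```haskell
-- Definitions repeated from the answer so the block stands alone.
removeStemming :: String -> String
removeStemming []        = []
removeStemming (x:"ing") = [x]
removeStemming (x:"ed")  = [x]
removeStemming (x:xs)    = x : removeStemming xs

removeAllStemmings :: String -> String
removeAllStemmings = unwords . map removeStemming . words

-- Illustrative inputs paired with what removeStemming returns for them.
examples :: [(String, String)]
examples =
  [ (w, removeStemming w)
  | w <- ["fishing", "catched", "fish", "ring", "red"] ]
```

Evaluating examples in GHCi gives [("fishing","fish"),("catched","catch"),("fish","fish"),("ring","r"),("red","r")]. The last two pairs show that short words which merely end in "ing" or "ed" are cut down as well, so this is a suffix stripper rather than a real stemmer.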

If we want to remove every "ing" and "ed", let's change Zeta's answer:

removeStemming :: String -> String
removeStemming []                    = []
removeStemming ('i':'n':'g':' ':xs)  = ' ' : removeStemming xs -- keep the space so words stay separated
removeStemming ('e':'d':' '    :xs)  = ' ' : removeStemming xs
removeStemming "ing"                 = []
removeStemming "ed"                  = []
removeStemming (x:xs)                = x : removeStemming xs

If we don't want to reduce "ring" to "r", I would handle removeStemming "ing" and removeStemming ('i':'n':'g':' ':xs) as separate cases.
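One possible way to avoid cutting "ring" down to "r" is to work word by word and strip a suffix only when enough of the word is left over. The sketch below is a heuristic of my own, not part of any answer above: the names minStemLength, removeStemmingGuarded and removeAllStemmingsGuarded are hypothetical, and the threshold of 3 is an arbitrary assumption.

```haskell
import Data.List (isSuffixOf)

-- Hypothetical threshold: the shortest stem we are willing to leave behind.
minStemLength :: Int
minStemLength = 3

-- Strip the first matching suffix, but only if the remaining stem is
-- at least minStemLength characters long; otherwise keep the word as is.
removeStemmingGuarded :: String -> String
removeStemmingGuarded w =
  case [ stem
       | suffix <- ["ing", "ed"]
       , suffix `isSuffixOf` w
       , let stem = take (length w - length suffix) w
       , length stem >= minStemLength ] of
    (stem:_) -> stem
    []       -> w

-- Apply the guarded version to every word of a sentence.
removeAllStemmingsGuarded :: String -> String
removeAllStemmingsGuarded = unwords . map removeStemmingGuarded . words
```

With this guard, removeAllStemmingsGuarded "he is fishing and catched two fish" still gives "he is fish and catch two fish", while removeAllStemmingsGuarded "the ring is red" returns its input unchanged. A length check like this is crude; it only postpones the problem for longer words that happen to end in "ing" or "ed".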