Question

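I wrote a function that cleans up number words for processing in Haskell. It needs to change - into a space (i.e. forty-five becomes forty five) and delete every other non-letter. I can define it recursively, but I would really like to do something cleaner.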

clean :: String -> String
clean "" = ""
clean ('-':cs) = ' ' : clean cs
clean (c:cs)
    | isLetter c  = c : clean cs
    | otherwise   = clean cs

This led me to define a custom filter, plus a replace based on a comment on this answer that uses Data.List.Split, since I am already using Data.List.Split.

clean :: String -> String
clean = filter (\c -> isLetter c || c == ' ') . replace "-" " " . filter (/= ' ')
  where
    replace :: String -> String -> String -> String
    replace old new = intercalate new . splitOn old

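This version is messier overall, and it needs an extra filter (/= ' ') pass just so that spaces in the original string are removed. Is there a different idiom, or something built in, that would let me do this as a clean one-liner?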

Answer 1

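One of the most powerful functions for working with lists is concatMap (a.k.a. >>= for lists, with the arguments flipped). You can write your clean function like this: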

clean :: String -> String
clean = concatMap (\c -> if c == '-' then " " else [c | isLetter c])
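
To make the "a.k.a. >>=" remark concrete, here is a minimal sketch that writes the same function both ways. The name cleanBind is hypothetical and only used for illustration; the logic is taken directly from the one-liner above.

```haskell
import Data.Char (isLetter)

-- concatMap version, as given in the answer.
clean :: String -> String
clean = concatMap (\c -> if c == '-' then " " else [c | isLetter c])

-- Hypothetical name: the same function written with (>>=) on lists.
-- For lists, xs >>= f is concatMap f xs.
cleanBind :: String -> String
cleanBind s = s >>= \c -> if c == '-' then " " else [c | isLetter c]
```

With these definitions, clean "forty-five!" and cleanBind "forty-five!" both evaluate to "forty five".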

Answer 2

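There are two things to do here:

(1) remove everything that is not a letter or a hyphen; and
(2) then replace the hyphens with spaces.

We can therefore use a pipeline that contains a filter and a map: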

  import Data.Bool(bool)
  import Data.Char(isLetter)

   map (\x -> bool ' ' x (x /= '-')) . filter (\x -> isLetter x || x == '-')
-- \____________ __________________/   \______________ ____________________/
--              v                                     v
--             (2)                                   (1)

We can also do the mapping and filtering with a list comprehension, like:

import Data.Bool(bool)
import Data.Char(isLetter)

clean l = [bool ' ' x (x /= '-') | x <- l, isLetter x || x == '-']

We can also do this with a single function, for example concatMap:

import Data.Bool(bool)
import Data.Char(isLetter)

concatMap (\x -> bool (bool "" " " (x == '-')) [x] (isLetter x))

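So here we concatenate the results of mapping each x to " " in case x is a hyphen, to the empty string in case x is neither a letter nor a hyphen, or to [x] (a one-character string) if x is a letter.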
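
For comparison, the three variants in this answer can be put side by side as named functions. This is only a sketch: the names cleanPipeline, cleanComprehension and cleanConcatMap are hypothetical, and the bodies are the expressions shown above with type signatures added.

```haskell
import Data.Bool (bool)
import Data.Char (isLetter)

-- Filter first (1), then map hyphens to spaces (2).
cleanPipeline :: String -> String
cleanPipeline =
  map (\x -> bool ' ' x (x /= '-')) . filter (\x -> isLetter x || x == '-')

-- The same filter and map expressed as a list comprehension.
cleanComprehension :: String -> String
cleanComprehension l = [bool ' ' x (x /= '-') | x <- l, isLetter x || x == '-']

-- A single pass: each character becomes a string of length zero or one.
cleanConcatMap :: String -> String
cleanConcatMap = concatMap (\x -> bool (bool "" " " (x == '-')) [x] (isLetter x))
```

All three should agree on any input; for instance, each maps "twenty-three, 42" to "twenty three". Note that bool takes the False result first, so bool ' ' x (x /= '-') yields x when x is not a hyphen.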

Answer 3

This is a very good use case for do notation in the list monad.

clean :: String -> String
clean string = do
  character <- string         -- For each character in the string...
  case character of
    '-'            -> " "     -- If it’s a dash, replace with a space.
    c | isLetter c -> pure c  -- If it’s a letter, return it.
    _              -> []      -- Otherwise, discard it.

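This is ultimately just syntactic sugar for concatMap. If you prefer, you can write [c] instead of pure c; less importantly, " " could also be written as pure ' ' or [' ']. As an alternative, you might find the MultiWayIf extension more readable: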

if
  | character == '-'   -> " "
  | isLetter character -> pure character
  | otherwise          -> []
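
The MultiWayIf fragment above is not a complete definition on its own. A minimal sketch of how it might sit inside the do block follows, assuming the extension is enabled with a LANGUAGE pragma at the top of the file (it could equally be enabled in the project configuration).

```haskell
{-# LANGUAGE MultiWayIf #-}

import Data.Char (isLetter)

clean :: String -> String
clean string = do
  character <- string                       -- for each character in the string
  if
    | character == '-'   -> " "             -- dash becomes a space
    | isLetter character -> pure character  -- letters are kept
    | otherwise          -> []              -- everything else is dropped
```

The guards are indented further than the statements of the do block, so they are read as part of the if rather than as new statements.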

Finally, note that isLetter returns True for all Unicode letters. If you only care about ASCII, you may want to use isAscii c && isLetter c or isAsciiUpper c || isAsciiLower c instead.
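
As an illustration of the ASCII-only variant, here is a small sketch built on concatMap. The names cleanAscii and step are hypothetical; isAsciiUpper and isAsciiLower come from Data.Char.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper)

-- Hypothetical ASCII-only variant: non-ASCII letters are dropped as well.
cleanAscii :: String -> String
cleanAscii = concatMap step
  where
    step c
      | c == '-'                         = " "  -- dash becomes a space
      | isAsciiUpper c || isAsciiLower c = [c]  -- keep A-Z and a-z only
      | otherwise                        = ""   -- drop everything else
```

With this definition an accented letter such as é is discarded, whereas the isLetter-based versions would keep it.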

How can I replace multiple cases in Haskell?
