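How do I search for a Char at given positions in a list of strings, and then split them into sublists?

Time: 2014-03-31 21:14:00

Tags: string list haskell filter

I apologize if this sounds really confusing, but I can't find a better way to explain it. I am trying to take a character from the user and filter a word bank down to the words that have that character in particular positions. For example, if the character given is 'e':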

---- matches ["ally","cool","good"]
-e-- matches ["beta","deal"]
e--e matches ["else"]
--e- matches ["flew","ibex"]
---e matches ["hope"]

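I then need to take the largest list and return it as the new word bank, repeating until only one word is left. I am having a hard time wrapping my head around this. Any tips?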

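2 Answers:

Answer 0: (score: 1)

While regular expressions would be many people's first choice, this task can be done quite easily without them. First, you need to decide on your data types. I would represent the pattern in a form that is easier to work with than a plain string: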

data Character = Wildcard | Literal Char deriving (Eq, Show)
type Pattern = [Character]

buildPattern :: String -> Pattern
buildPattern [] = []
buildPattern ('-':rest) = Wildcard : buildPattern rest
buildPattern (x:rest) = Literal x : buildPattern rest

-- buildPattern "-e---" = [Wildcard, Literal 'e', Wildcard, Wildcard, Wildcard]

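Now we need a way to match a string against a pattern. This is the real meat of the problem, and I have deliberately left some holes for you to fill in: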

match :: Pattern -> String -> Bool
match [] "" = True   -- An empty pattern matches an empty string
match [] _  = False  -- An empty pattern doesn't match a non-empty string
match (Wildcard:pat) (_:rest) = ???
match (Literal c:pat) (x:rest)
    | c == x    = ???
    | otherwise = False

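I'll leave it to you to figure out what goes in those holes. Remember that Wildcard should match any character, while Literal should match only that exact character. You will have to use recursion here, but it isn't too hard. If you get stuck, leave a comment and tell me how far you got.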
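
For readers who want to check their attempt, here is one possible way to fill in the holes. It is only a sketch: the extra `match _ ""` clause is my own addition (it covers a pattern that is longer than the string), and `buildPattern` is rewritten with `map` purely for brevity.

```haskell
data Character = Wildcard | Literal Char deriving (Eq, Show)
type Pattern = [Character]

-- Same conversion as above, written with map: '-' is a wildcard,
-- anything else is a literal character.
buildPattern :: String -> Pattern
buildPattern = map toCharacter
  where
    toCharacter '-' = Wildcard
    toCharacter c   = Literal c

match :: Pattern -> String -> Bool
match [] "" = True    -- an empty pattern matches an empty string
match [] _  = False   -- an empty pattern doesn't match a non-empty string
match _  "" = False   -- (added) pattern left over but the string ran out
match (Wildcard:pat) (_:rest) = match pat rest   -- any character is fine
match (Literal c:pat) (x:rest)
    | c == x    = match pat rest                 -- keep checking the tail
    | otherwise = False
```

With this, `match (buildPattern "-e--") "beta"` evaluates to `True`, while the same pattern rejects `"else"` and any word that is not exactly four characters long.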


Now, you could also solve this without creating a new data type, using only built-in Haskell functions. Something like
type Pattern = String

match :: Pattern -> String -> Bool
match pat str = and [c == x | (c, x) <- zip pat str, c /= '-']

(Note that this version does not check that the pattern and the string have the same length.)
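
If the length check matters, a small variation could add it explicitly. This is a sketch only, and `matchSameLength` is a name made up for this example:

```haskell
type Pattern = String

-- Same idea as the zip-based version, plus an explicit length check,
-- because zip silently truncates to the shorter of its two arguments.
matchSameLength :: Pattern -> String -> Bool
matchSameLength pat str =
    length pat == length str
        && and [c == x | (c, x) <- zip pat str, c /= '-']
```

Without the check, a pattern such as "-e--" would also accept the shorter string "be", since only the zipped prefix is compared.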

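However, I would advise against doing it this way. Why? What if you suddenly get a requirement that your patterns also handle the form --[ea]- so that they match both then and than? With the data type representation, you can easily extend it to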

import Data.List (span)

data Character = Wildcard | Literal Char | Class [Char] deriving (Eq, Show)
type Pattern = [Character]

buildPattern :: String -> Pattern
buildPattern [] = []
buildPattern ('-':pat) = Wildcard : buildPattern pat
buildPattern ('[':pat) = Class chars : buildPattern (drop 1 rest)
    where (chars, rest) = span (/= ']') pat
    -- span (/= ']') "ea]-" == ("ea", "]-")
    -- essentially splits at a condition, but we want to drop the ']'
    -- off of it so I used (drop 1 rest)
buildPattern (x:pat) = Literal x : buildPattern pat

match :: Pattern -> String -> Bool
match [] "" = True
match [] _  = False
match (Wildcard :pat) (_:rest) = ...
match (Literal c:pat) (x:rest) = ...
match (Class  cs:pat) (x:rest) = ...

You can keep building up your pattern language quite easily to support many different kinds of patterns.
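
As an illustration of how the extended version might be completed, here is a sketch. The helper `accepts` is a name invented for this example, and collapsing the three `match` clauses into one is my own choice rather than the only way to fill in the elided bodies.

```haskell
data Character = Wildcard | Literal Char | Class [Char] deriving (Eq, Show)
type Pattern = [Character]

buildPattern :: String -> Pattern
buildPattern [] = []
buildPattern ('-':pat) = Wildcard : buildPattern pat
buildPattern ('[':pat) = Class chars : buildPattern (drop 1 rest)
    where (chars, rest) = span (/= ']') pat
buildPattern (x:pat) = Literal x : buildPattern pat

-- Hypothetical helper: does one pattern element accept one character?
accepts :: Character -> Char -> Bool
accepts Wildcard    _ = True
accepts (Literal c) x = c == x
accepts (Class cs)  x = x `elem` cs

match :: Pattern -> String -> Bool
match [] "" = True
match (p:pat) (x:rest) = accepts p x && match pat rest
match _ _ = False   -- lengths differ
```

Adding another kind of pattern element then only needs a new constructor, a new `buildPattern` clause, and a new `accepts` clause.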


If you want to use a Map to store your word bank, you could do something like
-- This is recommended because many functions in Data.Map conflict with built-ins
import qualified Data.Map as M
import Data.Maybe (fromMaybe)

-- code from above goes here

wordBank :: M.Map Int [String]
wordBank = M.fromList
    [ (1, ["a"])
    , (2, ["as", "at", "or", "on", "in", "is"])
    , (3, ["and", "the", "and", "are", "dog", "cat"])
    , (4, ["then", "than", "that", "bath", "from", "moon")
    -- ...
    ]

wordsOfLength :: Int -> [String]
wordsOfLength len = fromMaybe [] $ M.lookup len wordBank
-- Default to an empty list if the lookup fails (i.e. len = -1)

wordsMatching :: Pattern -> [String]
wordsMatching pat =
    let possibles = wordsOfLength $ length pat
    in filter (match pat) possibles
-- Or just
-- wordsMatching pat = filter (match pat) $ wordsOfLength $ length pat
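
The remaining step from the question, grouping the bank by where the guessed character occurs and keeping the largest group, could also be done with a Map. The following is a rough sketch that is independent of the Pattern type above; `mask`, `groupByMask` and `largestGroup` are names made up for illustration.

```haskell
import qualified Data.Map as M
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Mask of a word for a guessed character: keep the character where it
-- occurs and put '-' everywhere else, e.g. mask 'e' "else" == "e--e".
mask :: Char -> String -> String
mask c = map (\x -> if x == c then c else '-')

-- Group a word bank by mask. flip (++) appends each new word after the
-- ones already stored, so every group keeps the original word order.
groupByMask :: Char -> [String] -> M.Map String [String]
groupByMask c ws = M.fromListWith (flip (++)) [(mask c w, [w]) | w <- ws]

-- The largest group becomes the new word bank (empty bank stays empty).
largestGroup :: Char -> [String] -> [String]
largestGroup c ws
    | null ws   = []
    | otherwise = maximumBy (comparing length) (M.elems (groupByMask c ws))
```

For the nine words in the question and the character 'e', `largestGroup` returns `["ally","cool","good"]`. Repeating the call with a new character on that result shrinks the bank further. How ties between equally large groups should be broken is not specified in the question, so this sketch simply takes whatever `maximumBy` picks.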

Answer 1: (score: 0)

Another approach is to use the Data.List.Split library. It has a really cool function called splitOn. So if you have a pattern string

[ghci] let pat = "e--e"
[ghci] import Data.List.Split
[ghci] splitOn "e" pat
Loading package split-0.2.2 ... linking ... done.
["","--",""]
[ghci] 

Notice that all you need is the lengths of the values in the list. So, if you create a function that gets the length of each item in ...

[ghci] let f c str = map length $ splitOn [c] str

This function can then be used as:

[ghci] f 'e' pat
[0,2,0]
[ghci] f 'e' "else"
[0,2,0]
[ghci] f 'e' "beta"
[1,2]

Now the lists can be compared directly, as follows...

[ghci] [0,2,0] == [0,2,0]
True
[ghci] [0,2,0] == [0,2,1]
False
[ghci] 

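So your match function should take the pattern, the character, and another string, and return whether the lengths of the groups match or not.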

[ghci] let g c pat str = (f c pat) == (f c str)
[ghci] g 'e' pat "else"
True
[ghci] g 'e' pat "whatever"
False
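
If pulling in the split package is not an option, a similar comparison can be sketched with only base. This swaps splitOn for `Data.List.elemIndices`, so it is a different technique from the one above, and `sameShape` is a name invented here. It compares the positions of the character directly, plus the overall length, which should give the same answers as g for a single-character delimiter.

```haskell
import Data.List (elemIndices)

-- True when c occurs at exactly the same positions in pat and str,
-- and both have the same length.
sameShape :: Char -> String -> String -> Bool
sameShape c pat str =
    length pat == length str && elemIndices c pat == elemIndices c str
```

For example, `sameShape 'e' "e--e" "else"` is `True` and `sameShape 'e' "e--e" "whatever"` is `False`.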

Finally, you can partially apply this function and map it over a list of strings...

[ghci] map (g 'e' pat) $ words "cafeen else I die!!"
[False,True,False,False]

Or match any other pattern...

[ghci] map (g 'e' "-e--") $ words "This beta version wont deal with everything"
[False,True,False,False,True,False,False]

And also...

[ghci] map (g 'e' "----") $ words "This beta version wont deal with everything"
[True,False,False,True,False,True,False]

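Edit:

Dictionaries in Haskell are provided by the Data.Map module. A dictionary contains name-value pairs. Functions in Haskell are first-class values, so the value stored under a name in a dictionary can be a function. So you need to set up a list of named conditions, with functions as their values...

You can create a dictionary of conditions like this: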

[ghci] import Data.Map
[ghci] let someDict = fromList [("C1",  Data.List.map (g 'e' "----") . words), 
                                ("C2",  Data.List.map (g 'e' "-e--") . words)]

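You can then lookup a function and call it. (Note that since the function comes back wrapped in a Maybe, you need to apply it inside the Maybe, here with the Applicative operator <*>.)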

[ghci] import Control.Applicative
[ghci] Data.Map.lookup "C2" someDict <*> Just "This beta version wont deal with everything"
Just [False,True,False,False,True,False,False]
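
Outside GHCi the same idea might look like the sketch below. It assumes a qualified import so that `map` and `lookup` need no module prefixes, and it uses a base-only stand-in for g (based on `elemIndices`) so the block compiles without the split package. `conditions` and `runCondition` are names made up for this example.

```haskell
import qualified Data.Map as M
import Data.List (elemIndices)

-- Stand-in for g from above, written without splitOn (an assumption made
-- to keep this block self-contained).
g :: Char -> String -> String -> Bool
g c pat str =
    length pat == length str && elemIndices c pat == elemIndices c str

-- Named conditions; each value is a function from a sentence to one
-- Bool per word.
conditions :: M.Map String (String -> [Bool])
conditions = M.fromList
    [ ("C1", map (g 'e' "----") . words)
    , ("C2", map (g 'e' "-e--") . words)
    ]

-- Look up a condition by name and apply it to the input.
-- Returns Nothing when the name is not in the dictionary.
runCondition :: String -> String -> Maybe [Bool]
runCondition name input = fmap ($ input) (M.lookup name conditions)
```

Here `fmap ($ input)` does the same job as `<*> Just input`: it applies the function inside the Maybe, if there is one.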

Hope this helps...