Searching through a string

Time: 2015-01-11 23:54:15

Tags: haskell

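I found a nice exercise in a book and am trying to solve it. I am trying to write a function called "pointer" with the signature pointer :: String -> Int. It takes text containing "pointers" that look like [Int] and returns the total number of pointers found.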

The text that the pointer function will examine looks like this:

txt :: String
txt = "[1] and [2] are friends who grew up together who " ++
      "went to the same school and got the same degrees." ++
      "They eventually opened up a store named [2] which was pretty successful."

At the command line, we would run the code like this:

> pointer txt 
3

The 3 is the number of pointers found.

What I understand:

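  • I know that "words" will break a string into a list of words. Example:

    words "where are all of these apples?"

    ["where","are","all","of","these","apples?"]

  • I get that "filter" will select specific elements of a list. Example:

    filter (>3) [1,5,6,4,3]

    [5,6,4]

  • I get that "length" will return the length of a list.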

What I think I need to do:

Step 1) look at txt and then break it down into single words until you have a long list of words.
Step 2) use filter to examine the list for [1] or [2]. Once found, filter will place these pointers into a list.
Step 3) call the length function on the resulting list.

The problem I'm facing:

I am struggling to take everything I know and turn it into an implementation.

3 Answers:

Answer 0: (score: 1)

Here is a hypothetical ghci session:

ghci> words txt
[ "[1]", "and", "[2]", "are", "friends", "who", ...]

ghci> filter (\w -> w == "[1]" || w == "[2]") (words txt)
[ "[1]", "[2]", "[2]" ]

ghci> length ( filter (\w -> w == "[1]" || w == "[2]") (words txt) )
3

You can use the $ operator to make the last expression more readable:

length $ filter (\w -> w == "[1]" || w == "[2]") $ words txt
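
The session above hard-codes "[1]" and "[2]". As a minimal sketch of how the same words/filter/length pipeline could be wrapped into the requested pointer function, the version below swaps the equality test for a predicate that accepts any "[digits]" token. The helper name isPointer is hypothetical and not from the original answer.

```haskell
import Data.Char (isDigit)

-- Hypothetical helper: True for tokens of the form "[<digits>]",
-- e.g. "[2]" or "[465]"; False for "[]", "[x]", or "[2].".
isPointer :: String -> Bool
isPointer ('[' : rest) =
  case span isDigit rest of
    (_ : _, "]") -> True   -- at least one digit, then exactly "]"
    _            -> False
isPointer _ = False

-- Split on whitespace, keep the pointer tokens, count them.
pointer :: String -> Int
pointer = length . filter isPointer . words
```

Because words splits only on whitespace, a pointer glued to punctuation (for example "[2]." at the end of a sentence) is not counted by this approach. For the txt in the question this does not matter, and pointer txt gives 3.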

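Answer 1: (score: 1)

If you want to be able to find every pattern of the form [Int] in the string, e.g. [3], [465], and so on, rather than just [1] and [2], the simplest way is to use a regular expression: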

{-# LANGUAGE NoOverloadedStrings #-}

import Text.Regex.Posix

txt :: String
txt = "[1] and [2] are friends who grew up together who " ++
      "went to the same school and got the same degrees." ++
      "They eventually opened up a store named [2] which was pretty successful."

pointer :: String -> Int
pointer source = source =~ "\\[[0-9]{1,}\\]"

We can now run:

> pointer txt
3

Answer 2: (score: 1)

This works for single-digit "pointers":

pointer :: String -> Int
pointer ('[':_:']':xs) = 1 + pointer xs
pointer (_:        xs) = pointer xs
pointer _              = 0
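
Note that the clauses above accept any single character between the brackets, so "[a]" would also count. A possible extension, sketched here with only Data.Char from base, keeps the same recursive scan but requires one or more digits, so that multi-digit pointers such as [465] are counted and non-numeric brackets are skipped. This is an illustration, not part of the original answer.

```haskell
import Data.Char (isDigit)

-- Counts "[<digits>]" occurrences anywhere in the string, including
-- multi-digit ones and ones directly followed by punctuation.
pointer :: String -> Int
pointer ('[' : xs) =
  case span isDigit xs of
    (_ : _, ']' : rest) -> 1 + pointer rest  -- one or more digits, then ']'
    _                   -> pointer xs        -- not a pointer; keep scanning after '['
pointer (_ : xs) = pointer xs
pointer []       = 0
```

Since the scan works character by character rather than word by word, "[2]." and "[[1]" each still contribute one pointer.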

This could be handled better with the parser combinators provided by, for example, Parsec, but that might be overkill.
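
For a rough idea of what the parser-combinator route might look like without installing Parsec, here is a sketch that uses Text.ParserCombinators.ReadP from base instead. It is a different library from the one named above, chosen only because it ships with GHC. The names pointerP and pointers are hypothetical.

```haskell
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP (ReadP, between, char, munch1, readP_to_S)

-- Parses one pointer such as "[2]" or "[465]" and returns its number.
pointerP :: ReadP Int
pointerP = read <$> between (char '[') (char ']') (munch1 isDigit)

-- Tries the parser at every position; on success, continues after the
-- match, otherwise drops one character and tries again.
pointers :: String -> [Int]
pointers [] = []
pointers s@(_ : rest) =
  case readP_to_S pointerP s of
    ((n, remaining) : _) -> n : pointers remaining
    []                   -> pointers rest

pointer :: String -> Int
pointer = length . pointers
```

One benefit of this shape is that the parsed numbers are available as well as the count: pointers txt gives [1,2,2] for the txt in the question.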