Question

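I am trying to read a script file, process it, and write the result to an HTML file. In my script file, whenever there is @title(this is a title), I want to emit [header]this is a title[/header] in the HTML output. So my approach is to first read the script file, put its contents into a string, process the string, and then write the string to the HTML file.

To recognize @title, I need to read the string character by character. When I read '@', I need to check the characters that follow to see whether they spell "title".

Question: how do I traverse a string (which is a list of Char) in Haskell?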

Answer 1

You can use a simple recursive technique, for example

findTag [] = -- end of list code.
findTag ('@':xs)
  | take 5 xs == "title" = -- your code for @title
  | otherwise            = findTag xs
findTag (_:xs) = findTag xs

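So basically you just pattern match: if the next character (the head of the list) is '@', you check whether the following 5 characters form "title". If they do, you can go on to parse the tag. If the next character is not '@', you simply continue the recursion. Once the list is empty, you end up in the first pattern.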
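As a concrete illustration of the recursion described above, here is a minimal sketch that fills in the placeholders for the specific case from the question. The name convertTitles and the choice to copy an unterminated tag through unchanged are assumptions made for this example, not part of the answer.

```haskell
import Data.List (stripPrefix)

-- Hypothetical helper: rewrites every "@title(...)" into
-- "[header]...[/header]" and copies all other characters unchanged.
convertTitles :: String -> String
convertTitles [] = []                       -- end of list: nothing left to copy
convertTitles ('@':xs)
  -- Only fires when "title(" follows the '@' and a closing ')' exists.
  | Just rest <- stripPrefix "title(" xs
  , (body, ')':after) <- break (== ')') rest
  = "[header]" ++ body ++ "[/header]" ++ convertTitles after
-- Any other character (including a '@' that does not start a tag)
-- is copied as is, and the recursion continues on the tail.
convertTitles (c:cs) = c : convertTitles cs
```

When the guards in the '@' clause fail, Haskell falls through to the last clause, so a lone '@' is treated like any other character.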

Others may have better solutions.

I hope this answers your question.

Edit:

To be a bit more flexible, if you want to find a specific tag, you can do this:

findTag [] _ = -- end of list code.
findTag ('@':xs) tagName
  | take (length tagName) xs == tagName = -- your code for the matched tag
  | otherwise = findTag xs tagName
findTag (_:xs) tagName = findTag xs tagName

If you then call

findTag text "title"

you will look specifically for title, and you can always change the tag name to whatever you want.

Another edit:

findTag [] _ = -- end of list code.
findTag ('@':xs) tagName
  | take tLength xs == tagName = getTagContents tLength xs
  | otherwise = findTag xs tagName
  where tLength = length tagName
findTag (_:xs) tagName = findTag xs tagName

getTagContents :: Int -> String -> String
getTagContents len = takeWhile (/=')') . drop (len + 1)

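To be honest, it is getting a bit messy, but this is what happens:

First you drop the length of tagName, then one more character for the opening parenthesis, and then takeWhile collects the characters up to, but not including, the closing parenthesis.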
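To make the drop and takeWhile steps concrete, here is a small worked example. The input string and the name example are made up for illustration; the input is what follows the '@', which is what findTag passes along.

```haskell
getTagContents :: Int -> String -> String
getTagContents len = takeWhile (/= ')') . drop (len + 1)

-- drop (5 + 1) removes "title(" and leaves "this is a title) more text";
-- takeWhile (/= ')') then keeps everything before the first ')'.
example :: String
example = getTagContents (length "title") "title(this is a title) more text"
-- example == "this is a title"
```

Note that if the closing parenthesis is missing, takeWhile simply runs to the end of the input.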

Answer 2

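Clearly your problem falls into the parsing category. As Daniel Wagner wisely pointed out, for maintainability reasons you are better off approaching it with a parser.

Another thing: if you want to process textual data efficiently, you are better off using Text instead of String.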
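For a sense of what working with Text looks like without a parser library, here is a rough sketch using only the text package. The names convertTitles and convertFile are hypothetical, and leaving an unterminated tag untouched is an assumption made for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- Hypothetical helper: rewrites "@title(...)" into "[header]...[/header]".
convertTitles :: T.Text -> T.Text
convertTitles input =
  case T.breakOn "@title(" input of
    (before, rest)
      | T.null rest -> before              -- no further tag in the input
      | otherwise ->
          let afterOpen       = T.drop (T.length "@title(") rest
              (body, closing) = T.breakOn ")" afterOpen
          in if T.null closing
               then before <> rest         -- unterminated tag: leave as is
               else before <> "[header]" <> body <> "[/header]"
                           <> convertTitles (T.drop 1 closing)

-- Reads the script file, converts it, and writes the HTML file.
-- Both paths are supplied by the caller.
convertFile :: FilePath -> FilePath -> IO ()
convertFile src dst = TIO.readFile src >>= TIO.writeFile dst . convertTitles
```

T.breakOn splits at the first occurrence of the needle, so the bulk of the input is copied in chunks rather than one character at a time.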

Here is how to solve your problem using the Attoparsec parser library:

-- For autocasting of hardcoded strings to `Text` type
{-# LANGUAGE OverloadedStrings #-}

-- Import a way more convenient prelude, excluding symbols conflicting 
-- with the parser library. See
-- http://hackage.haskell.org/package/classy-prelude
import ClassyPrelude hiding (takeWhile, try)
-- Exclude the standard Prelude
import Prelude ()
import Data.Attoparsec.Text

-- A parser and an inplace converter for title
title = do
  string "@title("
  r <- takeWhile $ notInClass ")"
  string ")"
  return $ "[header]" ++ r ++ "[/header]"

-- A parser which parses the whole document to parts which are either
-- single-character `Text`s or modified titles
parts = 
  (try endOfInput >> return []) ++
    ((:) <$> (try title ++ (singleton <$> anyChar)) <*> parts)

-- The topmost parser which concats all parts into a single text
top = concat <$> parts

-- A sample input
input = "aldsfj@title(this is a title)sdlfkj@title(this is a title2)"

-- Run the parser and output result
main = print $ parseOnly top input

This outputs

Right "aldsfj[header]this is a title[/header]sdlfkj[header]this is a title2[/header]"

P.S. ClassyPrelude reimplements ++ as an alias for Monoid's mappend, so you can replace it with mappend, <>, or Alternative's <|> if you prefer.

Answer 3

For pattern search and replace, you can use streamEdit.

import Control.Monad (void)
import Data.Void (Void)
import Replace.Megaparsec
import Text.Megaparsec
import Text.Megaparsec.Char

title :: Parsec Void String String
title = do
    void $ string "@title("
    someTill anySingle $ string ")"

editor t = "[header]" ++ t ++ "[/header]"

streamEdit title editor " @title(this is a title) "

" [header]this is a title[/header] "

Haskell: iterating over a string / text file
