Question

I have a file with the following layout:

TABLE name_of_table

COLUMNS first_column 2nd_column [..] n-th_column

VALUES 1st_value 2nd_value [...] n-th_value

VALUES yet_another_value ... go on

ANOTHER TABLE repeating from the beginning ...

I want this text file rearranged so that I don't have to type TABLE and COLUMNS in front of every VALUES line myself, producing:

TABLE name_of_table COLUMNS first_column [..] n-th_column VALUES 1st_value

TABLE name_of_table COLUMNS first_column [..] n-th_column VALUES yet_another_value

I need to take this input and rearrange many lines, so reading the whole text file as one string with hGetContents seems appropriate, yielding a string like this:

TABLE name_of_table COLUMNS first_column [..] n-th_column VALUES 1st_value [..] n-th_value VALUES another_value [..] yet_another VALUES ... ANOTHER TABLE ... COLUMNS ... VALUES [...] VALUES ...

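I have tried using nested case expressions and recursion. That leaves me with a dilemma I need help with:

1) I need recursion to avoid endlessly nested case expressions.

2) With recursion, I cannot add the earlier part of the string back in, because the recursive call only sees the tail of my string!

To illustrate the problem: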



myStr::[[Char]]->[[Char]]
myStr []   = []
myStr one  =
 case (head one) of
  "table" -> "insert into":(head two):columnRecursion (three) ++ case (head four) of 
                                                                  "values" -> (head four):valueRecursion (tail three) ++ myStr (tail four)
                                                                  _        -> case head (tail four) of
                                                                               "values" -> (head (tail four):myStr (tail (tail four))
                                                                               _        -> 
 where two   = tail one
       three = tail two
       four  = tail three

columnRecursion::[[Char]] -> [[Char]]
columnRecursion [] = []
columnRecursion cool =
 case (head cool) of
  "columns"         -> "(":columnRecursion (tail cool) 
  "values"          -> [")"]
  _                 -> (head cool):columnRecursion (tail cool)

valueRecursion::[[Char]] -> [[Char]]
valueRecursion foo =
 case head foo of
  "values" -> "insert into":(head two):columnRecursion (three) ++ valueRecursion (tail foo)
  "table"  -> []
  "columns"-> []
  _        -> (head foo):valueRecursion (tail foo)

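I end up with FIRSTPART, VALUES and then need to get FIRSTPART again, in order to create FIRSTPART, VALUES, FIRSTPART, VALUES, FIRSTPART, VALUES.

Trying to do this by referring to myStr from within valueRecursion does not work, because the names I need are out of scope there.

How can I do this?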
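For reference, a common way around the scoping problem described above is to pass the prefix along as an extra argument of the recursive helper, so every VALUES row can be emitted with it. The following is only a minimal sketch of that idea, not code from the question or the answers. The name `rearrange` is hypothetical, and it assumes the input has already been split with `words` and that no column name or value is literally the word TABLE or VALUES.

```haskell
-- | Walk the word stream while remembering the current header
-- (the words "TABLE name COLUMNS c1 .. cn"). Each VALUES row is
-- emitted as header ++ row, so the prefix is never lost.
rearrange :: [String] -> [[String]]
rearrange = go []
  where
    go _ [] = []
    -- A new table: everything up to the first VALUES is the new header.
    go _ ("TABLE" : ws) =
      let (hdr, rest) = break (== "VALUES") ws
      in  go ("TABLE" : hdr) rest
    -- A row: everything up to the next VALUES or TABLE keyword.
    go hdr ("VALUES" : ws) =
      let (row, rest) = break (`elem` ["VALUES", "TABLE"]) ws
      in  (hdr ++ "VALUES" : row) : go hdr rest
    -- Skip stray words that appear before the first TABLE.
    go hdr (_ : ws) = go hdr ws
```

Loaded into GHCi, `map unwords (rearrange (words "TABLE t COLUMNS a b VALUES 1 2 VALUES 3 4"))` yields the two lines `TABLE t COLUMNS a b VALUES 1 2` and `TABLE t COLUMNS a b VALUES 3 4`.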

Answer 1

To me, this kind of problem is just past the threshold where a real parsing tool pays off. Here is a quick working example using Attoparsec:

import Control.Applicative
import Data.Attoparsec (maybeResult)
import Data.Attoparsec.Char8
import qualified Data.Attoparsec.Char8 as A (takeWhile)
import qualified Data.ByteString.Char8 as B
import Data.Maybe (fromMaybe)

data Entry = Entry String [String] [[String]] deriving (Show)

entry = Entry <$> table <*> cols <*> many1 vals
items = sepBy1 (A.takeWhile $ notInClass " \n") $ char ' '
table = string (B.pack "TABLE ") *> many1 (notChar '\n') <* endOfLine
cols = string (B.pack "COLUMNS ") *> (map B.unpack <$> items) <* endOfLine
vals = string (B.pack "VALUES ")  *> (map B.unpack <$> items) <* endOfLine

parseEntries :: B.ByteString -> Maybe [Entry]
parseEntries = maybeResult . flip feed B.empty . parse (sepBy1 entry skipSpace)

Plus a little more machinery:

pretty :: Entry -> String
pretty (Entry t cs vs)
  = unwords $ ["TABLE", t, "COLUMNS"]
  ++ cs ++ concatMap ("VALUES" :) vs

layout :: B.ByteString -> Maybe String
layout = (unlines . map pretty <$>) . parseEntries

testLayout :: FilePath -> IO ()
testLayout f = putStr . fromMaybe [] =<< layout <$> B.readFile f

Given this input:

TABLE test
COLUMNS a b c
VALUES 1 2 3
VALUES 4 5 6

TABLE another
COLUMNS x y z q
VALUES 7 8 9 10
VALUES 1 2 3 4

We get the following:

*Main> testLayout "test.dat" 
TABLE test COLUMNS a b c VALUES 1 2 3 VALUES 4 5 6
TABLE another COLUMNS x y z q VALUES 7 8 9 10 VALUES 1 2 3 4

Which seems to be what you want?
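Note that the output above keeps all VALUES rows of a table on one line, whereas the question shows one line per VALUES row. If that form is needed, the printer could be varied roughly as follows. This is a sketch under assumptions: `prettyRows` is a hypothetical name, and `Entry` is redeclared here only so the fragment stands on its own.

```haskell
-- Same shape as the Entry type used in the parser above.
data Entry = Entry String [String] [[String]] deriving (Show)

-- | One output line per VALUES row, each repeating the
-- TABLE/COLUMNS prefix of its table.
prettyRows :: Entry -> [String]
prettyRows (Entry t cs vs) =
  [ unwords (prefix ++ "VALUES" : row) | row <- vs ]
  where
    prefix = ["TABLE", t, "COLUMNS"] ++ cs
```

With this in place, `layout` would presumably use `concatMap prettyRows` where it currently uses `map pretty`.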

Answer 2

This answer is literate Haskell, so you can copy and paste it into a file named table.lhs to get a working program.

Start with a few imports

> import Control.Arrow ((&&&))
> import Control.Monad (forM_)
> import Data.List (intercalate,isPrefixOf)
> import Data.Maybe (fromJust)

and say that we represent a table with the following record:

> data Table = Table { tblName :: String
>                    , tblCols :: [String]
>                    , tblVals :: [String]
>                    }
>   deriving (Show)

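That is, we record the table's name, the list of column names, and the list of column values.

Each table in the input begins with a line that starts with TABLE, so split all the lines of the input into chunks accordingly: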

> tables :: [String] -> [Table]
> tables [] = []
> tables xs = next : tables ys
>   where next = mkTable (th:tt)
>         (th:rest) = dropWhile (not . isTable) xs
>         (tt,ys) = break isTable rest
>         isTable = ("TABLE" `isPrefixOf`)
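To look at the chunking step in isolation, here is a minimal sketch that only groups the lines and does not build Table records. The name `chunksOnTable` is hypothetical. It uses an explicit case expression instead of the pattern binding above, so it returns an empty list when no TABLE line is found.

```haskell
import Data.List (isPrefixOf)

-- | Group lines so that each chunk starts at a TABLE line and runs up to,
-- but not including, the next TABLE line. Lines before the first TABLE
-- line are dropped.
chunksOnTable :: [String] -> [[String]]
chunksOnTable xs =
  case dropWhile (not . isTable) xs of
    []          -> []
    (th : rest) ->
      let (tt, ys) = break isTable rest
      in  (th : tt) : chunksOnTable ys
  where
    isTable = ("TABLE" `isPrefixOf`)
```

For example, `chunksOnTable ["TABLE a", "COLUMNS x", "VALUES 1", "TABLE b", "VALUES 2"]` gives `[["TABLE a","COLUMNS x","VALUES 1"],["TABLE b","VALUES 2"]]`.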

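With the input grouped into tables, the name of a given table is the first word after the keyword on its TABLE line. The column names are all the words on the COLUMNS line, and the column values come from the VALUES lines: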

> mkTable :: [String] -> Table
> mkTable xs = Table name cols vals
>   where name = head $ fromJust $ lookup "TABLE" tagged
>         cols = grab "COLUMNS"
>         vals = grab "VALUES"
>         grab t = concatMap snd $ filter ((== t) . fst) tagged
>         tagged = map ((head &&& tail) . words)
>                $ filter (not . null) xs
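The tagging trick may be easier to follow on its own. The sketch below lifts it out of `mkTable`: `tagLine` is a hypothetical name, and `grab` is adapted to take the tagged lines as an argument instead of reading them from a where clause.

```haskell
import Control.Arrow ((&&&))

-- | Split a line into (keyword, remaining words).
-- Partial: head and tail fail on a line with no words, which is why
-- blank lines have to be filtered out first.
tagLine :: String -> (String, [String])
tagLine = (head &&& tail) . words

-- | Collect the words of every line that carries the given keyword.
grab :: String -> [(String, [String])] -> [String]
grab t = concatMap snd . filter ((== t) . fst)
```

Here `tagLine "COLUMNS a b c"` is `("COLUMNS",["a","b","c"])`, and `grab "VALUES" (map tagLine ["VALUES 1 2", "VALUES 3 4"])` is `["1","2","3","4"]`. The second result shows that all VALUES rows of a table are concatenated into one flat list.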

Given a Table record, we print it by pasting the name, the values, and the SQL keywords together in the appropriate order on a single line:

> main :: IO ()
> main = do
>   input <- readFile "input"
>   forM_ (tables $ lines input) $
>     \t -> do putStrLn $ intercalate " " $
>                 "TABLE"   : (tblName t)  :
>                ("COLUMNS" : (tblCols t)) ++
>                ("VALUES"  : (tblVals t))

Given the unimaginative input

TABLE name_of_table

COLUMNS first_column 2nd_column [..] n-th_column

VALUES 1st_value 2nd_value [...] n-th value

VALUES yet_another_value ... go on

TABLE name_of_table

COLUMNS first_column 2nd_column [..] n-th_column

VALUES 1st_value 2nd_value [...] n-th value

VALUES yet_another_value ... go on

the output is

$ runhaskell table.lhs
TABLE name_of_table COLUMNS first_column 2nd_column [..] n-th_column VALUES 1st_value 2nd_value [...] n-th value yet_another_value ... go on
TABLE name_of_table COLUMNS first_column 2nd_column [..] n-th_column VALUES 1st_value 2nd_value [...] n-th value yet_another_value ... go on

Rearranging a string that contains a variable-length repeating pattern
