Question

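I currently have a function that takes a long time to run; let's call it somefunction. somefunction returns a tuple, say (String, Int) for simplicity. I wanted to output the result of this function in a neat way, so I wrote the following code (it is not exactly this code, but for simplicity let's say it is):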

main :: IO ()
main = do
 putStr$ (\ (a,b) -> a ++ "\ninteger: " ++ (show b) ++ "\n" ) (somefunction args)

So I ran my code. After a while it printed the a part of the tuple, but not b. After another long wait, it printed the b part.

Why is there such a long delay before the second part of the tuple is output? What is going on here?

In my case, somefunction is the function debugbrainflak in the following code:

module Interpreter (debugbrainflak) where

pop :: (Integral a) => [a] -> a
pop [] = 0
pop (x:_) = x

rest :: (Integral a) => [a] -> [a]
rest [] = []
rest (_:x) = x

topadd :: [Integer] -> Integer -> [Integer]
topadd [] x = [x]
topadd (a:[]) x = [a+x]
topadd (a:b) x = (a+x):b

ir :: [Char] -> Integer -> [Char]
ir x 0 = ""
ir ('{':x) y = "{" ++ (ir x (y+1))
ir ('}':x) y = "}" ++ (ir x (y-1))
ir (a:x)   y = [a] ++ (ir x   y  )

interior :: [Char] -> [Char]
interior x = init (ir x 1)

ex :: [Char] -> Integer -> [Char]
ex x 0 = x
ex ('{':x) y = ex x (y+1)
ex ('}':x) y = ex x (y-1)
ex  (a:x)  y = ex x y

exterior :: [Char] -> [Char]
exterior x = ex x 1

---

dbf :: [Char] -> ([Integer],[Integer],[Integer],Int) -> ([Integer],[Integer],[Integer],Int)
dbf []          (x,y,z,c)= (x,y,z,c)
dbf ('(':')':a) (x,y,z,c)= dbf a (x,y,((pop z+1):rest z),c+1)
dbf ('<':'>':a) (x,y,z,c)= dbf a (y,x,z,c+1)
dbf ('{':'}':a) (x,y,z,c)= dbf a ((rest x),y,(topadd z (pop x)),c+1)
dbf ('[':']':a) (x,y,z,c)= dbf a (x,y,(topadd z (toInteger (length x))),c+1)
dbf ('(':a)     (x,y,z,c)= dbf a (x,y,(0:z),c+1)
dbf ('<':a)     (x,y,z,c)= dbf a (x,y,(0:z),c+1)
dbf ('[':a)     (x,y,z,c)= dbf a (x,y,(0:z),c+1)
dbf (')':a) (x,y,(h:z),c)= dbf a ((h:x),y,(topadd z h),c+1)
dbf (']':a) (x,y,(h:z),c)= dbf a (x,y,(topadd z (-h)),c+1)
dbf ('>':a) (x,y,(_:z),c)= dbf a (x,y,z,c+1)
dbf ('{':a)      t     = dbf (exterior a) (drun (interior a) t)
dbf (_:a)        t     = dbf a t

drun :: [Char] -> ([Integer],[Integer],[Integer],Int) -> ([Integer],[Integer],[Integer],Int)
drun s ([],y,z,c)  = ([],y,z,c)
drun s (0:x,y,z,c) = (0:x,y,z,c)
drun s x         = drun s (dbf s x)

--

bl :: [Char] -> [Char] -> Bool
bl [] [] = True
bl [] _  = False
bl ('(':x) y   = bl x (')':y) 
bl ('[':x) y   = bl x (']':y) 
bl ('<':x) y   = bl x ('>':y) 
bl ('{':x) y   = bl x ('}':y) 
bl  (a:x) []
 | elem a ")]>}" = False
 | otherwise     = bl x []
bl (a:x) (b:y)
 | elem a ")]>}" = (a == b) && (bl x y)
 | otherwise     = bl x (b:y)

balanced :: [Char] -> Bool
balanced x = bl x []

clean :: [Char] -> [Char]
clean [] = []
clean ('#':'{':xs) = clean (exterior xs)
clean (x:xs)
 | elem x "()[]<>{}" = x:(clean xs)
 | otherwise         = clean xs

debugbrainflak :: [Char] -> [Integer] -> ([Integer], Int)
debugbrainflak s x
 | balanced s = (\(a,_,_,d) -> (a,d)) (dbf (clean s) (x,[],[],0))
 | otherwise  = error "Unbalanced braces."

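If you want a test case that reproduces the behaviour, you can try the following (you can change how long it takes by increasing or decreasing the number):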

debugbrainflak "({({}[()])}{})" [999999]

Answer 1

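Without seeing somefunction it is hard to say exactly what is happening, but in general GHC¹ does not repeat work like this. However, because Haskell is lazy, it will print each part of the string as soon as it has done enough work to produce that part. So, if you have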

slow1 :: Args -> String
slow2 :: Args -> Int

somefunction :: Args -> (String, Int)
somefunction args = (slow1 args, slow2 args)

then GHC will first force slow1 args, print it, print "\ninteger: ", then force slow2 args, show it, and print the result. (It then prints the trailing "\n".)
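The order of evaluation described above can be observed with a minimal sketch. Here slow1 and slow2 are hypothetical stand-ins (two long folds over a list), Args is simply taken to be Int, and stdout buffering is switched off so that each piece appears as soon as it is produced. None of this comes from the question's code.

```haskell
import Data.List (foldl')
import System.IO (BufferMode (NoBuffering), hSetBuffering, stdout)

-- Hypothetical stand-in for the expensive first component.
slow1 :: Int -> String
slow1 n = show (foldl' (+) 0 [1 .. n])

-- Hypothetical stand-in for the expensive second component.
slow2 :: Int -> Int
slow2 n = length (filter even [1 .. n])

-- Building the tuple is cheap: both components start out as unevaluated thunks.
somefunction :: Int -> (String, Int)
somefunction n = (slow1 n, slow2 n)

main :: IO ()
main = do
  -- Without this, output to a terminal is usually line-buffered.
  hSetBuffering stdout NoBuffering
  -- putStr consumes the string lazily, so slow1 is forced first and
  -- slow2 only once the output reaches `show b`.
  putStr $ (\(a, b) -> a ++ "\ninteger: " ++ show b ++ "\n") (somefunction 50000000)
```

When run, there should be a pause before the first number, and a second pause after "integer: " while slow2 is computed.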

You can test whether this is what happens in your program as follows:

main :: IO ()
main =
  let (a,b) = somefunction args
  in putStr $ a `seq` b `seq` (a ++ "\ninteger: " ++ show b ++ "\n")

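Here x `seq` y is equal to y, but forcing it forces both x and y. This means that a `seq` b `seq` ... makes sure that both a and b have been evaluated (to weak head normal form, which for the Int b is the whole value) before the string containing them is returned.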

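This code using seq should print everything at once, but it will probably wait just as long as your old code did, only all up front rather than between printing a and b.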
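One caveat worth illustrating: seq evaluates its first argument only to weak head normal form, so for a String it forces just the first list cell. The sketch below uses a hypothetical string whose tail is an error, so that it is observable which parts get evaluated; render is an assumed helper that forces the whole string via length before anything is printed.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical string: the first character is cheap, the tail stands in
-- for expensive work (modelled as an error so that forcing it is visible).
partial :: String
partial = 'x' : error "tail was forced"

-- seq stops at the first (:) constructor, so the tail is never touched.
whnfOnly :: Int
whnfOnly = partial `seq` 1

-- length walks the whole spine of the list, so the tail is forced too.
fullyForced :: Int
fullyForced = length partial `seq` 1

-- Assumed variant of the formatting code that forces all of a, and b,
-- before returning the string.
render :: (String, Int) -> String
render (a, b) = length a `seq` b `seq` (a ++ "\ninteger: " ++ show b ++ "\n")

main :: IO ()
main = do
  print whnfOnly
  r <- try (evaluate fullyForced) :: IO (Either SomeException Int)
  putStrLn (either (const "forcing the whole string reached the tail") show r)
  putStr (render ("done", 42))
```

For a result of type (String, Int), forcing length a is enough because the elements are characters; for nested lazy structures, a deeper forcing function would be needed.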

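¹ The Haskell Report does not say much about the details of evaluation, so in this answer I talk about what GHC specifically does, which is usually what people are looking for.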

Answer 2

Not a direct answer, but a bit more about what's going on here: It looks like your c parameter in dbf is not being forced, which means that it's building up as a big thunk like 1+(1+(1+(1+(1+(1+(... and only actually doing the addition when you show it, which takes time if it's large. The way through this is to make sure that c gets evaluated at each recursive step of dbf.

With the way you have written it this could be a little awkward. One way would be to add as the first equation of dbf:

dbf _ (_,_,_,c) | c `seq` False = undefined

which feels like a bit of a hack, but allows you to experiment without changing every line of your interpreter.
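To see what that extra equation does in isolation, here is a minimal sketch on a toy loop rather than on dbf itself. countLazy and countForced are hypothetical names; both carry a tuple whose second component is a counter, as dbf does with c.

```haskell
-- Lazy loop: the counter is never inspected, so each step only wraps the
-- previous value in another (+ 1) thunk.
countLazy :: Int -> (Int, Int) -> (Int, Int)
countLazy 0 st     = st
countLazy n (x, c) = countLazy (n - 1) (x, c + 1)

-- Same loop with the extra first equation. The guard always evaluates to
-- False, so `undefined` is never reached, but evaluating the guard forces c
-- on every call.
countForced :: Int -> (Int, Int) -> (Int, Int)
countForced _ (_, c) | c `seq` False = undefined
countForced 0 st     = st
countForced n (x, c) = countForced (n - 1) (x, c + 1)

main :: IO ()
main = do
  -- The lazy loop never looks at the counter, so a poisoned counter is harmless.
  print (fst (countLazy 3 (7, error "counter was forced")))
  -- The forced loop keeps the counter evaluated at every step.
  print (countForced 1000000 (7, 0))
```

Passing the poisoned counter to countForced instead would raise the error on the first call, which is a quick way to check that the guard really forces c.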

In the long run I'd probably define a data type for the state instead of a tuple, and make the counter a strict field:

data InterpState = InterpState {
    stack1 :: [Integer],
    stack2 :: [Integer],
    stack3 :: [Integer],
    instrCount :: !Int
  }
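As a rough sketch of how the interpreter steps might look with this record, assuming the state is pattern-matched (and so evaluated) at every step as it is in dbf: tick and swapStacks are hypothetical helpers, not part of the original code. Because instrCount is strict, the counter is evaluated whenever the InterpState constructor itself is evaluated, so it cannot pile up as a chain of additions.

```haskell
data InterpState = InterpState
  { stack1     :: [Integer]
  , stack2     :: [Integer]
  , stack3     :: [Integer]
  , instrCount :: !Int  -- strict: evaluated whenever the record is evaluated
  }

-- Hypothetical helper: count one executed instruction.
tick :: InterpState -> InterpState
tick st = st { instrCount = instrCount st + 1 }

-- Hypothetical counterpart of the "<>" case of dbf: swap the first two
-- stacks and count the instruction.
swapStacks :: InterpState -> InterpState
swapStacks st = tick (st { stack1 = stack2 st, stack2 = stack1 st })

main :: IO ()
main = do
  let s0 = InterpState [1, 2] [3] [] 0
      s1 = swapStacks (tick s0)
  print (stack1 s1, stack2 s1, instrCount s1)
```

Named fields also make each case of the interpreter shorter, since a step only mentions the fields it changes.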

Why does my function output different parts of its result at different times?