
Question

To get familiar with unsafePerformIO (how to use it and when to use it), I implemented a module for generating unique values.

Here is what I have:

module Unique (newUnique) where

import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

-- Type to represent a unique thing.
-- Show is derived just for testing purposes.
newtype Unique = U Integer
  deriving Show

-- I believe this is the Haskell'98 derived instance, but
-- I want to be explicit, since its Eq instance is the most
-- important part of Unique.
instance Eq Unique where
  (U x) == (U y) = x == y

counter :: IORef Integer
counter = unsafePerformIO $ newIORef 0

updateCounter :: IO ()
updateCounter = do
  x <- readIORef counter
  writeIORef counter (x+1)

readCounter :: IO Integer
readCounter = readIORef counter

newUnique' :: IO Unique
newUnique' = do { x <- readIORef counter
                ; writeIORef counter (x+1)
                ; return $ U x }

newUnique :: () -> Unique
newUnique () = unsafePerformIO newUnique'

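To my delight, the library module called Data.Unique chose the same data type as I did; on the other hand, it gives newUnique the type IO Unique, and I would like to stay out of IO if possible.

Is this implementation dangerous? Could it lead GHC to change the semantics of a program that uses it?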

Answer 1

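Treat unsafePerformIO as a promise to the compiler. It says: "I promise that you can treat this IO action as if it were a pure value, and nothing will go wrong." It is useful because there are times when you can build a pure interface to a computation implemented with impure operations, but the compiler cannot verify when this is the case; instead, unsafePerformIO lets you put your hand on your heart and swear that you have verified that the impure computation is actually pure, so the compiler can simply trust that it is.

In this case, that promise is false. If newUnique were a pure function, then let x = newUnique () in (x, x) and (newUnique (), newUnique ()) would be equivalent expressions. But you want these two expressions to have different results: a pair of copies of the same Unique value in one case, and a pair of two different Unique values in the other. With your code, there is really no way to say what either expression means. They can only be understood by considering the actual sequence of operations the program carries out at runtime, and control over that is exactly what you give up when you use unsafePerformIO. unsafePerformIO says that it does not matter whether either expression is compiled as one execution of newUnique or two, and any implementation of Haskell is free to choose whichever it likes each and every time it encounters such code.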
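The difference between the two expressions becomes explicit once uniqueness lives in IO. Here is a minimal sketch of that, using newUnique :: IO Unique from the Data.Unique module in base (the library mentioned in the question) rather than the question's own module. The names sharedPair and freshPair are made up for illustration.

```haskell
import Data.Unique (Unique, newUnique)

-- One allocation, used twice: both components are the same Unique.
sharedPair :: IO (Unique, Unique)
sharedPair = do
  x <- newUnique
  return (x, x)

-- Two allocations: the components are two different Uniques.
freshPair :: IO (Unique, Unique)
freshPair = do
  x <- newUnique
  y <- newUnique
  return (x, y)

main :: IO ()
main = do
  (a, b) <- sharedPair
  (c, d) <- freshPair
  print (a == b)  -- True
  print (c == d)  -- False
```

In IO the sequencing is part of the program's meaning, so the optimizer is not free to merge the two newUnique executions in freshPair.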

Answer 2

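The purpose of unsafePerformIO is for when your function performs some action internally but has no side effects that an observer would notice. ~~For example, a function that takes a vector, copies it, quicksorts the copy, and then returns the copy.~~ (See comments.) Each of these operations has side effects, and so is in IO, but the overall result does not.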
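When the internal effects are only local mutation, the ST monad offers the same "impure inside, pure outside" shape without making any promise to the compiler, because runST is checked by the type system. The sketch below deliberately swaps ST in for unsafePerformIO; sumST is a hypothetical name chosen for illustration and does not come from the answer.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, modifySTRef', readSTRef)

-- Sums a list with a mutable accumulator that never escapes runST,
-- so callers only ever see a pure function.
sumST :: [Int] -> Int
sumST xs = runST $ do
  acc <- newSTRef 0
  mapM_ (\x -> modifySTRef' acc (+ x)) xs
  readSTRef acc

main :: IO ()
main = print (sumST [1, 2, 3])  -- 6
```

Calling sumST twice with the same list always gives the same answer, which is exactly the property newUnique () cannot have.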

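newUnique must be an IO action because it produces something different every time. This is basically the definition of IO: it means a verb, as opposed to functions, which are adjectives. A function will always return the same result for the same arguments. This is called referential transparency.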
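As a rough sketch of what staying in IO could look like for the question's module, the version below keeps the same newtype and global counter but gives newUnique the type IO Unique. This is an illustration under assumptions, not the asker's code or any library's implementation: the NOINLINE pragma and the use of atomicModifyIORef' are choices made here.

```haskell
import Data.IORef (IORef, newIORef, atomicModifyIORef')
import System.IO.Unsafe (unsafePerformIO)

newtype Unique = U Integer
  deriving (Eq, Show)

-- Global counter. NOINLINE keeps GHC from duplicating the newIORef call.
{-# NOINLINE counter #-}
counter :: IORef Integer
counter = unsafePerformIO (newIORef 0)

-- An IO action: every execution reads the counter and bumps it atomically.
newUnique :: IO Unique
newUnique = atomicModifyIORef' counter (\n -> (n + 1, U n))

main :: IO ()
main = do
  a <- newUnique
  b <- newUnique
  print (a, b)  -- (U 0,U 1)
```

The only remaining unsafePerformIO is the one that creates the counter, and that use keeps its promise: counter denotes the same IORef no matter how often it is mentioned.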

For valid uses of unsafePerformIO, see this question.

Answer 3

Yes, your module is dangerous. Consider this example:

module Main where
import Unique

main = do
  print $ newUnique ()
  print $ newUnique ()

Compile and run:

$ ghc Main.hs
$ ./Main
U 0
U 1

Compile with optimization and run:

$ \rm *.{hi,o}
$ ghc -O Main.hs
$ ./Main
U 0
U 0

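Uh oh!

Adding {-# NOINLINE counter #-} and {-# NOINLINE newUnique #-} does not help, so I'm actually not sure what is happening here ...

First Update

Looking at the GHC core, I see that @LambdaFairy was correct that constant subexpression elimination (CSE) caused my newUnique () expressions to be lifted. However, preventing CSE with -fno-cse and adding {-# NOINLINE counter #-} to Unique.hs is not sufficient to make the optimized program print the same as the unoptimized program! ~~In particular, it seems that counter is inlined even with the NOINLINE pragma in Unique.hs.~~ Does anyone understand why?

I've uploaded the full versions of the following core files at https://gist.github.com/ntc2/6986500

The (relevant) core for main when compiling with -O: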

main3 :: Unique.Unique
[GblId,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=False,
         ConLike=False, Cheap=False, Expandable=False,
         Guidance=IF_ARGS [] 20 0}]
main3 = Unique.newUnique ()

main2 :: [Char]
[GblId,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=False,
         ConLike=False, Cheap=False, Expandable=False,
         Guidance=IF_ARGS [] 40 0}]
main2 =
  Unique.$w$cshowsPrec 0 main3 ([] @ Char)

main4 :: [Char]
[GblId,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=False,
         ConLike=False, Cheap=False, Expandable=False,
         Guidance=IF_ARGS [] 40 0}]
main4 =
  Unique.$w$cshowsPrec 0 main3 ([] @ Char)

main1
  :: State# RealWorld
     -> (# State# RealWorld, () #)
[GblId,
 Arity=1,

 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=1, Value=True,
         ConLike=True, Cheap=True, Expandable=True,
         Guidance=IF_ARGS [0] 110 0}]
main1 =
  \ (eta_B1 :: State# RealWorld) ->
    case Handle.Text.hPutStr2
           Handle.FD.stdout main4 True eta_B1
    of _ { (# new_s_atQ, _ #) ->
    Handle.Text.hPutStr2
      Handle.FD.stdout main2 True new_s_atQ
    }

Note that the newUnique () calls have been lifted and bound to main3.

Now when compiling with -O -fno-cse:

main3 :: Unique.Unique
[GblId,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=False,
         ConLike=False, Cheap=False, Expandable=False,
         Guidance=IF_ARGS [] 20 0}]
main3 = Unique.newUnique ()

main2 :: [Char]
[GblId,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=False,
         ConLike=False, Cheap=False, Expandable=False,
         Guidance=IF_ARGS [] 40 0}]
main2 =
  Unique.$w$cshowsPrec 0 main3 ([] @ Char)

main5 :: Unique.Unique
[GblId,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=False,
         ConLike=False, Cheap=False, Expandable=False,
         Guidance=IF_ARGS [] 20 0}]
main5 = Unique.newUnique ()

main4 :: [Char]
[GblId,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=False,
         ConLike=False, Cheap=False, Expandable=False,
         Guidance=IF_ARGS [] 40 0}]
main4 =
  Unique.$w$cshowsPrec 0 main5 ([] @ Char)

main1
  :: State# RealWorld
     -> (# State# RealWorld, () #)
[GblId,
 Arity=1,

 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=1, Value=True,
         ConLike=True, Cheap=True, Expandable=True,
         Guidance=IF_ARGS [0] 110 0}]
main1 =
  \ (eta_B1 :: State# RealWorld) ->
    case Handle.Text.hPutStr2
           Handle.FD.stdout main4 True eta_B1
    of _ { (# new_s_atV, _ #) ->
    Handle.Text.hPutStr2
      Handle.FD.stdout main2 True new_s_atV
    }

Note that main3 and main5 are the two separate newUnique () calls.

However:

rm *.hi *o Main
ghc -O -fno-cse Main.hs && ./Main
U 0
U 0

Looking at the core for this modified Unique.hs:

module Unique (newUnique) where

import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

-- Type to represent a unique thing.
-- Show is derived just for testing purposes.
newtype Unique = U Integer
  deriving Show

{-# NOINLINE counter #-}
counter :: IORef Integer
counter = unsafePerformIO $ newIORef 0

newUnique' :: IO Unique
newUnique' = do { x <- readIORef counter
                ; writeIORef counter (x+1)
                ; return $ U x }

{-# NOINLINE newUnique #-}
newUnique :: () -> Unique
newUnique () = unsafePerformIO newUnique'

~~It appears that counter is inlined as counter_rag, despite the NOINLINE pragma~~ (second update: wrong! counter_rag is not marked with [InlPrag=NOINLINE], but that does not mean it has been inlined; rather, counter_rag is just the name of counter); the NOINLINE for newUnique is respected, though:

counter_rag :: IORef Type.Integer

counter_rag =
  unsafeDupablePerformIO
    @ (IORef Type.Integer)
    (lvl1_rvg
     `cast` (Sym (NTCo:IO <IORef Type.Integer>)
             :: (State# RealWorld
                 -> (# State# RealWorld, IORef Type.Integer #))
                  ~#
                IO (IORef Type.Integer)))

[...]

lvl3_rvi
  :: State# RealWorld
     -> (# State# RealWorld, Unique.Unique #)
[GblId, Arity=1]
lvl3_rvi =
  \ (s_aqi :: State# RealWorld) ->
    case noDuplicate# s_aqi of s'_aqj { __DEFAULT ->
    case counter_rag
         `cast` (NTCo:IORef <Type.Integer>
                 :: IORef Type.Integer ~# STRef RealWorld Type.Integer)
    of _ { STRef var#_au4 ->
    case readMutVar#
           @ RealWorld @ Type.Integer var#_au4 s'_aqj
    of _ { (# new_s_atV, a_atW #) ->
    case writeMutVar#
           @ RealWorld
           @ Type.Integer
           var#_au4
           (Type.plusInteger a_atW lvl2_rvh)
           new_s_atV
    of s2#_auo { __DEFAULT ->
    (# s2#_auo,
       a_atW
       `cast` (Sym (Unique.NTCo:Unique)
               :: Type.Integer ~# Unique.Unique) #)
    }
    }
    }
    }

lvl4_rvj :: Unique.Unique

lvl4_rvj =
  unsafeDupablePerformIO
    @ Unique.Unique
    (lvl3_rvi
     `cast` (Sym (NTCo:IO <Unique.Unique>)
             :: (State# RealWorld
                 -> (# State# RealWorld, Unique.Unique #))
                  ~#
                IO Unique.Unique))

Unique.newUnique [InlPrag=NOINLINE] :: () -> Unique.Unique

Unique.newUnique =
  \ (ds_dq8 :: ()) -> case ds_dq8 of _ { () -> lvl4_rvj }

What is going on here?

Second Update

User @errge figured it out. Looking more carefully at the last core output pasted above, we see that most of the body of Unique.newUnique has been floated out to the top level as lvl4_rvj. However, lvl4_rvj is a constant expression, not a function, so it is only evaluated once, which explains the repeated U 0 output from main.

Indeed:

rm *.hi *o Main
ghc -O -fno-cse -fno-full-laziness Main.hs && ./Main
U 0
U 1

I don't understand exactly what the -ffull-laziness optimization does (the GHC docs talk about floating let bindings, but the body of lvl4_rvj does not appear to have been a let binding), but we can at least compare the core above with the core generated with -fno-full-laziness and see that now the body is not lifted:

Unique.newUnique [InlPrag=NOINLINE] :: () -> Unique.Unique

Unique.newUnique =
  \ (ds_drR :: ()) ->
    case ds_drR of _ { () ->
    unsafeDupablePerformIO
      @ Unique.Unique
      ((\ (s_as1 :: State# RealWorld) ->
          case noDuplicate# s_as1 of s'_as2 { __DEFAULT ->
          case counter_rfj
               `cast` (<NTCo:IORef> <Type.Integer>
                       :: IORef Type.Integer ~# STRef RealWorld Type.Integer)
          of _ { STRef var#_avI ->
          case readMutVar#
                 @ RealWorld @ Type.Integer var#_avI s'_as2
          of _ { (# ipv_avz, ipv1_avA #) ->
          case writeMutVar#
                 @ RealWorld
                 @ Type.Integer
                 var#_avI
                 (Type.plusInteger ipv1_avA (__integer 1))
                 ipv_avz
          of s2#_aw2 { __DEFAULT ->
          (# s2#_aw2,
             ipv1_avA
             `cast` (Sym <(Unique.NTCo:Unique)>
                     :: Type.Integer ~# Unique.Unique) #)
          }
          }
          }
          })
       `cast` (Sym <(NTCo:IO <Unique.Unique>)>
               :: (State# RealWorld
                   -> (# State# RealWorld, Unique.Unique #))
                    ~#
                  IO Unique.Unique))
    }

Here counter_rfj corresponds to counter again, and we see that the difference is that the body of Unique.newUnique has not been lifted, so the reference-updating code (readMutVar, writeMutVar) will be run each time Unique.newUnique is called.

I've updated the gist to include the new -fno-full-laziness core file. The earlier core files were generated on a different computer, so some of the minor differences here are unrelated to -fno-full-laziness.

Answer 4

See also another example:

module Main where
import Unique

helper :: Int -> Unique
-- noinline pragma here doesn't matter
helper x = newUnique ()

main = do
  print $ helper 3
  print $ helper 4

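With this code the effect is the same as in ntc2's example: correct with -O0, but incorrect with -O. But in this code there is no "common subexpression to eliminate".

What is actually happening here is that the newUnique () expression is "floated out" to the top level, because it does not depend on the function's parameters. In GHC terms this is -ffull-laziness (on by default with -O; it can be turned off with -O -fno-full-laziness).

So the code effectively becomes this: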

helperworker = newUnique ()
helper x = helperworker

And here helperworker is a thunk that can only be evaluated once.
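The "evaluated only once" behaviour of such a top-level thunk can be observed directly. The following is a small sketch written for illustration, not the code GHC actually generates: it writes the floated-out form by hand and counts how often the thunk's body runs. The names evaluations and helperWorker are hypothetical.

```haskell
import Data.IORef (IORef, newIORef, modifyIORef', readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Counts how many times the body of helperWorker has been run.
{-# NOINLINE evaluations #-}
evaluations :: IORef Int
evaluations = unsafePerformIO (newIORef 0)

-- Hand-written stand-in for the floated-out expression: a top-level
-- constant, so its body runs the first time it is demanded and the
-- result is then reused.
{-# NOINLINE helperWorker #-}
helperWorker :: Int
helperWorker = unsafePerformIO $ do
  modifyIORef' evaluations (+ 1)
  readIORef evaluations

-- Ignores its argument, just like helper in the example above.
helper :: Int -> Int
helper _ = helperWorker

main :: IO ()
main = do
  print (helper 3)
  print (helper 4)
  readIORef evaluations >>= print
```

Both calls to helper should print 1 and the final count should also be 1, mirroring the repeated U 0 seen with -O.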

With the already-recommended NOINLINE pragmas in place, if you add -fno-full-laziness to the command line, it works as expected.

Am I abusing unsafePerformIO?
