I want to hard-code a map in Haskell. I can see at least three ways to do it:
Using multiple equations:
message 200 = "OK"
message 404 = "Not found"
...
Using a case expression:
message s = case s of
  200 -> "OK"
  404 -> "Not found"
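Actually using a Map.
Which approach is the most efficient? Is one solution faster than the others, and if so, why? Are the first two solutions equivalent (will the compiler generate the same code for them)? And which is the recommended, most readable way?
(Note that I use Int in my example, but that is not essential. The keys might also be Strings, so I am interested in both cases.)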
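To make the String-keyed variant concrete, here is a minimal sketch of the equation style and the Map style side by side. The keys "ok" and "missing" and the names describeEq, describeTable and describeMap are made up for illustration; they are not part of the original question.

```haskell
import qualified Data.Map as Map

-- Option 1: one equation per key (hypothetical keys).
describeEq :: String -> String
describeEq "ok"      = "OK"
describeEq "missing" = "Not found"
describeEq _         = "Unknown"

-- Option 3: a top-level Map with the same hypothetical keys.
describeTable :: Map.Map String String
describeTable = Map.fromList [("ok", "OK"), ("missing", "Not found")]

-- Lookup returns Nothing for keys that are not in the table.
describeMap :: String -> Maybe String
describeMap key = Map.lookup key describeTable
```

Both forms answer the same queries; they differ in how a missing key is reported (a fallback equation versus Nothing).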
Answer 0 (score: 6)
Pattern matching on an Int happens in O(log(n)) time, just like a map lookup.
Consider the following code, compiled with ghc -S:
module F (
f
) where
f :: Int -> String
f 0 = "Zero"
f 1 = "One"
f 2 = "Two"
f 3 = "Three"
f 4 = "Four"
f 5 = "Five"
f 6 = "Six"
f 7 = "Seven"
f _ = "Undefined"
The compiled assembly code is:
.text
.align 4,0x90
.long _F_f_srt-(_sl8_info)+0
.long 0
.long 65568
_sl8_info:
.Lcma:
movl 3(%esi),%eax
cmpl $4,%eax
jl .Lcmq
cmpl $6,%eax
jl .Lcmi
cmpl $7,%eax
jl .Lcme
cmpl $7,%eax
jne .Lcmc
movl $_ghczmprim_GHCziCString_unpackCStringzh_closure,%esi
movl $_cm7_str,0(%ebp)
jmp _stg_ap_n_fast
.Lcmc:
movl $_ghczmprim_GHCziCString_unpackCStringzh_closure,%esi
movl $_clB_str,0(%ebp)
jmp _stg_ap_n_fast
.Lcme:
cmpl $6,%eax
jne .Lcmc
movl $_ghczmprim_GHCziCString_unpackCStringzh_closure,%esi
movl $_cm3_str,0(%ebp)
jmp _stg_ap_n_fast
.Lcmg:
cmpl $4,%eax
jne .Lcmc
movl $_ghczmprim_GHCziCString_unpackCStringzh_closure,%esi
movl $_clV_str,0(%ebp)
jmp _stg_ap_n_fast
.Lcmi:
cmpl $5,%eax
jl .Lcmg
cmpl $5,%eax
jne .Lcmc
movl $_ghczmprim_GHCziCString_unpackCStringzh_closure,%esi
movl $_clZ_str,0(%ebp)
jmp _stg_ap_n_fast
.Lcmk:
cmpl $2,%eax
jne .Lcmc
movl $_ghczmprim_GHCziCString_unpackCStringzh_closure,%esi
movl $_clN_str,0(%ebp)
jmp _stg_ap_n_fast
.Lcmm:
testl %eax,%eax
jne .Lcmc
movl $_ghczmprim_GHCziCString_unpackCStringzh_closure,%esi
movl $_clF_str,0(%ebp)
jmp _stg_ap_n_fast
.Lcmo:
cmpl $1,%eax
jl .Lcmm
cmpl $1,%eax
jne .Lcmc
movl $_ghczmprim_GHCziCString_unpackCStringzh_closure,%esi
movl $_clJ_str,0(%ebp)
jmp _stg_ap_n_fast
.Lcmq:
cmpl $2,%eax
jl .Lcmo
cmpl $3,%eax
jl .Lcmk
cmpl $3,%eax
jne .Lcmc
movl $_ghczmprim_GHCziCString_unpackCStringzh_closure,%esi
movl $_clR_str,0(%ebp)
jmp _stg_ap_n_fast
.text
.align 4,0x90
.long _F_f_srt-(_F_f_info)+0
.long 65541
.long 0
.long 65551
.globl _F_f_info
_F_f_info:
.Lcmu:
movl 0(%ebp),%esi
movl $_sl8_info,0(%ebp)
testl $3,%esi
jne .Lcmx
jmp *(%esi)
.Lcmx:
jmp _sl8_info
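This is doing a binary search on the integer argument. .Lcma branches on < 4, then < 6, then < 7. The first comparison goes to .Lcmq, which branches on < 2, then < 3. The first comparison there goes to .Lcmo, which branches on < 1.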
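The same decision tree can be written back in Haskell to make the shape easier to see. This is a hand-written sketch that mirrors the comparisons in the listing above; fTree and eq are hypothetical names, and the mapping of each leaf to its string is inferred from the source of f rather than read from the assembly.

```haskell
-- Hand-written mirror of the comparison tree in the assembly above.
fTree :: Int -> String
fTree n
  | n < 4     = lower                                     -- .Lcmq
  | n < 6     = if n < 5 then eq 4 "Four" else eq 5 "Five" -- .Lcmi / .Lcmg
  | n < 7     = eq 6 "Six"                                -- .Lcme
  | otherwise = eq 7 "Seven"
  where
    lower
      | n < 2     = if n < 1 then eq 0 "Zero" else eq 1 "One" -- .Lcmo / .Lcmm
      | n < 3     = eq 2 "Two"                                -- .Lcmk
      | otherwise = eq 3 "Three"
    -- Final equality test; anything else falls through to the default (.Lcmc).
    eq :: Int -> String -> String
    eq k s = if n == k then s else "Undefined"
```

Each input goes through at most three ordering comparisons plus one equality test, which is where the logarithmic cost comes from.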
With ghc -O2 -S we get the following, where we can see the same pattern:
.text
.align 4,0x90
.long _F_zdwf_srt-(_F_zdwf_info)+0
.long 65540
.long 0
.long 33488911
.globl _F_zdwf_info
_F_zdwf_info:
.LcqO:
movl 0(%ebp),%eax
cmpl $4,%eax
jl .Lcr6
cmpl $6,%eax
jl .LcqY
cmpl $7,%eax
jl .LcqU
cmpl $7,%eax
jne .LcqS
movl $_F_f1_closure,%esi
addl $4,%ebp
andl $-4,%esi
jmp *(%esi)
.LcqS:
movl $_F_f9_closure,%esi
addl $4,%ebp
andl $-4,%esi
jmp *(%esi)
.LcqU:
cmpl $6,%eax
jne .LcqS
movl $_F_f2_closure,%esi
addl $4,%ebp
andl $-4,%esi
jmp *(%esi)
.LcqW:
cmpl $4,%eax
jne .LcqS
movl $_F_f4_closure,%esi
addl $4,%ebp
andl $-4,%esi
jmp *(%esi)
.LcqY:
cmpl $5,%eax
jl .LcqW
cmpl $5,%eax
jne .LcqS
movl $_F_f3_closure,%esi
addl $4,%ebp
andl $-4,%esi
jmp *(%esi)
.Lcr0:
cmpl $2,%eax
jne .LcqS
movl $_F_f6_closure,%esi
addl $4,%ebp
andl $-4,%esi
jmp *(%esi)
.Lcr2:
testl %eax,%eax
jne .LcqS
movl $_F_f8_closure,%esi
addl $4,%ebp
andl $-4,%esi
jmp *(%esi)
.Lcr4:
cmpl $1,%eax
jl .Lcr2
cmpl $1,%eax
jne .LcqS
movl $_F_f7_closure,%esi
addl $4,%ebp
andl $-4,%esi
jmp *(%esi)
.Lcr6:
cmpl $2,%eax
jl .Lcr4
cmpl $3,%eax
jl .Lcr0
cmpl $3,%eax
jne .LcqS
movl $_F_f5_closure,%esi
addl $4,%ebp
andl $-4,%esi
jmp *(%esi)
.section .data
.align 4
.align 1
_F_f_srt:
.long _F_zdwf_closure
.data
.align 4
.align 1
.globl _F_f_closure
_F_f_closure:
.long _F_f_info
.long 0
.text
.align 4,0x90
.long _F_f_srt-(_srh_info)+0
.long 0
.long 65568
_srh_info:
.Lcrv:
movl 3(%esi),%eax
movl %eax,0(%ebp)
jmp _F_zdwf_info
.text
.align 4,0x90
.long _F_f_srt-(_F_f_info)+0
.long 65541
.long 0
.long 65551
.globl _F_f_info
_F_f_info:
.Lcrz:
movl 0(%ebp),%esi
movl $_srh_info,0(%ebp)
testl $3,%esi
jne _srh_info
jmp *(%esi)
If we change the original code to
f :: Int -> String
f 1 = "Zero"
f 2 = "One"
f 3 = "Two"
f 4 = "Three"
f 5 = "Four"
f 6 = "Five"
f 7 = "Six"
f 8 = "Seven"
f _ = "Undefined"
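the branches are on < 5, < 7, < 8, with the < 5 case going on to < 3, < 4, and so on, so the compiler probably does this by sorting the literals. We can test that by scrambling the numbers, and even adding gaps between them: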
f :: Int -> String
f 20 = "Zero"
f 80 = "One"
f 70 = "Two"
f 30 = "Three"
f 40 = "Four"
f 50 = "Five"
f 10 = "Six"
f 60 = "Seven"
f _ = "Undefined"
Sure enough, the branches are still on < 50, < 70, < 80, with the < 50 case going on to < 30, < 40, and so on.
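For intuition only, here is a sketch of how a sorted list of literals can be turned into a balanced search tree, so that the order in which the equations are written stops mattering. This is not GHC's actual implementation; Tree, build and search are hypothetical names.

```haskell
import Data.List (sortOn)

-- A binary search tree over the literal keys.
data Tree = Leaf | Node Tree Int String Tree

-- Sort the alternatives by key, then split at the middle recursively.
build :: [(Int, String)] -> Tree
build = go . sortOn fst
  where
    go [] = Leaf
    go xs =
      -- For non-empty xs the second half always has at least one element.
      let (l, (k, v) : r) = splitAt (length xs `div` 2) xs
      in Node (go l) k v (go r)

-- Walk the tree; keys that are absent fall through to the default string.
search :: Tree -> Int -> String
search Leaf _ = "Undefined"
search (Node l k v r) n = case compare n k of
  LT -> search l n
  EQ -> v
  GT -> search r n
```

Building the tree from the scrambled keys 20, 80, 70, 30, 40, 50, 10, 60 yields the same tree as building it from the sorted list, which matches the observation that the branch order did not change.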
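Answer 1 (score: 3)
Answer 2 (score: 1)
case ... of and multiple equations are exactly the same. They compile to the same Core. For a large number of cases, you should do this: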
import qualified Data.Map as Map
message =
  let
    theMap = Map.fromList [ (200, "OK"), (404, "Not found"), ... ]
  in
    \x -> Map.lookup x theMap
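This constructs the map only once. If you don't like the Maybe String return type, you can apply fromMaybe to the result.
For a small number of cases (especially if the keys are integers), a case statement may be faster if the compiler can turn it into a jump table.
In an ideal world, GHC would pick the right version automatically.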
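A minimal sketch of the fromMaybe variant, assuming only the two entries shown above. The names messageTable and messageOr are hypothetical, and the default string is supplied by the caller.

```haskell
import qualified Data.Map as Map
import Data.Maybe (fromMaybe)

-- Top-level table, so it is shared between calls.
messageTable :: Map.Map Int String
messageTable = Map.fromList [(200, "OK"), (404, "Not found")]

-- Look up a code, falling back to the given default when it is missing.
messageOr :: String -> Int -> String
messageOr def code = fromMaybe def (Map.lookup code messageTable)
```

For example, messageOr "Unknown" 404 gives "Not found", and messageOr "Unknown" 500 gives "Unknown". Map.findWithDefault does the same thing in a single call.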