我正在写一个遗传算法来生成字符串“helloworld”。但是当n为10,000或更多时,evolve函数会产生堆栈溢出。
module Genetics where
import Data.List (sortBy)
import Random (randomRIO)
import Control.Monad (foldM)
class Gene g where
-- How ideal is the gene from 0.0 to 1.0?
fitness :: g -> Float
-- How does a gene mutate?
mutate :: g -> IO g
-- How many species will be explored?
species :: [g] -> Int
orderFitness :: (Gene g) => [g] -> [g]
orderFitness = reverse . sortBy (\a b -> compare (fitness a) (fitness b))
compete :: (Gene g) => [g] -> IO [g]
compete pool = do
let s = species pool
variants <- (mapM (mapM mutate) . map (replicate s)) pool
let pool' = (map head . map orderFitness) variants
return pool'
evolve :: (Gene g) => Int -> [g] -> IO [g]
evolve 0 pool = return pool
evolve n pool = do
pool' <- compete pool
evolve (n - 1) pool'
对于species pool = 8
,8个基因库重复到8个组。每组变异,并选择每组最适合进一步进化(回到8个基因)。
答案 0 :(得分:3)
如果您对性能感兴趣,我会使用快速随机数生成器,例如:
其次,compete
看起来非常可疑,因为它完全是懒惰的,尽管建造了一些潜在的大型结构。尝试使用deepseq锤子重写它有点严格:
import Control.DeepSeq
compete :: (Gene g, NFData g) => [g] -> IO [g]
compete pool = do
let s = species pool
variants <- (mapM (mapM mutate) . map (replicate s)) pool
let pool' = (map head . map orderFitness) variants
pool' `deepseq` return pool'
但这些东西都不需要在IO中,(单独的问题)。像Rand
monad这样的东西可能是more appropriate。
答案 1 :(得分:2)
感谢Don的deepseq
建议,我能够将问题范围缩小到mapM mutate
,因为太多了。新版本有mutate'
,使用seq
来阻止上传。
module Genetics where
import Data.List (maximumBy)
import Random (randomRIO)
class Gene g where
-- How ideal is the gene from 0.0 to 1.0?
fitness :: g -> Float
-- How does a gene mutate?
mutate :: g -> IO g
-- How many species will be explored in each round?
species :: [g] -> Int
best :: (Gene g) => [g] -> g
best = maximumBy (\a b -> compare (fitness a) (fitness b))
-- Prevents stack overflow
mutate' :: (Gene g) => g -> IO g
mutate' gene = do
gene' <- mutate gene
gene' `seq` return gene'
drift :: (Gene g) => [[g]] -> IO [[g]]
drift = mapM (mapM mutate')
compete :: (Gene g) => [g] -> IO [g]
compete pool = do
let islands = map (replicate (species pool)) pool
islands' <- drift islands
let representatives = map best islands'
return representatives
evolve :: (Gene g) => Int -> [g] -> IO [g]
evolve 0 pool = return pool
evolve n pool = compete pool >>= evolve (n - 1)
答案 2 :(得分:1)
您可以使用(map head . map orderFitness)
和单orderFitness
,而不是使用sortBy
maximumBy
map
deepseq
。这不会节省太多(因为你从O(n log n)到O(n)并且可能从消除双映射获得另一个因子2),但是至少有点简单和更有效。你也可以摆脱逆转的呼吁。
我怀疑这可以在没有head . sortBy
的情况下解决问题,但它应该是一种改进。
编辑:如果标准库和GHC是完美的,那么maximumBy
将生成与map head . map sortBy
完全相同的代码,而map (head . sortBy)
会生成与sortBy
相同的代码,遗憾的是这些代码都不是事情在实践中可能是真实的。 maximumBy
会倾向于做一堆额外的内存分配,因为它是一种分而治之的算法。组合地图是您有时获得的优化,但不应指望。
更重要的是,使用{{1}}更具说明性。更容易看到代码的作用以及需要多长时间。利用优化也应该更容易,因为我们知道目标是什么,而不仅仅是我们如何得到它。