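Based on another thread on Stack Exchange, I tried using Data.Set.intersection instead of Data.List's intersect to improve the running time of repeated intersections on short lists of integers (fewer than 5 elements each). Trying a simple example in the GHCi console, I saw a significant performance difference: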
intersect [1..10000] [5000..200000]
(16.04 secs, 21919236 bytes)
versus
S.intersection (S.fromList [1..10000]) (S.fromList [5000..200000])
(0.25 secs, 125337020 bytes)
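For anyone who wants to reproduce this outside a one-off GHCi line, the two calls can be wrapped as below. This is only a sketch: `listCommon` and `setCommon` are names I made up for illustration. As far as I understand, `intersect` from Data.List scans the second list once for every element of the first, while the Data.Set version pays up front for building two trees and then intersects those.

```haskell
import Data.List (intersect)
import qualified Data.Set as S

-- List-based intersection: each element of xs is looked up in ys by a linear scan.
listCommon :: [Integer] -> [Integer] -> [Integer]
listCommon xs ys = xs `intersect` ys

-- Set-based intersection: both lists are first converted to sets,
-- the sets are intersected, and the result is returned as an ascending list.
setCommon :: [Integer] -> [Integer] -> [Integer]
setCommon xs ys = S.toList (S.intersection (S.fromList xs) (S.fromList ys))
```

With `:set +s` enabled in GHCi, evaluating `length (listCommon [1..10000] [5000..200000])` and the `setCommon` equivalent should show the same kind of gap as the timings above.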
I hoped to modify my algorithm and see a similarly dramatic speedup when intersecting small integer lists:
-- check takes a list of integer lists and then checks if another integer n is
-- "swappable". Swappable integers are two integers that, when concatenated, are prime.
-- Any new sets created with n are returned.
check :: [[Integer]] -> Integer -> [[Integer]]
check [] n = [[n]]
check (x:xs) n
    | nonswappable == [] = (x ++ [n]) : check xs n
    | otherwise          = check (removecandidates xs) n
  where
    -- nonswappable is empty if the input n is swappable with everything in x
    -- isswappable :: Integer -> Integer -> Bool concatenates two numbers and checks if they are prime
    nonswappable = snd (partition (isswappable n) x)
    -- removecandidates strikes out the candidate lists that contain a
    -- nonswappable number from the set just checked
    removecandidates list = filter ((== []) . intersect nonswappable) list
The relevant profile:
total time  = 5.30 secs (5304 ticks @ 1000 us, 1 processor)
total alloc = 2,676,617,968 bytes (excludes profiling overheads)

COST CENTRE             MODULE  %time  %alloc
powm                    Main     29.3    36.2
isswappable             Main     26.3    25.3
check.removecandidates  Main     15.9    18.5
probmrisPrime.randomAs  Main      6.9     3.8

                                              individual     inherited
COST CENTRE             MODULE  no.  entries  %time  %alloc  %time  %alloc
primes                  Main    110        1    0.0     0.0    0.2     0.5
sieve                   Main    111      573    0.2     0.5    0.2     0.5
findsets                Main    104        1    0.0     0.0   99.8    99.5
sets                    Main    105      571    0.2     0.2   99.8    99.5
check                   Main    106    55704    0.2     0.1   99.6    99.3
check.removecandidates  Main    126    52920   15.9    18.5   15.9    18.5
check.nonswappable      Main    107    55135    0.4     0.2   83.5    80.7
I then changed it to use Data.Set, expecting a result similar to the example above:
-- check takes a list of integer lists and then checks if another integer n is
-- "swappable". Swappable integers are two integers that, when concatenated, are prime.
-- Any new sets created with n are returned.
check :: [[Integer]] -> Integer -> [[Integer]]
check [] n = [[n]]
check (x:xs) n
    | nonswappable == [] = (x ++ [n]) : check xs n
    | otherwise          = check (removecandidates xs) n
  where
    -- nonswappable is empty if the input n is swappable with everything in x
    -- isswappable :: Integer -> Integer -> Bool concatenates two numbers and checks if they are prime
    nonswappable = snd (partition (isswappable n) x)
    -- removecandidates strikes out the candidate lists that contain a
    -- nonswappable number from the set just checked
    removecandidates list = filter ((== []) . S.toList . S.intersection setnonswappable . setx) list
      where
        setnonswappable = S.fromList nonswappable
        setx lst = S.fromList lst
The relevant profiler details:
total time  = 7.36 secs (7361 ticks @ 1000 us, 1 processor)
total alloc = 3,019,801,528 bytes (excludes profiling overheads)

COST CENTRE                             MODULE  %time  %alloc
check.removecandidates                  Main     30.7    21.8
powm                                    Main     21.6    32.1
isswappable                             Main     20.6    22.4
check.removecandidates.setx             Main      6.8     5.9
probmrisPrime.randomAs                  Main      4.9     3.4

                                                              individual     inherited
COST CENTRE                             MODULE  no.  entries  %time  %alloc  %time  %alloc
MAIN                                    MAIN     63        0    0.0     0.0
primes                                  Main    112        1    0.0     0.0    0.3     0.4
sieve                                   Main    113      573    0.3     0.4    0.3     0.4
findsets                                Main    106        1    0.0     0.0   99.7    99.6
sets                                    Main    107      571    0.3     0.2   99.7    99.6
check                                   Main    108    55704    0.2     0.1   99.4    99.4
check.removecandidates                  Main    128    52920   30.7    21.8   37.5    27.8
check.removecandidates.setx             Main    130  4949177    6.8     5.9    6.8     5.9
check.removecandidates.setnonswappable  Main    129    52446    0.0     0.0    0.0     0.0
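Performance degraded to the point where the function that calls Data.Set.intersection (check.removecandidates) rose to become the largest cost centre! What would you suggest to pinpoint this serious performance problem? Also, is there a completely different approach that would test the intersection of many small integer lists more efficiently?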
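To make the second question concrete, here is a minimal sketch of two alternatives I could try for the "do these lists share an element" test. The names `disjointList`, `removeCandidatesList` and `removeCandidatesSet` are hypothetical and not part of my program. The first keeps everything as plain lists and stops at the first shared element; the second builds the set of nonswappable numbers once and probes it with `S.member`, so no `S.fromList` is called per candidate (the profile above shows `setx` being entered 4,949,177 times). I have not profiled either variant.

```haskell
import qualified Data.Set as S

-- True when the two lists share no element. `any` stops at the first hit,
-- so no intersection list is ever materialised.
disjointList :: [Integer] -> [Integer] -> Bool
disjointList bad candidate = not (any (`elem` candidate) bad)

-- List-only variant: keep the candidates that contain no nonswappable number.
removeCandidatesList :: [Integer] -> [[Integer]] -> [[Integer]]
removeCandidatesList nonswappable = filter (disjointList nonswappable)

-- Set variant: one set is built from the nonswappable numbers and shared by
-- every candidate; the candidates themselves stay ordinary lists.
removeCandidatesSet :: [Integer] -> [[Integer]] -> [[Integer]]
removeCandidatesSet nonswappable = filter (not . any (`S.member` bad))
  where
    bad = S.fromList nonswappable
```

Both functions are meant to keep exactly the candidate lists that the original `filter ((==[]).(intersect nonswappable))` keeps, just without constructing the intersection itself.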
Following the comments, I have also added fully executable code with all the relevant functions.
module Main where

import Data.List
import System.Random
import qualified Data.Set as S

primes :: [Integer]
primes = sieve (2:[3,5..])

sieve [] = []
sieve (x:xs) = x : sieve [y | y <- xs, (y `rem` x) /= 0]

-- isswappable using a probabilistic primality test
isswappable :: Integer -> Integer -> Bool
isswappable x y = (probmrisPrime 3 (read ((show x) ++ (show y))) == "prob prime") &&
                  ((probmrisPrime 3 (read ((show y) ++ (show x))) == "prob prime"))

-- Miller-Rabin probabilistic primality test
-- Input: n > 3, an odd integer to be tested for primality;
-- Input: k, a parameter that determines the accuracy of the test

-- random seed for the Miller-Rabin test
mygen = mkStdGen 32123432141312332130
probmrisPrime k n
    | n `mod` 2 == 0 = "composite"
    | otherwise      = witnessloop n randomAs
  where
    randomAs = take k (randomRs (2, (n-2)) (mygen)) :: [Integer]
    witnessloop _ [] = "prob prime"
    witnessloop n (hd:lst)
        | x == 1 || x == n-1 = witnessloop n (lst)
        | otherwise          = checks (s-1) x
      where
        x = powm hd d n 1
        d = millerrabined n
        s = millerrabines n
        checks 0 _ = "composite"
        checks s x
            | newx == 1   = "composite"
            | newx == n-1 = witnessloop n (lst)
            | otherwise   = checks (s-1) newx
          where
            newx = (x^2) `mod` n

millerrabines num = factor2 (num-1)
millerrabined num = quot (num-1) (2^(millerrabines num))

factor2 num
    | num `rem` 2 == 0 = 1 + factor2 (quot num 2)
    | otherwise        = 0

powm :: Integer -> Integer -> Integer -> Integer -> Integer
powm b 0 m r = r
powm b e m r | e `mod` 2 == 1 = powm (b * b `mod` m) (e `div` 2) m (r * b `mod` m)
powm b e m r = powm (b * b `mod` m) (e `div` 2) m r
sets :: Int -> [[Integer]]
--sets 1 = [[3]]
-- checks all the previous sets with the next prime number and appends any new sets with n to the list
sets 2 = [[3]]
-- start with prime == 7 since any prime with 5 will always divide by 5
sets n = check (take (n-2) findsets) (primes!!n)

check [] n = [[n]]
check (x:xs) n
    | nonswappable == [] = (x ++ [n]) : check xs n
    | otherwise          = check (removecandidates xs) n
  where
    -- returns null if the input n is swappable with everything
    nonswappable = snd (partition (isswappable n) x)
    removecandidates list = [x | x <- list, intersect nonswappable x == []]
    {- replacement logic using Data.Set
    removecandidates list = filter ((==[]).(intersect nonswappable)) list
    removecandidates list = filter ((==[]).(S.toList.S.intersection setnonswappable.setx)) list
      where
        setnonswappable = S.fromList nonswappable
        setx lst = S.fromList lst -}
-- generate all the sets
findsets = concat (map sets [2..])

-- find sets of at least a certain length
findGroups list len = [x | x <- list, length x >= len]

main = do
    putStrLn $ show (take 1 (findGroups (findsets) 5))
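Since `isswappable` accounts for over 20% of individual time in both profiles, one thing I may try is replacing the `show`/`read` round trip with arithmetic. The sketch below uses a hypothetical helper `concatDigits` (not in the program above) and assumes the second argument is a positive integer. Whether it actually helps is untested; it is only meant to show the idea.

```haskell
-- Concatenate the decimal digits of x and y arithmetically instead of going
-- through String. Assumes y >= 1 (as is the case for primes).
concatDigits :: Integer -> Integer -> Integer
concatDigits x y = x * go 10 + y
  where
    -- go finds the smallest power of ten that is strictly greater than y,
    -- which is the factor needed to shift x left by the digit count of y.
    go p
      | y < p     = p
      | otherwise = go (p * 10)
```

In `isswappable`, `read ((show x) ++ (show y))` would then become `concatDigits x y`, and likewise with the arguments swapped.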