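Hi, I am generating a sparse DAG of 1000 x 1000 nodes, each with ~4 edges (one per direction). Here is the relevant code: Full Code with imports. The values in the problem I am solving lie in the range [0-1500], and for now I hard-code 1501 as the upper bound. I am trying to compute the longest path of edges in the DAG. However, these details are not directly part of my question: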
My question is about how to judge when to use force or similar constructs while writing algorithms in Haskell:
type OutGoingEdges = Map.Map NodeId [ NodeId ]
type NodesData = Map.Map NodeId Node
type NodeId = Int
data DAG = AdjList
  { outGoingEdges :: OutGoingEdges
  , nodesData :: NodesData
  } deriving (Eq, Show)
makeDAG :: DAGDataPath -> IO (DAG, SourceNodes)
makeDAG filepath = do
    listOfListOfInts <- makeInteger <$> readLines filepath
    let [width, height] = head listOfListOfInts
        numNodes = width * height
        -- Pad the grid with a sentinel row of 1501s above and below.
        rows = (replicate width 1501) : (drop 1 listOfListOfInts) ++ [(replicate width 1501)]
        -- Pair every height with its node id.
        heightsWithNodeIdsRows = force . fmap (\ (row, rowId) -> fmap (\ (height, colId) -> (height, rowId * width + colId)) $ zip row [1..]) $ zip rows [1..]
        emptyGraph = AdjList Map.empty $ Map.fromList (fmap (\(h, nid) -> (nid, Node h)) . concat . tail . init $ heightsWithNodeIdsRows)
        emptyNodesWithEdges = Set.empty
        threeRowsInOneGo = zip3 heightsWithNodeIdsRows (drop 1 heightsWithNodeIdsRows) (drop 2 heightsWithNodeIdsRows)
        (graph, nodesWithInEdges) = DL.foldl' makeGraph (emptyGraph, emptyNodesWithEdges) threeRowsInOneGo
        -- Source nodes are the nodes that have no incoming edge.
        sourceNodes = Set.difference (Set.fromList . Map.keys . nodesData $ graph) nodesWithInEdges
    -- traceShow [take 10 . Map.keys . nodesData $ graph] (return (Set.toList sourceNodes))
    -- traceShow graph (return (Set.toList sourceNodes))
    -- traceShow sourceNodes (return (Set.toList sourceNodes))
    return (graph, force $ Set.toList sourceNodes)
  where
    makeGraph (graphTillNow, nodesWithInEdges) (prevRow, row, nextRow) =
        let updownEdges = zip3 prevRow row nextRow
            (graph', nodesInEdges') = addEdges (graphTillNow, nodesWithInEdges) updownEdges
            leftRightEdges = zip3 ((1501, 0) : row) (drop 1 row) (drop 2 row)
            (graph'', nodesInEdges'') = addEdges (graph', nodesInEdges') leftRightEdges
            -- The next line is the interesting one: graph'' is a DAG
            -- and nodesInEdges'' is a Set NodeId.
        in (graph'', nodesInEdges'')
    addEdges (g, n) edges =
        DL.foldl' (\ (!g', !n') ((p, pId), (c, cId), (n, nId)) ->
                      let (g'', n'') = if c > p
                                         then (makeEdge cId pId g', Set.insert pId n')
                                         else (g', n')
                          (g''', n''') = if c > n
                                           then (makeEdge cId nId g'', Set.insert nId n'')
                                           else (g'', n'')
                      in (g''', n'''))
                  (g, n)
                  edges
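While profiling I found that if I use (force graph'', force nodesInEdges'') instead of (graph'', nodesInEdges''), my memory usage drops from 3 GB to 600 MB, but the running time of the program goes up from 37 seconds to 69 seconds. These numbers come from the time command and from watching Activity Monitor. I also checked with the profiler, and the results were similar.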
I am compiling a profiling build with:
stack build --executable-profiling --library-profiling --ghc-options="-fprof-auto -auto-all -caf-all -fforce-recomp -rtsopts" --file-watch
I have ghc-7.10.3 and stack 1.1.2.
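I think force is traversing the data structure again and again.
If the graph has already been fully evaluated, is it possible to tell force not to walk through it?
Are there other strategies I could use?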
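For context on the question above: force from Control.DeepSeq keeps no record of what it has already evaluated, so each call walks the whole structure again. One commonly suggested alternative is to make the accumulator strict by construction, so that the evaluation to weak head normal form that foldl' already performs at every step is enough. The following is only a minimal sketch under stated assumptions, not the code from this post: Acc, addEdge and buildEdges are hypothetical names, and the DAG is reduced to a bare edge map.

```haskell
import qualified Data.List as DL
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

type NodeId = Int

-- Hypothetical strict accumulator (not part of the original code).
-- The strictness annotations mean that evaluating an Acc to WHNF also
-- evaluates both containers to WHNF; Map and Set are spine-strict, and
-- Data.Map.Strict evaluates inserted values to WHNF as well.
data Acc = Acc
  { accEdges   :: !(Map.Map NodeId [NodeId])
  , accInEdges :: !(Set.Set NodeId)
  }

-- Add one directed edge and record that its target has an incoming edge.
addEdge :: Acc -> (NodeId, NodeId) -> Acc
addEdge (Acc es ins) (from, to) =
  Acc (Map.insertWith (\_ old -> to : old) from [to] es)
      (Set.insert to ins)

-- foldl' evaluates the accumulator to WHNF at every step, which here is
-- enough to avoid a growing chain of suspended inserts, without any deep
-- traversal of the containers.
buildEdges :: [(NodeId, NodeId)] -> Acc
buildEdges = DL.foldl' addEdge (Acc Map.empty Set.empty)
```

Whether this recovers both the lower memory use and the shorter running time for the program in this post is an assumption that would have to be checked with the profiler.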
Sample input:
2 2 -- width height
1 2
3 4
Output:
3
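The output is the length of the longest path in the graph: [4 -> 2 -> 1], i.e. [(1,1),(0,1),(0,0)]. Just a reminder: the correctness of the program is not the issue; space/time efficiency is. Thanks.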
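To make the sample concrete, here is a small sketch that reproduces the expected result for the 2 x 2 input. It is an illustration only and makes its own assumptions: it does not build the explicit DAG used above, the names Cell, sampleHeights, lowerNeighbours and longestPath are hypothetical, and it is not tuned for the 1000 x 1000 case.

```haskell
import qualified Data.Map as Map

type Cell = (Int, Int)  -- (row, column), zero-based

-- Heights from the sample input, keyed by cell.
sampleHeights :: Map.Map Cell Int
sampleHeights = Map.fromList [((0,0),1), ((0,1),2), ((1,0),3), ((1,1),4)]

-- Strictly lower 4-neighbours of a cell, i.e. the targets of its
-- outgoing edges. Cells outside the grid are simply absent from the map.
lowerNeighbours :: Map.Map Cell Int -> Cell -> [Cell]
lowerNeighbours hs c@(r, col) =
  [ n | n <- [(r-1,col), (r+1,col), (r,col-1), (r,col+1)]
      , Just hn <- [Map.lookup n hs]
      , hn < hs Map.! c ]

-- Number of nodes on the longest downhill path. The lazy Map `best`
-- refers to itself, so each cell's result is computed at most once.
longestPath :: Map.Map Cell Int -> Int
longestPath hs = maximum (0 : Map.elems best)
  where
    best = Map.mapWithKey
             (\c _ -> 1 + maximum (0 : [best Map.! n | n <- lowerNeighbours hs c]))
             hs
```

With these definitions, longestPath sampleHeights evaluates to 3, matching the path 4 -> 2 -> 1.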