我使用Attoparsec编写了一个简单的文本STL(标准细分库)解析器。 STL包含一组方面。每个构面包含一个法线和三角形顶点。典型的STL文件可能很大(约100 Mb或更多)。
STL文件格式是
solid
facet normal nx ny nz
outer loop
vertex x1 y1 z1
vertex x2 y2 z2
vertex x3 y3 z3
endfacet
endsolid
解析器代码在这里。可以在STLReader
找到完整的实施-- | Point represents a vertex of the triangle
data Point a = Point !a !a !a deriving Show
-- | Vector for a given class
data Vector a = Vector !a !a !a deriving Show
-- | Parse coordinate triplet.
coordinates :: (Fractional a) => Text -> (a -> a -> a -> b) -> Parser b
coordinates s f = do
skipSpace
string s
!x <- coordinate
!y <- coordinate
!z <- coordinate
return $! f x y z
where
coordinate = skipWhile isHorizontalSpace *> fmap realToFrac double
{-# INLINE coordinates #-}
type RawFacet a = (Vector a, Point a, Point a, Point a)
-- | Parse a facet. The facet comprises of a normal, and three vertices
facet :: Fractional a => Parser (RawFacet a)
facet = (,,,) <$> beginFacet
<* (skipSpace *> "outer loop")
<*> vertexPoint
<*> vertexPoint
<*> vertexPoint
<* (skipSpace <* "endloop" <* endFacet )
<?> "facet"
where
beginFacet = skipSpace <* "facet" *> coordinates "normal" Vector
endFacet = skipSpace <* string "endfacet"
vertexPoint = coordinates "vertex" Point
{-# INLINE facet #-}
rawFacets :: Fractional a => Parser [RawFacet a]
rawFacets = beginSolid *> many' facet <* endSolid
where
solidName = option "default" (skipWhile isHorizontalSpace *> fmap T.pack (many1 $ satisfy isAlphaNum) )
beginSolid = skipSpace <* "solid" *> solidName <?> "start solid"
endSolid = skipSpace <* "endsolid" <?> "end solid"
-- | Read text STL file. STL extensions for color etc. are not supported in this version.
readTextSTL :: Fractional a => FilePath -> IO (Either String [RawFacet a])
readTextSTL path = liftM (Al.eitherResult . Al.parse rawFacets) (TIO.readFile path)
main :: IO Int
main = do
(path:_) <- getArgs
putStrLn $ "Parsing STL file: " ++ path
s <- readTextSTL path
putStrLn "Parsing complete"
case s of
Left error -> putStrLn error
Right s -> putStrLn $ "Num facets : " ++ show (length s)
return 0
我使用Meshlab提供的'c'解析器对此代码进行基准测试。当我用69Mb扫描测试时,meshlab在大约13秒内完成了这项工作,而用attoparsec花了22秒。因此虽然attoparsec使我能够比Parsec更快地解析(parsec大约是36秒),但我还有很长的路要走。
如何进一步改进此解析器?