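I'm writing a small interpreter in Scala and I'm having trouble parsing Scheme lists. My code parses lists containing multiple numbers, identifiers, and booleans, but it chokes when I try to parse a list containing multiple strings or nested lists. What am I missing?
Here is my parser: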
class SchemeParsers extends RegexParsers {
// Scheme boolean #t and #f translate to Scala's true and false
def bool : Parser[Boolean] =
("#t" | "#f") ^^ {case "#t" => true; case "#f" => false}
// A Scheme identifier allows alphanumeric chars, some symbols, and
// can't start with a digit
def id : Parser[String] =
"""[a-zA-Z=*+/<>!\?][a-zA-Z0-9=*+/<>!\?]*""".r ^^ {case s => s}
// This interpreter only accepts numbers as integers
def num : Parser[Int] = """-?\d+""".r ^^ {case s => s toInt}
// A string can have any character except ", and is wrapped in "
def str : Parser[String] = '"' ~> """[^""]*""".r <~ '"' ^^ {case s => s}
// A Scheme list is a series of expressions wrapped in ()
def list : Parser[List[Any]] =
'(' ~> rep(expr) <~ ')' ^^ {s: List[Any] => s}
// A Scheme expression contains any of the other constructions
def expr : Parser[Any] = id | str | num | bool | list ^^ {case s => s}
}
Answer 0 (score: 3)
As @Gabe correctly pointed out, you left some whitespace unhandled:
scala> object SchemeParsers extends RegexParsers {
|
| private def space = regex("[ \\n]*".r)
|
| // Scheme boolean #t and #f translate to Scala's true and false
| private def bool : Parser[Boolean] =
| ("#t" | "#f") ^^ {case "#t" => true; case "#f" => false}
|
| // A Scheme identifier allows alphanumeric chars, some symbols, and
| // can't start with a digit
| private def id : Parser[String] =
| """[a-zA-Z=*+/<>!\?][a-zA-Z0-9=*+/<>!\?]*""".r
|
| // This interpreter only accepts numbers as integers
| private def num : Parser[Int] = """-?\d+""".r ^^ {case s => s toInt}
|
| // A string can have any character except ", and is wrapped in "
| private def str : Parser[String] = '"' ~> """[^""]*""".r <~ '"' <~ space ^^ {case s => s}
|
| // A Scheme list is a series of expressions wrapped in ()
| private def list : Parser[List[Any]] =
| '(' ~> space ~> rep(expr) <~ ')' <~ space ^^ {s: List[Any] => s}
|
| // A Scheme expression contains any of the other constructions
| private def expr : Parser[Any] = id | str | num | bool | list ^^ {case s => s}
|
| def parseExpr(str: String) = parse(expr, str)
| }
defined module SchemeParsers
scala> SchemeParsers.parseExpr("""(("a" "b") ("a" "b"))""")
res12: SchemeParsers.ParseResult[Any] = [1.22] parsed: List(List(a, b), List(a, b))
scala> SchemeParsers.parseExpr("""("a" "b" "c")""")
res13: SchemeParsers.ParseResult[Any] = [1.14] parsed: List(a, b, c)
scala> SchemeParsers.parseExpr("""((1) (1 2) (1 2 3))""")
res14: SchemeParsers.ParseResult[Any] = [1.20] parsed: List(List(1), List(1, 2), List(1, 2, 3))
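The underlying cause can be reproduced in isolation. The following is a minimal sketch (the object and parser names are made up for illustration, and it assumes the scala-parser-combinators library is on the classpath) that compares a parser built from `Char` literals with one built from `String` literals. In `RegexParsers`, only the latter skips leading whitespace.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical names, for illustration only.
object LiteralSketch extends RegexParsers {
  // Char literals are lifted by the implicit `accept`, which does not skip whitespace.
  def chars: Parser[Any] = 'a' ~ 'b'

  // String literals are lifted by the implicit `literal`, which skips leading whitespace.
  def strings: Parser[Any] = "a" ~ "b"
}
```

With the input `"a b"`, `LiteralSketch.parseAll(LiteralSketch.chars, "a b").successful` should be `false`, because the space is never consumed, while the same call with `LiteralSketch.strings` should be `true`.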
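Answer 1 (score: 1)
The only problem with your code is that you use characters instead of strings. Below, I removed the redundant `^^ { case s => s }` and replaced all characters with strings. I discuss this issue further below.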
class SchemeParsers extends RegexParsers {
// Scheme boolean #t and #f translate to Scala's true and false
def bool : Parser[Boolean] =
("#t" | "#f") ^^ {case "#t" => true; case "#f" => false}
// A Scheme identifier allows alphanumeric chars, some symbols, and
// can't start with a digit
def id : Parser[String] =
"""[a-zA-Z=*+/<>!\?][a-zA-Z0-9=*+/<>!\?]*""".r ^^ {case s => s}
// This interpreter only accepts numbers as integers
def num : Parser[Int] = """-?\d+""".r ^^ {case s => s toInt}
// A string can have any character except ", and is wrapped in "
def str : Parser[String] = "\"" ~> """[^""]*""".r <~ "\""
// A Scheme list is a series of expressions wrapped in ()
def list : Parser[List[Any]] =
"(" ~> rep(expr) <~ ")" ^^ {s: List[Any] => s}
// A Scheme expression contains any of the other constructions
def expr : Parser[Any] = id | str | num | bool | list
}
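All `Parsers` have an implicit `accept` for their `Elem` type. So if the basic element is a `Char`, as in `RegexParsers`, an implicit accept is applied to character literals, which is what happens here for the symbols `(`, `)` and `"`, which are characters in your code. Unlike the string and regex parsers, this `accept` does not skip whitespace.
What `RegexParsers` does automatically is skip whitespace (defined as `protected val whiteSpace = """\s+""".r`, so you can override it) at the beginning of any `String` or `Regex`. It also takes care of moving the positioning cursor past the whitespace when reporting error messages.
One consequence you seem not to have realized is that `" a string beginning with a space"` will have its leading space removed from the parsed output, which is unlikely to be what you want. :-)
Also, since `\s` includes newlines, a newline is accepted before any identifier, which may or may not be what you want.
You can disable whitespace skipping for the parser as a whole by overriding `skipWhitespace`. On the other hand, the default `skipWhitespace` tests the length of `whiteSpace`, so you could potentially turn skipping on and off by manipulating the value of `whiteSpace` during parsing.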
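One way to keep leading spaces inside string literals, without disabling whitespace skipping everywhere, is to match the whole literal with a single regex so that skipping happens only before the opening quote. This is a sketch under that assumption, not part of the original answer; `StringSketch`, `trimmed`, and `preserved` are hypothetical names, and the scala-parser-combinators library is assumed to be on the classpath.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical names, for illustration only.
object StringSketch extends RegexParsers {
  // Quotes and body are separate parsers, so whitespace is also skipped
  // before the body regex and the leading space is lost.
  def trimmed: Parser[String] = "\"" ~> """[^"]*""".r <~ "\""

  // The whole literal is one regex, so whitespace is skipped only before
  // the opening quote; the surrounding quotes are then stripped by hand.
  def preserved: Parser[String] =
    "\"[^\"]*\"".r ^^ { s => s.substring(1, s.length - 1) }
}
```

For the input `" a b"` (quotes included), `StringSketch.parseAll(StringSketch.trimmed, "\" a b\"").get` should yield `a b`, while the `preserved` parser should yield ` a b` with the leading space intact.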