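I have a bunch of expressions in prefix notation in an ANSI text file. I would like to produce another ANSI text file containing the step-by-step evaluation of these expressions. For example: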
- + ^ x 2 ^ y 2 1
should be turned into
t1 = x^2
t2 = y^2
t3 = t1 + t2
t4 = t3 - 1
t4 is the result
I also have to identify common subexpressions. For example, given
expression_1: z = ^ x 2
expression_2: - + z ^ y 2 1
expression_3: - z y
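I have to generate output saying that x appears in expressions 1, 2 and 3 (through z).
I have to identify dependencies: expression_1 depends only on x, expression_2 depends on x and y, and so on.
The original problem is more difficult than the examples above, and I have no control over the input format; it is in prefix notation, but in a much more complicated form than shown here.
I already have a working implementation in C++, but doing this kind of thing in C++ is painful.
Which programming language is best suited for this type of problem?
Can you recommend a tutorial / website / book where I could start?
What keywords should I look for?
Update: based on the answers, the examples above are somewhat unfortunate. I have unary, binary and n-ary operators in the input. (If you are wondering, exp is a unary operator and sum over a range is an n-ary operator.)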
Answer 0 (score: 5)
To give you an idea of how this would look in Python, here is some example code:
operators = "+-*/^"

def parse(it, count=1):
    # Consume one prefix expression from the token iterator.
    # Returns (name of the value, next free temporary number).
    token = next(it)
    if token in operators:
        op1, count = parse(it, count)
        op2, count = parse(it, count)
        tmp = "t%s" % count
        print(tmp, "=", op1, token, op2)
        return tmp, count + 1
    return token, count

s = "- + ^ x 2 ^ y 2 1"
a = s.split()
res, dummy = parse(iter(a))
print(res, "is the result")
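The output is the same as your example output (apart from the spaces around the operators).
This example aside, I think any of the high-level languages you listed are almost equally suited for the task.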
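The update to the question mentions unary and n-ary operators. A minimal sketch of how the same recursive idea could cover them is shown below. It assumes a hypothetical arity table (ARITY) and assumes that sum is followed by its argument count, as in the "sum 3 ..." input used in a later answer; the real input format may differ, and emit_steps is an illustrative name only.

```python
# Hypothetical arity table: "exp" is unary, the symbolic operators are binary.
ARITY = {"+": 2, "-": 2, "*": 2, "/": 2, "^": 2, "exp": 1}

def emit_steps(tokens):
    """Return (result_name, steps) for one prefix expression."""
    steps = []
    it = iter(tokens)

    def parse():
        token = next(it, None)
        if token is None:
            raise ValueError("unexpected end of input")
        if token == "sum":
            # Assumed encoding: "sum" is followed by its argument count.
            n = int(next(it))
            args = [parse() for _ in range(n)]
            text = "sum(%s)" % ", ".join(args)
        elif token in ARITY:
            args = [parse() for _ in range(ARITY[token])]
            if len(args) == 1:
                text = "%s(%s)" % (token, args[0])
            else:
                text = "%s %s %s" % (args[0], token, args[1])
        else:
            return token  # variable or literal, no temporary needed
        # Operands are numbered first, so the next free number is len(steps) + 1.
        name = "t%d" % (len(steps) + 1)
        steps.append("%s = %s" % (name, text))
        return name

    result = parse()
    leftover = list(it)
    if leftover:
        raise ValueError("unparsed tokens: %s" % " ".join(leftover))
    return result, steps

result, steps = emit_steps("sum 3 exp x ^ y 2 1".split())
for line in steps:
    print(line)
print(result, "is the result")
```

Collecting the steps in a list instead of printing them directly makes it easier to write them to the output file later or to post-process them.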
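Answer 1 (score: 4)
The sympy Python package does symbolic algebra, including common subexpression elimination and generating evaluation steps for a set of expressions.
See: http://docs.sympy.org/dev/modules/rewriting.html (look at the cse method at the bottom of the page).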
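To make the idea of common subexpression elimination concrete for the prefix expressions in the question, here is a small plain-Python illustration. It is not SymPy's cse and does not use SymPy at all. The names parse_prefix and cse_steps are hypothetical, only binary operators and well-formed input are assumed, and matching is purely structural (x + y and y + x would count as different subexpressions).

```python
OPS = {"+", "-", "*", "/", "^"}

def parse_prefix(tokens):
    """Build a nested tuple (op, left, right) from binary prefix tokens."""
    it = iter(tokens)

    def node():
        tok = next(it)
        if tok in OPS:
            left = node()
            right = node()
            return (tok, left, right)
        return tok  # variable or literal

    return node()

def cse_steps(expressions):
    """Give each distinct subexpression one temporary, shared across expressions."""
    names = {}    # (op, left_name, right_name) -> temporary name
    steps = []    # "tN = a op b" lines in evaluation order
    results = []  # name holding the value of each input expression

    def visit(tree):
        if isinstance(tree, str):
            return tree
        op, left, right = tree
        key = (op, visit(left), visit(right))
        if key not in names:
            names[key] = "t%d" % (len(names) + 1)
            steps.append("%s = %s %s %s" % (names[key], key[1], op, key[2]))
        return names[key]

    for text in expressions:
        results.append(visit(parse_prefix(text.split())))
    return steps, results

steps, results = cse_steps(["- + ^ x 2 ^ y 2 1", "- ^ x 2 y"])
for line in steps:
    print(line)
print(results)
```

Here x ^ 2 is computed once as t1 and reused by the second expression. A library such as SymPy would also handle algebraic equivalences that this structural lookup misses.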
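Answer 2 (score: 3)
The Python example is elegantly short, but I suspect that it does not give you enough control over your expressions. You are much better off actually building an expression tree, even though it takes more work, and then querying the tree. Here is an example in Scala (suitable for cutting and pasting into the REPL):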
object OpParser {
  private def estr(oe: Option[Expr]) = oe.map(_.toString).getOrElse("_")
  case class Expr(text: String, left: Option[Expr] = None, right: Option[Expr] = None) {
    import Expr._
    // Variables referenced anywhere in this subtree
    def varsUsed: Set[String] = text match {
      case Variable(v) => Set(v)
      case Op(o) =>
        left.map(_.varsUsed).getOrElse(Set()) ++ right.map(_.varsUsed).getOrElse(Set())
      case _ => Set()
    }
    // Prints one temporary per operator; returns (name, last temporary number used)
    def printTemp(n: Int = 0, depth: Int = 0): (String,Int) = text match {
      case Op(o) =>
        val (tl,nl) = left.map(_.printTemp(n,depth+1)).getOrElse(("_",n))
        // Fall back to nl (not n) so numbering continues after the left subtree
        val (tr,nr) = right.map(_.printTemp(nl,depth+1)).getOrElse(("_",nl))
        val t = "t"+(nr+1)
        println(t + " = " + tl + " " + text + " " + tr)
        if (depth==0) println(t + " is the result")
        (t, nr+1)
      case _ => (text, n)
    }
    // Infix rendering of the tree
    override def toString: String = {
      if (left.isDefined || right.isDefined) {
        "(" + estr(left) + " " + text + " " + estr(right) + ")"
      }
      else text
    }
  }
  object Expr {
    val Digit = "([0-9]+)".r
    val Variable = "([a-z])".r
    val Op = """([+\-*/^])""".r
    // Parses a whole line, optionally prefixed by "v ="; None on any error
    def parse(s: String) = {
      val bits = s.split(" ")
      val parsed = (
        if (bits.length > 2 && Variable.unapplySeq(bits(0)).isDefined && bits(1)=="=") {
          parseParts(bits,2)
        }
        else parseParts(bits)
      )
      parsed.flatMap(p => if (p._2<bits.length) None else Some(p._1))
    }
    // Parses one expression starting at index from; returns it with the next index
    def parseParts(as: Array[String], from: Int = 0): Option[(Expr,Int)] = {
      if (from >= as.length) None
      else as(from) match {
        case Digit(n) => Some((Expr(n), from+1))
        case Variable(v) => Some((Expr(v), from+1))
        case Op(o) =>
          parseParts(as, from+1).flatMap(lhs =>
            parseParts(as, lhs._2).map(rhs => (Expr(o,Some(lhs._1),Some(rhs._1)), rhs._2))
          )
        case _ => None
      }
    }
  }
}
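This may be a bit much to digest all at once, but then again, it does rather a lot.
Firstly, it is completely bulletproof (note the heavy use of Option wherever a result might fail). If you throw garbage at it, it will just return None. (With a bit more work, you could make it complain about the problem in an informative way. Basically, the case Op(o) branch, which calls parseParts nested twice, could instead store the intermediate results and print an informative error message if the op did not get two arguments. Likewise, parse could complain about trailing values instead of just throwing back None.)
Secondly, when you are done with it, you have a complete expression tree. Note that printTemp prints out the temporary variables you wanted, and varsUsed lists the variables used in a particular expression, which you can use to expand to a full list once you parse multiple lines. (You might need to fiddle with the regexp a little if your variables can be more than just a to z.) Note also that the expression tree prints itself out in normal infix notation. Let's look at some examples: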
scala> OpParser.Expr.parse("4")
res0: Option[OpParser.Expr] = Some(4)
scala> OpParser.Expr.parse("+ + + + + 1 2 3 4 5 6")
res1: Option[OpParser.Expr] = Some((((((1 + 2) + 3) + 4) + 5) + 6))
scala> OpParser.Expr.parse("- + ^ x 2 ^ y 2 1")
res2: Option[OpParser.Expr] = Some((((x ^ 2) + (y ^ 2)) - 1))
scala> OpParser.Expr.parse("+ + 4 4 4 4") // Too many 4s!
res3: Option[OpParser.Expr] = None
scala> OpParser.Expr.parse("Q#$S!M$#!*)000") // Garbage!
res4: Option[OpParser.Expr] = None
scala> OpParser.Expr.parse("z =") // Assigned nothing?!
res5: Option[OpParser.Expr] = None
scala> res2.foreach(_.printTemp())
t1 = x ^ 2
t2 = y ^ 2
t3 = t1 + t2
t4 = t3 - 1
t4 is the result
scala> res2.map(_.varsUsed)
res10: Option[Set[String]] = Some(Set(x, y))
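Now, you could do this in Python too without much additional work, and in a number of other languages besides. I prefer using Scala, but you may prefer otherwise. Regardless, I do recommend creating the full expression tree if you want to retain maximum flexibility for handling tricky cases.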
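As a rough illustration of the claim that the tree approach carries over to Python, here is a minimal sketch of the same design. It is a loose port, not a line-by-line translation: the names (Expr, parse_parts, vars_used) mirror the Scala code, None plays the role of Option's None, single lowercase letters are assumed to be variables, and the "z =" assignment prefix and the temporary printing are left out to keep it short.

```python
import re
from dataclasses import dataclass
from typing import Optional

OP = re.compile(r"[+\-*/^]")
VARIABLE = re.compile(r"[a-z]")
DIGITS = re.compile(r"[0-9]+")

@dataclass(frozen=True)
class Expr:
    text: str
    left: Optional["Expr"] = None
    right: Optional["Expr"] = None

    def vars_used(self):
        """Set of variable names appearing anywhere in this subtree."""
        if self.left is None:
            return {self.text} if VARIABLE.fullmatch(self.text) else set()
        return self.left.vars_used() | self.right.vars_used()

    def __str__(self):
        # Infix rendering, fully parenthesized.
        if self.left is None:
            return self.text
        return "(%s %s %s)" % (self.left, self.text, self.right)

def parse_parts(tokens, start=0):
    """Return (Expr, next_index), or None when the input is malformed."""
    if start >= len(tokens):
        return None
    tok = tokens[start]
    if DIGITS.fullmatch(tok) or VARIABLE.fullmatch(tok):
        return Expr(tok), start + 1
    if OP.fullmatch(tok):
        lhs = parse_parts(tokens, start + 1)
        if lhs is None:
            return None
        rhs = parse_parts(tokens, lhs[1])
        if rhs is None:
            return None
        return Expr(tok, lhs[0], rhs[0]), rhs[1]
    return None

def parse(text):
    """Parse a whole line; None for garbage or trailing tokens."""
    tokens = text.split()
    parsed = parse_parts(tokens)
    if parsed is None or parsed[1] < len(tokens):
        return None
    return parsed[0]

tree = parse("- + ^ x 2 ^ y 2 1")
print(tree)
print(sorted(tree.vars_used()))
```

Once the tree exists, further queries (depth, operator counts, substitution) are additional small methods on Expr rather than changes to the parser.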
Answer 3 (score: 2)
Prefix notation is really simple to handle with a plain recursive parser. For instance:
object Parser {
  val Subexprs = collection.mutable.Map[String, String]()
  val Dependencies = collection.mutable.Map[String, Set[String]]().withDefaultValue(Set.empty)
  val TwoArgsOp = "([-+*/^])".r // - at the beginning, ^ at the end
  val Ident = "(\\p{Alpha}\\w*)".r
  val Literal = "(\\d+)".r
  var counter = 1
  // Next temporary name: t1, t2, ...
  def getIdent = {
    val ident = "t" + counter
    counter += 1
    ident
  }
  // Parses both operands, then records the new temporary and its dependencies
  def makeOp(op: String) = {
    val op1 = expr
    val op2 = expr
    val ident = getIdent
    val subexpr = op1 + " " + op + " " + op2
    Subexprs(ident) = subexpr
    Dependencies(ident) = Dependencies(op1) ++ Dependencies(op2) + op1 + op2
    ident
  }
  def expr: String = nextToken match {
    case TwoArgsOp(op) => makeOp(op)
    case Ident(id) => id
    case Literal(lit) => lit
    case x => sys.error("Unknown token "+x)
  }
  def nextToken = tokens.next()
  var tokens: Iterator[String] = _
  def parse(input: String) = {
    tokens = input.trim.split("\\s+").iterator
    counter = 1
    expr
    if (tokens.hasNext)
      sys.error("Input not fully parsed: "+tokens.mkString(" "))
    (Subexprs, Dependencies)
  }
}
This generates output like the following:
scala> val (subexpressions, dependencies) = Parser.parse("- + ^ x 2 ^ y 2 1")
subexpressions: scala.collection.mutable.Map[String,String] = Map(t3 -> t1 + t2, t4 -> t3 - 1, t1 -> x ^ 2, t2 -> y ^ 2)
dependencies: scala.collection.mutable.Map[String,Set[String]] = Map(t3 -> Set(x, y, t2, 2, t1), t4 -> Set(x, y, t3, t2, 1, 2, t1), t1 -> Set(x, 2), t2 -> Set(y, 2))
scala> subexpressions.toSeq.sorted foreach println
(t1,x ^ 2)
(t2,y ^ 2)
(t3,t1 + t2)
(t4,t3 - 1)
scala> dependencies.toSeq.sortBy(_._1) foreach println
(t1,Set(x, 2))
(t2,Set(y, 2))
(t3,Set(x, y, t2, 2, t1))
(t4,Set(x, y, t3, t2, 1, 2, t1))
This can easily be extended. For instance, to handle multiple expression statements you can use this:
object Parser {
  val Subexprs = collection.mutable.Map[String, String]()
  val Dependencies = collection.mutable.Map[String, Set[String]]().withDefaultValue(Set.empty)
  val TwoArgsOp = "([-+*/^])".r // - at the beginning, ^ at the end
  val Ident = "(\\p{Alpha}\\w*)".r
  val Literal = "(\\d+)".r
  var counter = 1
  // Next temporary name: t1, t2, ...
  def getIdent = {
    val ident = "t" + counter
    counter += 1
    ident
  }
  // Parses both operands, then records the new temporary and its dependencies
  def makeOp(op: String) = {
    val op1 = expr
    val op2 = expr
    val ident = getIdent
    val subexpr = op1 + " " + op + " " + op2
    Subexprs(ident) = subexpr
    Dependencies(ident) = Dependencies(op1) ++ Dependencies(op2) + op1 + op2
    ident
  }
  def expr: String = nextToken match {
    case TwoArgsOp(op) => makeOp(op)
    case Ident(id) => id
    case Literal(lit) => lit
    case x => sys.error("Unknown token "+x)
  }
  // "name = expr": re-registers the last temporary under the assigned name
  def assignment: Unit = {
    val ident = nextToken
    nextToken match {
      case "=" =>
        val tmpIdent = expr
        Dependencies(ident) = Dependencies(tmpIdent)
        Subexprs(ident) = Subexprs(tmpIdent)
        Dependencies.remove(tmpIdent)
        Subexprs.remove(tmpIdent)
      case x => sys.error("Expected assignment, got "+x)
    }
  }
  // Peeks at the first token of each statement to choose the rule
  def stmts: Unit = while(tokens.hasNext) tokens.head match {
    case TwoArgsOp(_) => expr
    case Ident(_) => assignment
    case x => sys.error("Unknown statement starting with "+x)
  }
  def nextToken = tokens.next()
  var tokens: BufferedIterator[String] = _
  def parse(input: String) = {
    tokens = input.trim.split("\\s+").iterator.buffered
    counter = 1
    stmts
    if (tokens.hasNext)
      sys.error("Input not fully parsed: "+tokens.mkString(" "))
    (Subexprs, Dependencies)
  }
}
Which yields:
scala> val (subexpressions, dependencies) = Parser.parse("""
| z = ^ x 2
| - + z ^ y 2 1
| - z y
| """)
subexpressions: scala.collection.mutable.Map[String,String] = Map(t3 -> z + t2, t5 -> z - y, t4 -> t3 - 1, z -> x ^ 2, t2 -> y ^ 2)
dependencies: scala.collection.mutable.Map[String,Set[String]] = Map(t3 -> Set(x, y, t2, 2, z), t5 -> Set(x, 2, z, y), t4 -> Set(x, y, t3, t2, 1, 2, z), z -> Set(x, 2), t2 -> Set(y, 2))
scala> subexpressions.toSeq.sorted foreach println
(t2,y ^ 2)
(t3,z + t2)
(t4,t3 - 1)
(t5,z - y)
(z,x ^ 2)
scala> dependencies.toSeq.sortBy(_._1) foreach println
(t2,Set(y, 2))
(t3,Set(x, y, t2, 2, z))
(t4,Set(x, y, t3, t2, 1, 2, z))
(t5,Set(x, 2, z, y))
(z,Set(x, 2))
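The question also asks for a report such as "x appears in expressions 1, 2 and 3 (through z)". The dependency sets above still mix literals and temporaries, so here is a hedged Python sketch of one way to reduce each statement to the input variables behind it and then invert the map. The function names are hypothetical, and the sketch assumes that operators are purely symbolic (a named operator such as exp or sum would need to be filtered out) and that a name is assigned before it is used.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")

def base_variables(statements):
    """Map each statement number to the input variables it depends on."""
    defined = {}   # assigned name -> input variables behind it
    per_stmt = {}
    for number, text in enumerate(statements, start=1):
        tokens = text.split()
        target = None
        if len(tokens) > 2 and tokens[1] == "=":
            target, tokens = tokens[0], tokens[2:]
        deps = set()
        for tok in tokens:
            if IDENT.fullmatch(tok):
                # Expand names assigned earlier; anything else is an input.
                deps |= defined.get(tok, {tok})
        per_stmt[number] = deps
        if target is not None:
            defined[target] = deps
    return per_stmt

def occurrences(per_stmt):
    """Invert the map: variable -> statement numbers that depend on it."""
    index = {}
    for number, deps in per_stmt.items():
        for var in deps:
            index.setdefault(var, []).append(number)
    return index

per_stmt = base_variables(["z = ^ x 2", "- + z ^ y 2 1", "- z y"])
index = occurrences(per_stmt)
for var in sorted(index):
    print(var, "appears in expressions", index[var])
```

Because z is expanded to the variables it was built from, x is reported for expressions 2 and 3 even though it never appears there literally.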
Answer 4 (score: 2)
OK, since recursive parsers are not your thing, here is an alternative using parser combinators:
import scala.util.parsing.combinator.JavaTokenParsers
object PrefixParser extends JavaTokenParsers {
  import scala.collection.mutable
  // Maps generated through parsing
  val Subexprs = mutable.Map[String, String]()
  val Dependencies = mutable.Map[String, Set[String]]().withDefaultValue(Set.empty)
  // Initialize, read, parse & evaluate string
  def read(input: String) = {
    counter = 1
    Subexprs.clear()
    Dependencies.clear()
    parseAll(stmts, input)
  }
  // Grammar
  def stmts = rep1(stmt)
  def stmt = assignment | expr
  def assignment = (ident <~ "=") ~ expr ^^ assignOp
  def expr: P = subexpr | identifier | number
  def subexpr: P = twoArgs | nArgs
  def twoArgs: P = operator ~ expr ~ expr ^^ twoArgsOp
  def nArgs: P = "sum" ~ ("\\d+".r >> args) ^^ nArgsOp
  def args(n: String): Ps = repN(n.toInt, expr)
  def operator = "[-+*/^]".r
  def identifier = ident ^^ (id => Result(id, Set(id)))
  def number = wholeNumber ^^ (Result(_, Set.empty))
  // Evaluation helper class and types
  case class Result(ident: String, dependencies: Set[String])
  type P = Parser[Result]
  type Ps = Parser[List[Result]]
  // Evaluation methods
  def assignOp: (String ~ Result) => Result = {
    case ident ~ result =>
      val value = assign(ident,
                         Subexprs(result.ident),
                         result.dependencies - result.ident)
      Subexprs.remove(result.ident)
      Dependencies.remove(result.ident)
      value
  }
  def assign(ident: String,
             value: String,
             dependencies: Set[String]): Result = {
    Subexprs(ident) = value
    Dependencies(ident) = dependencies
    Result(ident, dependencies)
  }
  def twoArgsOp: (String ~ Result ~ Result) => Result = {
    case op ~ op1 ~ op2 => makeOp(op, op1, op2)
  }
  def makeOp(op: String,
             op1: Result,
             op2: Result): Result = {
    val ident = getIdent
    assign(ident,
           "%s %s %s" format (op1.ident, op, op2.ident),
           op1.dependencies ++ op2.dependencies + ident)
  }
  def nArgsOp: (String ~ List[Result]) => Result = {
    case op ~ ops => makeNOp(op, ops)
  }
  def makeNOp(op: String, ops: List[Result]): Result = {
    val ident = getIdent
    assign(ident,
           "%s(%s)" format (op, ops map (_.ident) mkString ", "),
           ops.foldLeft(Set(ident))(_ ++ _.dependencies))
  }
  var counter = 1
  def getIdent = {
    val ident = "t" + counter
    counter += 1
    ident
  }
  // Debugging helper methods
  def printAssignments = Subexprs.toSeq.sorted foreach println
  def printDependencies = Dependencies.toSeq.sortBy(_._1) map {
    case (id, dependencies) => (id, dependencies - id)
  } foreach println
}
Here is the kind of result you get:
scala> PrefixParser.read("""
| z = ^ x 2
| - + z ^ y 2 1
| - z y
| """)
res77: PrefixParser.ParseResult[List[PrefixParser.Result]] = [5.1] parsed: List(Result(z,Set(x)), Result(t4,Set(t4, y, t3, t2, z)), Result(t5,Set(z, y, t5)))
scala> PrefixParser.printAssignments
(t2,y ^ 2)
(t3,z + t2)
(t4,t3 - 1)
(t5,z - y)
(z,x ^ 2)
scala> PrefixParser.printDependencies
(t2,Set(y))
(t3,Set(z, y, t2))
(t4,Set(y, t3, t2, z))
(t5,Set(z, y))
(z,Set(x))
n-ary operator
scala> PrefixParser.read("""
| x = sum 3 + 1 2 * 3 4 5
| * x x
| """)
res93: PrefixParser.ParseResult[List[PrefixParser.Result]] = [4.1] parsed: List(Result(x,Set(t1, t2)), Result(t4,Set(x, t4)))
scala> PrefixParser.printAssignments
(t1,1 + 2)
(t2,3 * 4)
(t4,x * x)
(x,sum(t1, t2, 5))
scala> PrefixParser.printDependencies
(t1,Set())
(t2,Set())
(t4,Set(x))
(x,Set(t1, t2))
Answer 5 (score: 1)
Answer 6 (score: 1)
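The question involves two subproblems: parsing and symbolic manipulation. It seems to me that the answers boil down to two possible solutions.
One is to implement everything from scratch: "I do recommend creating the full expression tree if you want to retain maximum flexibility for handling tricky cases" (proposed by Rex). As Sven points out, "any of the high-level languages you listed are almost equally suited for the task"; however, "Python (or any of the high-level languages you listed) won't take away the complexity of the problem."
I received very nice solutions in Scala (many thanks to Rex and Daniel) and a nice little example in Python (from Sven). However, I am still interested in Lisp, Haskell or Erlang solutions.
The other solution is to use an existing library or piece of software for the task, with all the implied pros and cons. The candidates are Maxima (Common Lisp), SymPy (Python, proposed by payne) and GiNaC (C++).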