python中的小语言

时间:2011-06-14 02:20:14

标签: python parsing programming-languages operators postfix-notation

我正在编写甚至可能在python中称为语言的内容。我目前有几个运营商:+-*^fac@!!fac计算一个阶乘,@返回变量的值,!!设置一个变量。代码如下。我将如何编写一种用这种简单语言定义函数的方法?

编辑:我更新了代码!

import sys, shlex, readline, os, string
List, assign, call, add, sub, div, Pow, mul, mod, fac, duf, read,\
kill, clr, STO, RET, fib, curs = {}, "set", "get", "+", "-", "/", "^", "*",\
"%", "fact", "func", "read", "kill", "clear", ">", "@", "fib", "vars"
def fact(num):
    if num == 1: return 1
    else: return num*fact(num-1)
def Simp(op, num2, num1):
    global List
    try: num1, num2 = float(num1), float(num2)
    except:
        try: num1 = float(num1)
        except:
            try: num2 = float(num2)
            except: pass
    if op == mul: return num1*num2
    elif op == div: return num1/num2
    elif op == sub: return num1-num2
    elif op == add: return num1+num2
    elif op == Pow: return num1**num2
    elif op == assign: List[num1] = num2; return "ok"
    elif op == call: return List[num1]
    elif op == fac: return fact(num1)
    elif op == duf: return "%s %s %s"%(duf, num1, num2)
    elif op == mod: return num1%num2
    elif op == kill: del List[num1]; return "ok"
    elif op == clr: os.system("clear")
    elif op == STO: List[num2] = num1; return "ok"
    elif op == RET: return List[num1]
    elif op == curs: return List
    elif op == read: List[num1] = Eval(raw_input("%s "%num1)); return "ok"
def Eval(expr):
    ops = "%s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s"%(mul, add, sub, div, Pow, assign, call, fac, duf, mod, read, kill, clr, STO, RET, curs)
    stack, expr, ops = [], shlex.split(string.lower(expr)), ops.split()
    for i in expr:
        if i[0] != ';':
            if i not in ops: stack.append(i)
            elif i in ops: stack.append(Simp(i, stack.pop(), stack.pop()))
        else: stack.append("ok")
    return stack[0]
def shell():
    try:
        x = ""
        while x != "quit":
            x = raw_input("star>   ")
            try: l = Eval(x)
            except KeyError: l = "does not exist"
            except: l = "parse error!"
            if l != None: print "   =>",l,"\n"
    except (EOFError, KeyboardInterrupt): print
if len(sys.argv) > 1:
    x = open(sys.argv[1], 'r'); l = x.readlines(); x.close()
    for i in l:
        if i[0] != ";":
            i = ' '.join(i.split())
            x = Eval(i)
            if x != None: print i,"\n   =>",x,"\n"
        else: pass
    shell()
else: shell()

5 个答案:

答案 0 :(得分:7)

您的程序非常混乱,需要先修复它才能修改它以支持定义函数。我将分几步完成这项工作,当我完成它们时,我会将它们添加到答案中。这个答案将会很长。

另外,你显然还没有决定你的语言定义应该是什么。你决定让你的语言定义遵循你的实现技术,这有点破碎,并导致很多痛苦。

首先,您的Simp函数的定义确实被破坏了。它要求所有内容都从堆栈中取出两个值,然后重新放入一个值。这已破了。阶乘函数不能以这种方式工作,Fibonacci函数也不起作用,所以你被迫拥有一个从未使用过的“虚拟”参数。此外,分配给全局列表或字典的元素之类的东西没有理由将值推送到堆栈,因此您只需要按“确定”即可。这已经破裂,需要修复。

以下是修复此问题的版本。请注意,我已将Simp重命名为builtin_op,以便更准确地反映其目的:

import sys, shlex, readline, os, string
List, assign, call, add, sub, div, Pow, mul, mod, fac, duf, read,\
kill, clr, STO, RET, fib, curs = {}, "set", "get", "+", "-", "/", "^", "*",\
"%", "fact", "func", "read", "kill", "clear", ">", "@", "fib", "vars"
def fact(num):
    if num == 1: return 1
    else: return num*fact(num-1)
def builtin_op(op, stack):
    global List
    if op == mul: stack.append(float(stack.pop())*float(stack.pop()))
    elif op == div: stack.append(float(stack.pop())/float(stack.pop()))
    elif op == sub: stack.append(float(stack.pop())-float(stack.pop()))
    elif op == add: stack.append(float(stack.pop())+float(stack.pop()))
    elif op == Pow: stack.append(float(stack.pop())**float(stack.pop()))
    elif op == assign: val = List[stack.pop()] = stack.pop(); stack.append(val)
    elif op == call: stack.append(List[stack.pop()])
    elif op == fac: stack.append(fact(stack.pop()))
    elif op == duf: stack.append("%s %s %s" % (duf, stack.pop(), stack.pop()))
    elif op == mod: stack.append(float(stack.pop())%float(stack.pop()))
    elif op == kill: del List[stack.pop()]
    elif op == clr: os.system("clear")
    elif op == STO: val = List[stack.pop()] = stack.pop(); stack.append(val)
    elif op == RET: stack.append(List[stack.pop()])
    elif op == curs: stack.append(List)
    elif op == read: prompt = stack.pop(); List[prompt] = Eval(raw_input("%s "%prompt)); stack.append(List[prompt])
def Eval(expr):
    ops = "%s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s"%(mul, add, sub, div, Pow, assign, call, fac, duf, mod, read, kill, clr, STO, RET, curs)
    stack, expr, ops = [], shlex.split(string.lower(expr)), ops.split()
    for i in expr:
        if i[0] != ';':
            if i not in ops: stack.append(i)
            elif i in ops: builtin_op(i, stack)
        else: stack.append("ok")
    return stack[0]
def shell():
    try:
        x = ""
        while x != "quit":
            x = raw_input("star>   ")
            try: l = Eval(x)
            except KeyError: l = "does not exist"
            except: l = "parse error!"
            if l != None: print "   =>",l,"\n"
    except (EOFError, KeyboardInterrupt): print
if len(sys.argv) > 1:
    x = open(sys.argv[1], 'r'); l = x.readlines(); x.close()
    for i in l:
        if i[0] != ";":
            i = ' '.join(i.split())
            x = Eval(i)
            if x != None: print i,"\n   =>",x,"\n"
        else: pass
    shell()
else: shell()

这里仍有许多问题没有修复,我将不会修复任何未来的版本。例如,堆栈上的值可能无法解释为浮点数。这将导致异常,并且在从堆栈读取另一个值之前可能抛出此异常。这意味着如果堆栈上存在错误的“类型”,则在“解析错误”之后堆栈可能处于模糊状态。一般来说,你想避免使用某种语言的情况。

定义函数是一个有趣的问题。用您的语言,即时评估。您没有延迟评估的机制,直到稍后。但是,您正在使用shlex模块进行解析。它有一种方式可以说一组完整的字符(包括空格等)是一个实体的一部分。这为我们提供了一种快速简便的实现功能的方法。你可以这样做:

star>   "3 +" add3 func

创建你的功能,并且:

star>   2 add3 get

来称呼它。我使用了get,因为这是您在程序中分配给call的内容。

唯一的问题是该函数需要访问堆栈的当前状态才能工作。您可以轻松地将函数的字符串反馈到Eval,但Eval每次调用时都会创建一个全新的堆栈。为了实现功能,需要修复它。所以我在stack函数中添加了一个默认的Eval参数。如果此参数保留为其默认值,Eval仍将创建一个新堆栈,就像之前一样。但是如果传入现有堆栈,Eval将使用它。

以下是修改后的代码:

import sys, shlex, readline, os, string
List, assign, call, add, sub, div, Pow, mul, mod, fac, duf, read,\
kill, clr, STO, RET, fib, curs = {}, "set", "get", "+", "-", "/", "^", "*",\
"%", "fact", "func", "read", "kill", "clear", ">", "@", "fib", "vars"
funcdict = {}
def fact(num):
    if num == 1: return 1
    else: return num*fact(num-1)
def builtin_op(op, stack):
    global List
    global funcdict
    if op == mul: stack.append(float(stack.pop())*float(stack.pop()))
    elif op == div: stack.append(float(stack.pop())/float(stack.pop()))
    elif op == sub: stack.append(float(stack.pop())-float(stack.pop()))
    elif op == add: stack.append(float(stack.pop())+float(stack.pop()))
    elif op == Pow: stack.append(float(stack.pop())**float(stack.pop()))
    elif op == assign: val = List[stack.pop()] = stack.pop(); stack.append(val)
    elif op == call: Eval(funcdict[stack.pop()], stack)
    elif op == fac: stack.append(fact(stack.pop()))
    elif op == duf: name = stack.pop(); funcdict[name] = stack.pop(); stack.append(name)
    elif op == mod: stack.append(float(stack.pop())%float(stack.pop()))
    elif op == kill: del List[stack.pop()]
    elif op == clr: os.system("clear")
    elif op == STO: val = List[stack.pop()] = stack.pop(); stack.append(val)
    elif op == RET: stack.append(List[stack.pop()])
    elif op == curs: stack.append(List)
    elif op == read: prompt = stack.pop(); List[prompt] = Eval(raw_input("%s "%prompt)); stack.append(List[prompt])
def Eval(expr, stack=None):
    ops = "%s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s"%(mul, add, sub, div, Pow, assign, call, fac, duf, mod, read, kill, clr, STO, RET, curs)
    if stack is None:
        stack = []
    expr, ops = shlex.split(string.lower(expr)), ops.split()
    for i in expr:
        if i[0] != ';':
            if i not in ops: stack.append(i)
            elif i in ops: builtin_op(i, stack)
        else: stack.append("ok")
    return stack[0]
def shell():
    try:
        x = ""
        while x != "quit":
            x = raw_input("star>   ")
            try: l = Eval(x)
            except KeyError: l = "does not exist"
            except: l = "parse error!"
            if l != None: print "   =>",l,"\n"
    except (EOFError, KeyboardInterrupt): print
if len(sys.argv) > 1:
    x = open(sys.argv[1], 'r'); l = x.readlines(); x.close()
    for i in l:
        if i[0] != ";":
            i = ' '.join(i.split())
            x = Eval(i)
            if x != None: print i,"\n   =>",x,"\n"
        else: pass
    shell()
else: shell()

在基于堆栈的语言中,两个非常有用的内置运算符是dupswapdup获取顶部堆栈元素并复制它。 swap交换前两个堆栈元素。

如果您有dup,则可以实现square功能,如下所示:

star>   "dup *" square func

以下是您的计划dupswap已实施:

import sys, shlex, readline, os, string
List, assign, call, add, sub, div, Pow, mul, mod, fac, duf, read,\
kill, clr, STO, RET, fib, curs, dup, swap = {}, "set", "get", "+", "-", "/", "^", "*",\
"%", "fact", "func", "read", "kill", "clear", ">", "@", "fib", "vars", "dup", "swap"
funcdict = {}
def fact(num):
    if num == 1: return 1
    else: return num*fact(num-1)
def builtin_op(op, stack):
    global List
    global funcdict
    if op == mul: stack.append(float(stack.pop())*float(stack.pop()))
    elif op == div: stack.append(float(stack.pop())/float(stack.pop()))
    elif op == sub: stack.append(float(stack.pop())-float(stack.pop()))
    elif op == add: stack.append(float(stack.pop())+float(stack.pop()))
    elif op == Pow: stack.append(float(stack.pop())**float(stack.pop()))
    elif op == assign: val = List[stack.pop()] = stack.pop(); stack.append(val)
    elif op == call: Eval(funcdict[stack.pop()], stack)
    elif op == fac: stack.append(fact(stack.pop()))
    elif op == duf: name = stack.pop(); funcdict[name] = stack.pop(); stack.append(name)
    elif op == mod: stack.append(float(stack.pop())%float(stack.pop()))
    elif op == kill: del List[stack.pop()]
    elif op == clr: os.system("clear")
    elif op == STO: val = List[stack.pop()] = stack.pop(); stack.append(val)
    elif op == RET: stack.append(List[stack.pop()])
    elif op == curs: stack.append(List)
    elif op == dup: val = stack.pop(); stack.append(val); stack.append(val)
    elif op == swap: val1 = stack.pop(); val2 = stack.pop(); stack.append(val1); stack.append(val2)
    elif op == read: prompt = stack.pop(); List[prompt] = Eval(raw_input("%s "%prompt)); stack.append(List[prompt])
def Eval(expr, stack=None):
    ops = "%s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s %s"%(mul, add, sub, div, Pow, assign, call, fac, duf, mod, read, kill, clr, STO, RET, curs, dup, swap)
    if stack is None:
        stack = []
    expr, ops = shlex.split(string.lower(expr)), ops.split()
    for i in expr:
        if i[0] != ';':
            if i not in ops: stack.append(i)
            elif i in ops: builtin_op(i, stack)
        else: stack.append("ok")
    return stack[0]
def shell():
    try:
        x = ""
        while x != "quit":
            x = raw_input("star>   ")
            try: l = Eval(x)
            except KeyError: l = "does not exist"
            except: l = "parse error!"
            if l != None: print "   =>",l,"\n"
    except (EOFError, KeyboardInterrupt): print
if len(sys.argv) > 1:
    x = open(sys.argv[1], 'r'); l = x.readlines(); x.close()
    for i in l:
        if i[0] != ";":
            i = ' '.join(i.split())
            x = Eval(i)
            if x != None: print i,"\n   =>",x,"\n"
        else: pass
    shell()
else: shell()

最后,这是我在Python中的版本(我认为无论如何)比你编写的Python更清晰:

import shlex, functools, sys, StringIO

def bin_numeric_op(func):
    @functools.wraps(func)
    def execute(self):
        n2, n1 = self._stack.pop(), self._stack.pop()
        n1 = float(n1)
        n2 = float(n2)
        self._stack.append(func(n1, n2))
    return execute

def relational_op(func):
    @functools.wraps(func)
    def execute(self):
        n2, n1 = self._stack.pop(), self._stack.pop()
        self._stack.append(bool(func(n1, n2)))
    return execute

def bin_bool_op(func):
    @functools.wraps(func)
    def execute(self):
        n2, n1 = self._stack.pop(), self._stack.pop()
        n1 = bool(n1)
        n2 = bool(n2)
        self._stack.append(bool(func(n1, n2)))
    return execute

class Interpreter(object):
    def __init__(self):
        self._stack = []
        self._vars = {}
        self._squarestack = []

    def processToken(self, token):
        if token == '[':
            self._squarestack.append(len(self._stack))
        # Currently inside square brackets, don't execute
        elif len(self._squarestack) > 0:
            if token == ']':
                startlist = self._squarestack.pop()
                lst = self._stack[startlist:]
                self._stack[startlist:] = [tuple(lst)]
            else:
                self._stack.append(token)
        # Not current inside list and close square token, something's wrong.
        elif token == ']':
            raise ValueError("Unmatched ']'")
        elif token in self.builtin_ops:
            self.builtin_ops[token](self)
        else:
            self._stack.append(token)
    def get_stack(self):
        return self._stack
    def get_vars(self):
        return self._vars
    @bin_numeric_op
    def add(n1, n2):
        return n1 + n2
    @bin_numeric_op
    def mul(n1, n2):
        return n1 * n2
    @bin_numeric_op
    def div(n1, n2):
        return n1 / n2
    @bin_numeric_op
    def sub(n1, n2):
        return n1 - n2
    @bin_numeric_op
    def mod(n1, n2):
        return n1 % n2
    @bin_numeric_op
    def Pow(n1, n2):
        return n1**n2
    @relational_op
    def less(v1, v2):
        return v1 < v2
    @relational_op
    def lesseq(v1, v2):
        return v1 <= v2
    @relational_op
    def greater(v1, v2):
        return v1 > v2
    @relational_op
    def greatereq(v1, v2):
        return v1 > v2
    @relational_op
    def isequal(v1, v2):
        return v1 == v2
    @relational_op
    def isnotequal(v1, v2):
        return v1 != v2
    @bin_bool_op
    def bool_and(v1, v2):
        return v1 and v2
    @bin_bool_op
    def bool_or(v1, v2):
        return v1 or v2
    def bool_not(self):
        stack = self._stack
        v1 = stack.pop()
        stack.append(not v1)
    def if_func(self):
        stack = self._stack
        pred = stack.pop()
        code = stack.pop()
        if pred:
            self.run(code)
    def ifelse_func(self):
        stack = self._stack
        pred = stack.pop()
        nocode = stack.pop()
        yescode = stack.pop()
        code = yescode if pred else nocode
        self.run(code)
    def store(self):
        stack = self._stack
        value = stack.pop()
        varname = stack.pop()
        self._vars[varname] = value
    def fetch(self):
        stack = self._stack
        varname = stack.pop()
        stack.append(self._vars[varname])
    def remove(self):
        varname = self._stack.pop()
        del self._vars[varname]
    # The default argument is because this is used internally as well.
    def run(self, code=None):
        if code is None:
            code = self._stack.pop()
        for tok in code:
            self.processToken(tok)
    def dup(self):
        self._stack.append(self._stack[-1])
    def swap(self):
        self._stack[-2:] = self._stack[-1:-3:-1]
    def pop(self):
        self._stack.pop()
    def showstack(self):
        print"%r" % (self._stack,)
    def showvars(self):
        print "%r" % (self._vars,)
    builtin_ops = {
        '+': add,
        '*': mul,
        '/': div,
        '-': sub,
        '%': mod,
        '^': Pow,
        '<': less,
        '<=': lesseq,
        '>': greater,
        '>=': greatereq,
        '==': isequal,
        '!=': isnotequal,
        '&&': bool_and,
        '||': bool_or,
        'not': bool_not,
        'if': if_func,
        'ifelse': ifelse_func,
        '!': store,
        '@': fetch,
        'del': remove,
        'call': run,
        'dup': dup,
        'swap': swap,
        'pop': pop,
        'stack': showstack,
        'vars': showvars
        }

def shell(interp):
    try:
        while True:
            x = raw_input("star>   ")
            msg = None
            try:
                interp.run(shlex.split(x))
            except KeyError:
                msg = "does not exist"
            except:
                sys.excepthook(*sys.exc_info())
                msg = "parse error!"
            if msg != None:
                print "   =>",msg,"\n"
            else:
                print "   => %r\n" % (interp.get_stack(),)
    except (EOFError, KeyboardInterrupt):
        print

interp = Interpreter()
if len(sys.argv) > 1:
    lex = shlex.shlex(open(sys.argv[1], 'r'), sys.argv[1])
    tok = shlex.get_token()
    while tok is not None:
        interp.processToken(tok)
        tok = lex.get_token()
shell(interp)

此新版本支持ififelse语句。通过this和函数调用,可以在语言中实现fibfact函数。我将在后面添加如何定义它们。

以下是定义fib函数的方法:

star>   fib [ dup [ pop 1 0 + ] swap [ dup 1 - fib @ call swap 2 - fib @ call + ] swap 0 + 2 0 + < ifelse ] !
   => []

star>   15 fib @ call
   => [987.0]

0 + 2 0 +之前的<序列是强制比较为数字比较。

另请注意[]单个字符如何引用运算符。它们导致它们之间的所有内容都不会被执行,而是作为单个项目列表存储在堆栈中。这是定义函数的关键。函数是一系列令牌,您可以使用call运算符执行这些令牌。它们也可以用于“匿名块”,它们是lambda表达式和标准Python块之间的交叉。这些在fib函数中用于ifelse语句的两个可能路径。

解析器非常简单。 shlex足以作为这种简单语言的词法分析器。其他项目将从列表中提取单个项目。创建仅包含先前列表的一部分的新列表。 “弄明白”堆栈中的单个令牌。实现while原语。对整数进行操作的数值运算符(在实际的Forth中,默认情况下,数值运算对整数进行操作,您需要指定+.之类的东西来获得浮点版本)。以及允许字符串操作的符号标记的一些操作。也许“拆分”和“连接”操作将令牌转换为字符的单个令牌列表或将列表连接成单个令牌就足够了。

答案 1 :(得分:2)

正确的答案取决于你担心的是什么。如果您担心具有可扩展的解决方案,语言复杂性会增加,您可能应该开始学习/使用其中一个解析器模块。如果您担心性能,这可能是一个答案,因为某些模块可能比您手动生成的模块更好地进行优化。

另一方面,如果您感兴趣的是学习,请查看shunting yard algorithm。您可以使用以下操作创建函数字典(比if语句更快):

funcs = {}
funcs["+"] = lambda x, y: x + y
funcs["*"] = lambda x, y: y * y

然后在您的Simp功能中,您可以拨打

func = funcs.get[Op]
if func is not None:
    func[Op](num1,num2)
else:
    #whatever you want to do here

答案 2 :(得分:1)

您需要的是将符号序列(数字,数字上的操作,括号)转换为树结构,表示由符号序列表示的计算。这样的事情正是“解析器”的工作'你可能想看看像这样的简单解析器http://en.wikipedia.org/wiki/LL_parser 这些代码很简单,你可以用铅笔和纸来计算解析表。

答案 3 :(得分:1)

看起来你正试图在python中写这样的Forth

答案 4 :(得分:1)

您可以使用字典存储变量并将其与函数名称关联。

例如,假设您正在逐行阅读代码:

a = 1
b = 2
c = a + b
function x() 
   d = 4
   e = 5
   f = d + e 
end

当你发现变量(a,b,c)时,你将它们存储在一个列表和一个范围内的列表中,这可能是全局范围内的一些东西:

variables = scopes["global"]
variables.append( "a" ) 

您可以为函数设置类似的数据结构,因此当您发现函数时,将其添加到该结构中:

fs = functions["global"]
fs.append("x")

您还可以在范围字典中添加新的“范围”

scopes["x"] = [] // the function x doesn't have any var 

如果您在函数定义中找到新的var ,则将该var存储在该“scope”中

variables = scopes["x"]
variables.append("d")

有意义吗?

在您当前的实施中,所有这些都应该是可行的。如果你想做更严肃的事情我强烈建议你买这本书

http://pragprog.com/titles/tpdsl/language-implementation-patterns

即使它是以Java为例编写的,它也会为您提供坚实的基础语言应用程序,并且它非常易于阅读。

然后你应该有工具:

  1. 创建Lexer(将输入拆分为令牌)
  2. Parse(验证令牌是否符合您的语言规则)
  3. AST(接受你的输入)
  4. 识别程序符号(如变量和函数)
  5. 建立翻译。
  6. 我希望这会有所帮助