Question

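I am trying to create a program that reads code from a file (similar to an interpreter, but less complex), stores it in a linked list, and then executes it when needed.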

This is what I want this "interpreter" to know:

variable declaration
array declaration
for, while, if control structures 
input and output files (output files can be opened in append mode)
non recursive functions

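Because I execute the code command by command, I can stop and restart after a specific number of executed commands, which is useful when trying to do several things at once (similar to multithreading). For that reason I named the class _MINI_THREAD. Here are the declarations of struct COMMAND and class _MINI_THREAD: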

struct COMMAND
{
    unsigned int ID;
    char command_text[151];
    COMMAND* next;
    COMMAND* prev;
};
class _MINI_THREAD
{
public:
    void allocate_memory()
    {
        while (start_of_list == NULL)
            start_of_list = new (std::nothrow) COMMAND;
        while (end_of_list == NULL)
            end_of_list = new (std::nothrow) COMMAND;
        start_of_list -> prev = NULL;
        start_of_list -> next = end_of_list;
        end_of_list -> prev = start_of_list;
        end_of_list -> next = NULL;
    }
    void free_memory()
    {
        for(COMMAND* i=start_of_list -> next;i!=end_of_list;i=i->next)
            delete i -> prev;
        delete end_of_list -> prev;
        delete end_of_list;
    }
    bool execute_command(unsigned int number_of_commands)
    {
        for(unsigned int i=0;i<number_of_commands;i++)
        {
             /*match id of current command pointed by the cursor with a function from the map*/
             if (cursor==end_of_list) return false;
             else cursor=cursor->next;
        } 
        return true;
    }
    bool if_finished()
    {
        if (cursor==end_of_list)return true;
        else return false;
    }
    unsigned int get_ticks()
    {
        return ticks_per_loop;
    }
    void set_ticks(unsigned int ticks)
    {
        ticks_per_loop = ticks;
    }
private:
    unsigned int ticks_per_loop;
    COMMAND* cursor=NULL;
    COMMAND* start_of_list=NULL;
    COMMAND* end_of_list=NULL;
};

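I also tried to keep the syntax of the "invented code" in the source file as close as possible to C/C++ syntax, but sometimes I add an extra parameter because it makes validation easier. Note that even the while has a name, so that I can manage nested loops faster.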

Here is an example I came up with:

Source_file.txt

int a;
input_file fin ("numbers.in");
output_file fout ("numbers.out");
while loop_one ( fin.read(a,int,skipws) )
{
    fout.print(a,int);
    fout.print(32,char); /*prints a space after each number*/
}
close_input_file fin;
close_output_file fout;

/*This code is supposed to take all numbers from the input file and */
/* move them into the output file */

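In the real program, the object thread1 of class _MINI_THREAD contains a dynamically allocated list (I will show it as an array to make it easier to understand):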

_MINI_THREAD thread1;
/*read from Source_file.txt each command into thread1 command array*/
thread1.commandarr={
                    define_integer("a"),
                    open_input_file("numbers.in",fin),
                    open_output_file("numbers.out",fout),
                    define_label_while("loop_one",fin.read()),  /*if the condition is false the `cursor` will jump to label_while_end*/
                    type_to_file(fout,a,int),
                    type_to_file(fout,32,char),
                    label_while_return("loop_one"), /*returns the cursor to the first line after the while declaration*/
                    label_while_end("loop_one"), /*marks the line after the while return point*/
                    close_input_file("numbers.in",fin),
                    close_output_file("numbers.out",fout),
                   };
/*the cursor is already pointing at the first command (define_integer("a"))*/
/*this will execute commands until the cursor reaches the end_of_list*/
while(thread1.execute_command(1))NULL; 
thread1.free_memory();

Now my problem is actually implementing the IF_CONTROL_STRUCTURE, because you may want to write if (a==b) or if (foo()) and so on, and I don't know how to test all of these things.

I managed to move the cursor to the labels of any structure (while, do ... while, for, etc.), but I still cannot check the condition of each structure.

Answer 1

You really want to write some interpreter (probably using some bytecode). Read more about semantics.

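Writing a good interpreter is not a trivial task. Consider using an existing one, e.g. Lua, Guile, Neko, Python, Ocaml, ... and take some time to study their free software implementations.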

Otherwise, spend several months reading, notably:

SICP is an absolute must read (and is freely downloadable).
the Dragon Book
Programming Language Pragmatics
Lisp In Small Pieces
about the SECD machine
the GC handbook

Notice that an entire book (at least) is needed to explain how an interpreter works. See also the relevant SIGPLAN conferences.

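Many (multi-thread friendly) interpreters have some GIL. A genuinely multi-threaded interpreter (without any GIL) is very difficult to design (what exactly would its REPL be ???), and a multi-threaded garbage collector is also very difficult to implement and debug (consider using an existing one, perhaps MPS or Boehm's GC).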

So "your simple work" could require several years of full-time work (and could get you a PhD).

A simpler approach

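After having read SICP and become familiar with some Lisp-like language (probably some Scheme, e.g. through Guile), you could decide on some simpler approach (basically a tiny Lisp interpreter, which you could code in a few hundred lines of C++; not as serious as the full-fledged interpreters mentioned before).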

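You first need to define on paper, at least in English, the syntax and the semantics of your scripting language. Take strong inspiration from Lisp and its S-expressions. You probably want your scripting language to be homoiconic (so your AST would be values of your language), and it would (like Lisp) have only expressions (and no statements). So the conditional is ternary, like the C++ ? : operator.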

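You would represent the AST of your scripting language as some C++ data structure (probably some class with a few virtual methods). Parsing some script file into an AST (or a sequence of ASTs, perhaps feeding some REPL) is so classical that I won't even explain it; you might use some parser generator (improperly called compiler-compilers), like bison or lemon.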
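For a Lisp-like syntax a generator is not even required. Here is a minimal hand-written sketch of reading one S-expression into a tree; the names Sexp, Reader and show are hypothetical, chosen only for this illustration, and a real reader would also need strings, comments and better error reporting.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical AST node: either an atom (a number or symbol, kept as text)
// or a list of child nodes.
struct Sexp {
    bool is_atom = false;
    std::string atom;                          // meaningful when is_atom
    std::vector<std::shared_ptr<Sexp>> items;  // meaningful when !is_atom
};

class Reader {
public:
    explicit Reader(std::string text) : text_(std::move(text)) {}

    // Parses one expression starting at the current position.
    std::shared_ptr<Sexp> read() {
        skip_spaces();
        if (pos_ >= text_.size()) throw std::runtime_error("unexpected end of input");
        auto node = std::make_shared<Sexp>();
        if (text_[pos_] == '(') {
            ++pos_;  // consume '('
            for (;;) {
                skip_spaces();
                if (pos_ >= text_.size()) throw std::runtime_error("missing ')'");
                if (text_[pos_] == ')') { ++pos_; break; }
                node->items.push_back(read());  // recursive descent into the child
            }
        } else if (text_[pos_] == ')') {
            throw std::runtime_error("unexpected ')'");
        } else {
            node->is_atom = true;
            while (pos_ < text_.size() && !is_space(text_[pos_]) &&
                   text_[pos_] != '(' && text_[pos_] != ')')
                node->atom += text_[pos_++];
        }
        return node;
    }

private:
    static bool is_space(char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; }
    void skip_spaces() { while (pos_ < text_.size() && is_space(text_[pos_])) ++pos_; }

    std::string text_;
    std::size_t pos_ = 0;
};

// Renders a tree back to text, mostly useful for debugging the reader.
std::string show(const Sexp& e) {
    if (e.is_atom) return e.atom;
    std::string out = "(";
    for (std::size_t i = 0; i < e.items.size(); ++i) {
        if (i != 0) out += ' ';
        out += show(*e.items[i]);
    }
    return out + ")";
}
```

With such a reader, a condition like (== a b) or (foo) is just another subtree, so the if from your question does not need any special parsing.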

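Then you would at least implement some eval function. It takes two arguments, Exp and Env: the first, Exp, is the AST of the expression to be evaluated, and the second, Env, is some binding environment (defining the bindings of the local variables of your scripting language; it could be as simple as a stack of maps from variables to values). The eval function returns some value. It could be a member function of your AST class (then Exp is this, the receiver...). Of course, ASTs and values of your scripting language are some tagged union (which you might, if you prefer, represent as a class hierarchy).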
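To make the "stack of maps" idea concrete, here is a small sketch of such an environment. It assumes, purely for illustration, that every value is a long; the class name Env and its member functions are hypothetical.

```cpp
#include <map>
#include <string>
#include <vector>

using Value = long;  // assumption: the scripting language only has integers

// A binding environment as a stack of scopes; the innermost scope is last.
class Env {
public:
    Env() { push_scope(); }  // always keep one global scope

    void push_scope() { scopes_.emplace_back(); }
    void pop_scope() { if (scopes_.size() > 1) scopes_.pop_back(); }

    // Binds (or rebinds) a name in the innermost scope.
    void define(const std::string& name, Value v) { scopes_.back()[name] = v; }

    // Searches from the innermost scope outwards.
    // Returns false when the name is not bound anywhere.
    bool lookup(const std::string& name, Value& out) const {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
            auto found = it->find(name);
            if (found != it->end()) { out = found->second; return true; }
        }
        return false;
    }

private:
    std::vector<std::map<std::string, Value>> scopes_;
};
```

Entering a block or a function body would push a scope and leaving it would pop it, so an inner variable shadows an outer one with the same name only while the inner scope is alive.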

Implementing such an eval recursively in C++ is quite simple. Here is some pseudo code:

eval Exp Env :
  if (Exp is some constant) {
     return that constant }
  if (Exp is a variable Var) {
     return the bound value of that Var in Env }
  if (Exp is some primitive binary operator Op /* like + */
      with operands Exp1 Exp2) {
     compute V1 = eval Exp1 Env
     and V2 = eval Exp2 Env
     return the application of Op /* eg addition */ on V1 and V2
  }
  if (Exp is a conditional If Exp1 Exp2 Exp3) {
     compute V1 = eval Exp1 Env
     if (V1 is true) {
       compute V2 = eval Exp2 Env
       return V2
     } else { /*V1 is false*/
       compute V3 = eval Exp3 Env
       return V3
     }
  }
  .... etc....

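There are many other cases to consider (e.g. some While, some Let or LetRec which would probably augment Env, primitive operations of different arities, Apply of an arbitrary functional value to some sequence of arguments, etc.) that are left as an exercise to the reader. Of course, some expressions have side effects when evaluated.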
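The four cases of the pseudo code could look roughly like this in C++, with eval as a virtual member function of the AST classes. This is only a sketch under simplifying assumptions: values are plain long integers, the environment is a single flat map rather than a stack of maps, and the class names Const, Var, BinOp and If are made up for the illustration.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

using Value = long;                        // assumption: integers only
using Env = std::map<std::string, Value>;  // assumption: one flat scope

struct Exp {
    virtual ~Exp() = default;
    virtual Value eval(const Env& env) const = 0;
};
using ExpPtr = std::shared_ptr<const Exp>;

struct Const : Exp {
    Value value;
    explicit Const(Value v) : value(v) {}
    Value eval(const Env&) const override { return value; }
};

struct Var : Exp {
    std::string name;
    explicit Var(std::string n) : name(std::move(n)) {}
    Value eval(const Env& env) const override {
        auto it = env.find(name);
        if (it == env.end()) throw std::runtime_error("unbound variable: " + name);
        return it->second;
    }
};

struct BinOp : Exp {
    std::function<Value(Value, Value)> op;  // the primitive, e.g. addition
    ExpPtr lhs, rhs;
    BinOp(std::function<Value(Value, Value)> f, ExpPtr l, ExpPtr r)
        : op(std::move(f)), lhs(std::move(l)), rhs(std::move(r)) {}
    Value eval(const Env& env) const override {
        Value v1 = lhs->eval(env);
        Value v2 = rhs->eval(env);
        return op(v1, v2);
    }
};

struct If : Exp {
    ExpPtr cond, then_branch, else_branch;
    If(ExpPtr c, ExpPtr t, ExpPtr e)
        : cond(std::move(c)), then_branch(std::move(t)), else_branch(std::move(e)) {}
    Value eval(const Env& env) const override {
        // Only the selected branch is evaluated; nonzero counts as true.
        return cond->eval(env) != 0 ? then_branch->eval(env) : else_branch->eval(env);
    }
};
```

The condition of an If is an ordinary expression, so if (a==b) becomes an If node whose cond is a BinOp comparing two Var nodes, and nothing in If has to know what kind of test it contains.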

Both SICP and Lisp In Small Pieces explain that idea quite well. Read about meta-circular evaluators. Don't code without having read SICP...

The code chunk in your question is a design error (even the MINI_THREAD thing is a mistake). Take a few weeks to read more, throw your code into the trash bin, and start again. Be sure to use some version control system (I strongly recommend git).

Of course, you want to be able to interpret recursive functions. They are no harder to interpret than non-recursive ones.
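To illustrate why recursion comes for free, here is a sketch of a recursive evaluator with a Call node. It assumes integer values, named top-level functions kept in a table, and a fresh environment per call; Function, Functions, Call and the other class names are hypothetical. Because Call looks its callee up by name at evaluation time, a function body may call itself without any extra machinery.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using Value = long;                        // assumption: integers only
using Env = std::map<std::string, Value>;  // assumption: one scope per call

struct Exp;
using ExpPtr = std::shared_ptr<const Exp>;
struct Function { std::vector<std::string> params; ExpPtr body; };
using Functions = std::map<std::string, Function>;  // table of named functions

struct Exp {
    virtual ~Exp() = default;
    virtual Value eval(const Env& env, const Functions& fns) const = 0;
};

struct Const : Exp {
    Value value;
    explicit Const(Value v) : value(v) {}
    Value eval(const Env&, const Functions&) const override { return value; }
};

struct Var : Exp {
    std::string name;
    explicit Var(std::string n) : name(std::move(n)) {}
    Value eval(const Env& env, const Functions&) const override {
        auto it = env.find(name);
        if (it == env.end()) throw std::runtime_error("unbound variable: " + name);
        return it->second;
    }
};

struct BinOp : Exp {  // only '*' and '-' are handled in this sketch
    char op;
    ExpPtr lhs, rhs;
    BinOp(char o, ExpPtr l, ExpPtr r) : op(o), lhs(std::move(l)), rhs(std::move(r)) {}
    Value eval(const Env& env, const Functions& fns) const override {
        Value a = lhs->eval(env, fns), b = rhs->eval(env, fns);
        if (op == '*') return a * b;
        if (op == '-') return a - b;
        throw std::runtime_error("unknown operator");
    }
};

struct If : Exp {
    ExpPtr cond, then_branch, else_branch;
    If(ExpPtr c, ExpPtr t, ExpPtr e)
        : cond(std::move(c)), then_branch(std::move(t)), else_branch(std::move(e)) {}
    Value eval(const Env& env, const Functions& fns) const override {
        return cond->eval(env, fns) != 0 ? then_branch->eval(env, fns)
                                         : else_branch->eval(env, fns);
    }
};

struct Call : Exp {
    std::string callee;
    std::vector<ExpPtr> args;
    Call(std::string c, std::vector<ExpPtr> a) : callee(std::move(c)), args(std::move(a)) {}
    Value eval(const Env& env, const Functions& fns) const override {
        auto it = fns.find(callee);
        if (it == fns.end()) throw std::runtime_error("unknown function: " + callee);
        const Function& f = it->second;
        if (f.params.size() != args.size()) throw std::runtime_error("wrong argument count");
        Env local;  // fresh bindings for this call; the C++ stack holds the rest
        for (std::size_t i = 0; i < args.size(); ++i)
            local[f.params[i]] = args[i]->eval(env, fns);
        return f.body->eval(local, fns);
    }
};
```

Each recursive call in the script simply becomes a nested call of eval in C++, so the depth of script recursion is bounded by the C++ call stack in this design.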

PS. I am quite interested in your work. Please send me an email, and/or publish your tentative source code.
