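Parsing a cpp file with Python to find each function's start "{" and end "}"

Time: 2018-05-21 20:49:43

Tags: python parsing

Hi, I'm a beginner at programming, but I have some basic understanding. I'm trying to write a Python script that, given a cpp file like this: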

///////////////////////////////////
// experimentOrientVertices.cpp
// (modifying square.cpp)
// 
// Sumanta Guha.
///////////////////////////////////

#ifdef __APPLE__
#  include <GL/glew.h>
#  include <GL/freeglut.h>
#  include <OpenGL/glext.h>
#else
#  include <GL/glew.h>
#  include <GL/freeglut.h>
#  include <GL/glext.h>
#pragma comment(lib, "glew32.lib") 
#endif

// Drawing routine.
void drawScene(void)
{
   glClear(GL_COLOR_BUFFER_BIT);

   glColor3f(0.0, 0.0, 0.0);

   // Draw a polygon with specified vertices.
   glPolygonMode(GL_FRONT, GL_LINE);
   glPolygonMode(GL_BACK, GL_FILL);
   glBegin(GL_POLYGON);
      glVertex3f(20.0, 20.0, 0.0);
      glVertex3f(80.0, 20.0, 0.0);
      glVertex3f(80.0, 80.0, 0.0);
      glVertex3f(20.0, 80.0, 0.0);
   glEnd();

   /*
   glPolygonMode(GL_FRONT, GL_LINE);
   glPolygonMode(GL_BACK, GL_FILL);
   glBegin(GL_POLYGON);
      glVertex3f(20.0, 80.0, 0.0);
      glVertex3f(20.0, 20.0, 0.0);
      glVertex3f(80.0, 20.0, 0.0);
      glVertex3f(80.0, 80.0, 0.0);
   glEnd();
   */

   /*
   glPolygonMode(GL_FRONT, GL_LINE);
   glPolygonMode(GL_BACK, GL_FILL);
   glBegin(GL_POLYGON);
      glVertex3f(80.0, 80.0, 0.0);
      glVertex3f(80.0, 20.0, 0.0);
      glVertex3f(20.0, 20.0, 0.0);
      glVertex3f(20.0, 80.0, 0.0);
   glEnd();
   */

   glFlush(); 
}

// Initialization routine.
void setup(void) 
{
   glClearColor(1.0, 1.0, 1.0, 0.0); 
}

// OpenGL window reshape routine.
void resize(int w, int h)
{
   glViewport(0, 0, w, h);
   glMatrixMode(GL_PROJECTION);
   glLoadIdentity();
   glOrtho(0.0, 100.0, 0.0, 100.0, -1.0, 1.0);
   glMatrixMode(GL_MODELVIEW);
   glLoadIdentity();
}

// Keyboard input processing routine.
void keyInput(unsigned char key, int x, int y)
{
   switch(key) 
   {
      case 27:
         exit(0);
         break;
      default:
         break;
   }
}

// Main routine.
int main(int argc, char **argv) 
{
   glutInit(&argc, argv);

   glutInitContextVersion(4, 3);
   glutInitContextProfile(GLUT_COMPATIBILITY_PROFILE);

   glutInitDisplayMode(GLUT_SINGLE | GLUT_RGBA); 
   glutInitWindowSize(500, 500);
   glutInitWindowPosition(100, 100); 
   glutCreateWindow("experimentOrientVertices.cpp");
   glutDisplayFunc(drawScene); 
   glutReshapeFunc(resize);  
   glutKeyboardFunc(keyInput);

   glewExperimental = GL_TRUE;
   glewInit();

   setup(); 

   glutMainLoop(); 
}

should output:

[['21=>59']]
[['63=>65']]
[['69=>76']]
[['80=>89'], ['82=>88']]
[['93=>113']]

Here is the problem: the script below outputs:

[['21=>59']]
[['63=>65'], ['65=>63']]
[['69=>76'], ['76=>69']]
[['80=>89'], ['82=>88'], ['88=>82'], ['89=>80']]
[['93=>113'], ['113=>93']]

And... I really can't figure this out. Thank you for your time; I hope I've worded the question well!

Here is my script:

from sys import argv

token = {
        '{': 0
        }
level = ["" for i in range(10)]

funcStartStop = [] #list to store each function start and stop brackets value

def functionCount(filename):
    inFile = open(filename, 'r')
    currLine = 0
    mbracketopenline = [] #list to store the currLine value if a { is found
    mbracketcloseline = [] #list to store the currLine value if a } is found
    first = True
    for line in inFile:
        currLine += 1
        if "{" in line:
            token["{"] += 1
            if first:
                first = False
            mbracketopenline.append(currLine)
        if "}" in line:
            token["{"] -= 1
            mbracketcloseline.append(currLine)
        if not first and token["{"] == 0:
            first = True
            tmpfuncStartStop = []
            for i in range(mgraffeopenline.__len__()):
                tmpfuncStartStop.append([str(str(mbracketopenline[i])+"=>"+str(mbracketcloseline[-i-1]))]) #store opening and
                                                                        #closing brackets in a list of list, so at the end of the cycle
                                                                        #should be something like: [['1=>10'],['3=>8']]
            funcStartStop.append(tmpfuncStartStop)
            mbracketcloseline = mbracketopenline = []

if __name__ == '__main__':
    functionCount(argv[1])
    for i in funcStartStop:
        print(i)

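1 answer:

Answer 0 (score: 0)

Ignoring the code errors introduced when you adapted the example for SO (e.g. mgraffeopenline instead of mbracketopenline, functionCount() not returning a result, etc.), you have the following line: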

mbracketcloseline = mbracketopenline = []

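This effectively sets mbracketcloseline and mbracketopenline to the same new list, so whenever you modify either one you are modifying the same list. That is the main cause of your problem, but there are other things to consider.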
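To see the aliasing in isolation, here is a minimal sketch. The names opens and closes are illustrative stand-ins for mbracketopenline and mbracketcloseline, and the line numbers are borrowed from the expected output:

```python
# Chained assignment binds BOTH names to one and the same list object.
closes = opens = []
opens.append(21)                # "opening brace found on line 21"
closes.append(59)               # "closing brace found on line 59"
shared = list(opens)            # both appends landed in the same list
same_object = opens is closes   # the two names are aliases

# Creating two separate lists avoids the aliasing.
opens_fixed, closes_fixed = [], []
opens_fixed.append(21)
closes_fixed.append(59)

print(shared, same_object)        # [21, 59] True
print(opens_fixed, closes_fixed)  # [21] [59]
```

This also accounts for the doubled output: from the second function on, both names refer to one list such as [63, 65], so pairing element i with element -i-1 yields 63=>65 and then 65=>63.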

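Also, as I stated in the comments, it is perfectly valid to have multiple braces on the same line, and you need to account for them (including their positions). So you actually need to go through your code character by character and pick up every occurrence, for example: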

def function_count(source):
    result = []
    progress = []  # a temporary list to keep our progress
    level = -1  # we'll use an index to reference the level we're at
    with open(source, "r") as f:  # open the `source` file for reading
        for i, line in enumerate(f):  # enumerate and read the file line by line
            # you can check `if "{" in line or "}" in line: ...` for a marginal speed-up
            for char in line:  # iterate over the characters in each line
                if char == "{":
                    if level == -1:  # we're at the root, start a root level
                        progress = []
                        result.append(progress)
                    progress.append([i + 1, None])  # store the line # at the level's start
                    level += 1  # increase the level
                elif char == "}":
                    if level > -1:  # if we're already deep in the source tree...
                        progress[level][1] = i + 1  # store the line # at the level's end
                        level -= 1  # decrease the level
    return result

For your file, this produces:

[[[21, 59]],
 [[63, 65]],
 [[69, 76]],
 [[80, 89], [82, 88]],
 [[93, 113]]]

But if you were to change the keyInput function to:

void keyInput(unsigned char key, int x, int y)
{ switch(key) {
      case 27:
         exit(0);
         break;
      default:
         break;
}}

it would still produce a valid result:

[[[21, 59]],
 [[63, 65]],
 [[69, 76]],
 [[80, 86], [80, 86]],
 [[90, 110]]]
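One caveat about function_count as written: it indexes progress by nesting depth, so it assumes each depth holds a single block. With two sibling blocks inside one function (say, two consecutive if bodies), the second closing brace overwrites the end line of the first sibling and the second sibling never gets an end line. A possible variant, sketched here with an explicit stack, avoids that. The name brace_blocks is hypothetical, and it takes an iterable of lines rather than a file path so it can be tried on a string:

```python
def brace_blocks(lines):
    """Group brace pairs per top-level block as [start_line, end_line] lists."""
    result = []
    stack = []  # blocks that are currently open, innermost last
    for lineno, line in enumerate(lines, start=1):
        for char in line:
            if char == "{":
                block = [lineno, None]
                if not stack:             # depth 0: start a new top-level group
                    result.append([])
                result[-1].append(block)  # record in order of opening
                stack.append(block)
            elif char == "}" and stack:   # ignore unmatched closing braces
                stack.pop()[1] = lineno   # close the innermost open block
    return result


sample = """void f()
{
   if (a) {
   }
   if (b) {
   }
}
""".splitlines()

print(brace_blocks(sample))  # [[[2, 7], [3, 4], [5, 6]]]
```

Because each closing brace is matched against the most recently opened block, the result does not depend on how many siblings share a depth.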

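If you want to print it as in your example (I find keeping the actual line indexes separate more useful), just replace this line in function_count: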

progress[level][1] = i + 1  # store the line # at the level's end

with:

progress[level] = "{}=>{}".format(progress[level][0], i + 1)
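Alternatively, the numeric result can be left untouched and converted only for display. A small sketch (as_arrows is a made-up helper name) that reproduces the list-of-one-string shape used in the question:

```python
def as_arrows(result):
    """Turn [[start, end], ...] groups into [['start=>end'], ...] groups."""
    return [
        [["{}=>{}".format(start, end)] for start, end in group]
        for group in result
    ]


for group in as_arrows([[[21, 59]], [[80, 89], [82, 88]]]):
    print(group)
# [['21=>59']]
# [['80=>89'], ['82=>88']]
```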

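However, this does not address the problem zvone mentioned: braces can appear in comments, macros, strings... and all of them would be picked up as valid function starts/ends. There is a lot more to a code parser than simple character matching.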
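As a rough illustration of what handling part of that would involve, the sketch below blanks out comments and string/character literals before any brace matching, keeping every newline so line numbers stay valid. The function name mask_comments_and_literals is an assumption made for this example. It deliberately ignores macros, raw string literals, C++14 digit separators such as 1'000, and backslash-continued line comments, so it is not a real lexer:

```python
def _blank(text):
    """Replace every character except newlines with a space."""
    return "".join("\n" if c == "\n" else " " for c in text)


def mask_comments_and_literals(code):
    """Return code with comments and string/char literals turned into spaces."""
    out = []
    i, n = 0, len(code)
    while i < n:
        two = code[i:i + 2]
        if two == "//":                    # line comment: up to end of line
            end = code.find("\n", i)
            end = n if end == -1 else end
        elif two == "/*":                  # block comment: up to closing */
            end = code.find("*/", i + 2)
            end = n if end == -1 else end + 2
        elif code[i] in "\"'":             # string or character literal
            quote = code[i]
            end = i + 1
            while end < n and code[end] not in (quote, "\n"):
                end += 2 if code[end] == "\\" else 1  # skip escaped characters
            if end < n and code[end] == quote:
                end += 1                   # include the closing quote
        else:                              # ordinary code: copy unchanged
            out.append(code[i])
            i += 1
            continue
        out.append(_blank(code[i:end]))
        i = end
    return "".join(out)


code = 'void f() { // {\n  char c = \'}\';\n  puts("}{"); /* { \n } */\n}\n'
masked = mask_comments_and_literals(code)
print(masked.count("{"), masked.count("}"))  # 1 1
```

The masked text has the same length and the same line breaks as the input, so its lines can be handed to a character-by-character brace matcher without shifting any line numbers.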

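If you really want to dig into parsing C code, there is a very useful module, pycparser, designed for exactly that. It may take longer to set up, but it will give you far more insight into the code than this kind of simple scanning. Of course, it all depends on your actual use case...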