How can I use a custom Clang Static Analyzer checker in a standalone tool, or invoke the Clang Static Analyzer from such a tool?

Asked: 2019-07-20 03:25:33

Tags: c++ clang static-analysis libclang

Hello, fellow software developers.

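I have a question. I am working on a software development project in which I am building a standalone refactoring tool on top of the LLVM/Clang libraries. The project requires source-to-source rewriting of C files. The refactorings include removing unsupported functions, rewriting problematic constructs, removing global variables, and converting dynamic memory allocation of arrays into variable-length stack arrays. I am using AST matchers for this task. My code includes the following headers: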

#include "clang/ASTMatchers/ASTMatchFinder.h"
#include "clang/Tooling/Core/Replacement.h"
#include "clang/Tooling/Refactoring.h"
#include "llvm/Support/CommandLine.h"

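Basically, my refactoring tool is a C++ command-line application. It takes a list of *.c source files, performs refactorings on them using AST matchers and callback classes, and then saves the modifications back to the same files. To make the structure of my project clearer, I have posted the main C++ source file below. It includes the header files of all the callback classes that implement the refactorings.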

/*
 * As part of LLVM coding standards, you should organize your include statements by putting local headers
 * first, followed by Clang and LLVM API headers. When two headers pertain to the same category,
 * order them alphabetically.
 */
// Header files for CallBack classes, each of which implements a single refactoring.
#include "FindVariablesMatchCallback.h"
#include "MakeStaticMatchCallback.h"
#include "RemoveHypotMatchCallback.h"
#include "RemoveMemcpyMatchCallback.h"
#include "RemovePointerMatchCallback.h"
#include "RemoveVariablesMatchCallback.h"

// Header files for Clang libraries.
#include "clang/ASTMatchers/ASTMatchFinder.h"
#include "clang/Tooling/Refactoring.h"
#include "llvm/Support/CommandLine.h"

// Standard library headers used directly in this file.
#include <memory>
#include <string>

using std::string;

using llvm::outs;
using llvm::errs;
using llvm::raw_ostream;
using llvm::report_fatal_error;

using llvm::cl::opt;
using llvm::cl::desc;
using llvm::cl::Positional;
using llvm::cl::OneOrMore;
using llvm::cl::extrahelp;
using llvm::cl::ValueRequired;
using llvm::cl::SetVersionPrinter;
using llvm::cl::ParseCommandLineOptions;

using clang::tooling::RefactoringTool;
using clang::tooling::newFrontendActionFactory;
using clang::tooling::CompilationDatabase;
using clang::tooling::FixedCompilationDatabase;
using clang::ast_matchers::MatchFinder;

/* Command line options description: */

// <bool> Says that this option takes no argument, and is to be treated as a bool value only.
// If this option is set, then the variable becomes true, otherwise it becomes false.
// This line of code has to be above the ParseCommandLineOptions function call, otherwise
// it will fail to parse the -debug option out, and the refactoring_tool executable will
// fail to recognize that command line option.
opt<bool> DebugOutput("debug", desc("This option enables diagnostic output."));

// Options to turn on various refactorings are optional.
opt<bool> RunAll("all", desc("This option turns on all supported refactorings."));
opt<bool> RunRemoveMemcpy("remove-memcpy", desc("This option turns on replacement of memcpy()."));
opt<bool> RunMakeStatic("make-static", desc("This option turns all dynamic memory allocations "
                                            "into stack ones, gets rid of calloc() and free()."));
opt<bool> RunRemovePointer("remove-pointer", desc("This option turns on removal of the global pointer."));
opt<bool> RunRemoveHypot("remove-hypot", desc("This option turns on replacement of hypot()."));
opt<bool> RunRemoveVariables("remove-variables", desc("This option removes unreferenced variables."));

// Option specifies the build path/directory.
opt<string> BuildPath(Positional, desc("[<build-path>]"));
// One or more source files to refactor are required.
// cl::list (not cl::opt) is used here, because OneOrMore collects several values.
llvm::cl::list<string> SourcePaths(Positional, desc("<source0> [... <sourceN>]"), OneOrMore, ValueRequired);

// Define an additional help message to be printed.
extrahelp CommonHelp(
    "\nArguments above mentioned in [ ] are optional (not required).\n"
    "<build-path> should be specified if specific compiler options\n"
    "are not provided on the command line.\n"
);

// Global variable is non static so that it can be externed into other translation units.
bool print_debug_output = false;


int main(int argc, const char **argv)
{
    // Format should be:
    // $ refactoring_tool tool_specific options -- clang_specific_options (not used)
    //  OR
    // $ refactoring_tool [options] [<build-path>] <source0> [... <sourceN>] --
    // By default, input file(s) are treated as positional arguments of the tool-specific part
    // of the options.

    /* Command line parsing: */

    // Define the information to be printed with the -version option.
    // Use a C++11 lambda function as the VersionPrinterTy func parameter to SetVersionPrinter().
    // Adjacent string literals are automatically concatenated in C and C++.
    SetVersionPrinter(
        [](raw_ostream& os) {
            const string version_information = "McLeod Refactoring Tool\n"
                                               "By Konstantin Rebrov\n"
                                               "development version 2.5\n";
            os << version_information;
        }
    );

    // Parses the command line arguments for you.
    ParseCommandLineOptions(argc, argv);
    string ErrorMessage;

    // Try to build a compilation database directly from the command-line.
    std::unique_ptr<CompilationDatabase> Compilations(
        FixedCompilationDatabase::loadFromCommandLine(argc, argv, ErrorMessage)
    );
    // If that failed.
    if (!Compilations) {
        // Load the compilation database using the given directory.
        // Destroys the old object pointed to by the unique_ptr (if it exists), and acquires
        // ownership of the rhs unique_ptr (or rather the underlying CompilationDatabase that it's
        // pointing to).
        Compilations = CompilationDatabase::loadFromDirectory(BuildPath, ErrorMessage);
        // And if that failed.
        if (!Compilations) {
            errs() << "ERROR: Could not build compilation database.\n";
            // Calls installed error handlers, prints the message passed to it,
            // and terminates the program (this function does not return).
            report_fatal_error(ErrorMessage);
        }
    }

    // ParseCommandLineOptions has to be called before DebugOutput can be used.
    // It sets the value of the DebugOutput depending on the presence or absence of the
    // -debug flag in the command line arguments.
    // cl::opt<T> is a class which has an operator T() method, in this case it is used to convert
    // to bool. It's easier to work with built-in data types than classes.
    print_debug_output = DebugOutput;

    // If the user specified -all option, then all refactorings should be enabled.
    if (RunAll) {
        RunRemoveMemcpy    = true;
        RunMakeStatic      = true;
        RunRemovePointer   = true;
        RunRemoveHypot     = true;
        RunRemoveVariables = true;
    }

    /* Run the Clang compiler for each input file separately
     * (one input file - one output file).
     *  This is default ClangTool behaviour.
     */
    // The first argument is a list of compilations.
    // The second argument is a list of source files to parse.
    RefactoringTool tool(*Compilations, SourcePaths);

    if (print_debug_output) {
        outs() << "Starting match finder\n";
        outs() << "\n\n";
    }

    // This first MatchFinder is responsible for applying replacements in the first round.
    MatchFinder mf;

    /* Only add the AST matchers if the options enabling these refactorings are activated. */

    //// Remove memcpy details
    // Make the RemoveMemcpyMatchCallback class be able to receive the match results.
    RemoveMemcpyMatchCallback remove_memcpy_match_callback(&tool.getReplacements());
    if (RunRemoveMemcpy) {
        remove_memcpy_match_callback.getASTmatchers(mf);
    }

    //// Make static details
    MakeStaticMatchCallback make_static_match_callback(&tool.getReplacements());
    if (RunMakeStatic) {
        make_static_match_callback.getASTmatchers(mf);
    }

    //// Remove pointer details
    RemovePointerMatchCallback remove_pointer_match_callback(&tool.getReplacements());
    if (RunRemovePointer) {
        remove_pointer_match_callback.getASTmatchers(mf);
    }

    //// Remove hypot details
    RemoveHypotMatchCallback remove_hypot_match_callback(&tool.getReplacements());
    if (RunRemoveHypot) {
        remove_hypot_match_callback.getASTmatchers(mf);
    }

    //// Remove variables details
    // NOTE: default constructor takes no arguments.
    // FindVariablesMatchCallback does not do any replacements, it only counts the variables.
    FindVariablesMatchCallback find_variables_match_callback;
    if (RunRemoveVariables) {
        find_variables_match_callback.getASTmatchers(mf);
    }

    // Run the tool
    auto result = tool.runAndSave(newFrontendActionFactory(&mf).get());
    if (result != 0) {
        errs() << "Error in the Refactoring Tool: " << result << "\n";
        return result;
    }

    // Create a second RefactoringTool to run the second round of refactorings.
    // The first argument is a list of compilations.
    // The second argument is a list of source files to parse.
    RefactoringTool tool2(*Compilations, SourcePaths);
    // Create a second MatchFinder to run the new RefactoringTool through the source code again,
    // to apply replacements in the second round, mainly RemoveVariablesMatchCallback.
    MatchFinder mf2;

    //// Remove variables details
    // This one actually removes the variables.
    // It must write into tool2's replacements, because tool2 is the tool that runs and saves them.
    RemoveVariablesMatchCallback remove_variables_match_callback(&tool2.getReplacements());
    if (RunRemoveVariables) {
        /* These Step 1 and Step 2 MUST be called in this order ALWAYS! */

        // Step 1: Get the list of variables to remove from the find_variables_match_callback.
        // Connect the remove_variables_match_callback with the find_variables_match_callback.
        find_variables_match_callback.collectResults(remove_variables_match_callback.getVector());

        // Step 2: Get the AST matchers describing them.
        remove_variables_match_callback.getASTmatchers(mf2);
    }

    // Run the new Refactoring Tool to apply replacements in the second round.
    auto result2 = tool2.runAndSave(newFrontendActionFactory(&mf2).get());
    if (result2 != 0) {
        errs() << "Error in the Refactoring Tool: " << result2 << "\n";
        return result2;
    }

    // Print diagnostic output.
    if (print_debug_output) {
        unsigned int num_refactorings = 0;

        if (RunRemoveMemcpy) {
            unsigned int num_matches_found = remove_memcpy_match_callback.getNumMatchesFound();
            unsigned int num_replacements = remove_memcpy_match_callback.getNumReplacements();
            num_refactorings += num_replacements;
            outs() << "Found " << num_matches_found << " memcpy() matches\n";
            outs() << "Performed " << num_replacements << " memcpy() replacements\n";
        }

        if (RunMakeStatic) {
            unsigned int num_free_calls = make_static_match_callback.num_free_calls();
            unsigned int num_calloc_calls = make_static_match_callback.num_calloc_calls();
            num_refactorings += num_free_calls;
            num_refactorings += num_calloc_calls;
            outs() << "Found " << num_free_calls << " calls to free()\n";
            outs() << "Found " << num_calloc_calls << " calls to calloc()\n";
        }

        if (RunRemovePointer) {
            unsigned int num_global_pointers = remove_pointer_match_callback.getNumGlobalPointerRemovals();
            unsigned int num_pointer_uses = remove_pointer_match_callback.getNumPointerUseReplacements();
            unsigned int num_pointer_dereferences = remove_pointer_match_callback.getNumPointerDereferenceReplacements();
            num_refactorings += num_global_pointers;
            num_refactorings += num_pointer_uses;
            num_refactorings += num_pointer_dereferences;
            outs() << "Removed " << num_global_pointers << " global pointers.\n";
            outs() << "Replaced " << num_pointer_uses << " pointer uses.\n";
            outs() << "Replaced " << num_pointer_dereferences << " pointer dereferences.\n";
        }

        if (RunRemoveHypot) {
            unsigned int num_hypot_replacements = remove_hypot_match_callback.getNumHypotReplacements();
            num_refactorings += num_hypot_replacements;
            outs() << "Replaced " << num_hypot_replacements << " calls to hypot()\n";
        }

        if (RunRemoveVariables) {
            unsigned int num_unused_variables = remove_variables_match_callback.getNumUnusedVariableRemovals();
            num_refactorings += num_unused_variables;
            outs() << "Removed " << num_unused_variables << " unused variables.\n";
        }

        outs() << '\n' << "Performed " << num_refactorings << " total refactorings\n";
    }

    return 0;
}

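This refactoring tool runs on a large legacy codebase of C files. I have been using AST matchers to implement the obvious refactorings. I have now exhausted their capabilities, meaning that I have finished implementing the obvious refactorings with AST matchers. To implement more advanced refactorings, I need to use the Clang Static Analyzer libraries. I have been learning how to write my own checkers for the Clang Static Analyzer from this talk: 2012 LLVM Developers' Meeting: A. Zaks & J. Rose, "Building a Checker in 24 Hours" https://www.youtube.com/watch?v=kdxlsP5QVPw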

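I also bought the book "Getting Started with LLVM Core Libraries" by Bruno Cardoso Lopes and Rafael Auler. Chapter 9 of that book teaches you how to write your own Clang Static Analyzer checker.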

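If I understand correctly, the usual way to use your own custom checker is to install it into the Clang Static Analyzer. You then run the Clang Static Analyzer as usual, either from an IDE such as Xcode or from the command line: clang --analyze -Xanalyzer -analyzer-checker=core source.c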

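Here is my problem. To make it understandable, I should explain my task. I need to detect redundant computations: function calls that compute and return some result, which is assigned to a variable that is then never referenced again: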

variable = sqrt(pow(variable, 3) + pow(variable, 3) * 8) / 2;

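Don't ask me why the code looks like this; I didn't write it. There are many examples like it. The codebase I am refactoring is also supposed to be optimized for very high performance, so we cannot afford redundant computations like this one, whose results are never used. My job is to create a refactoring tool that searches for and replaces these bad constructs.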

When I run the code above through the Clang Static Analyzer, it prints the following message:

warning: Value stored to 'variable' is never read
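
To make clear what kind of analysis I mean, here is a minimal sketch of a liveness-based dead-store check on straight-line code. It is not the analyzer's implementation and uses no Clang API: the `Assignment` struct and the `findDeadStores` function are hypothetical names invented for this illustration, and it assumes no branches, loops, or pointer aliasing.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one straight-line assignment: `def = f(uses...)`.
struct Assignment {
    std::string def;                // variable written by the statement
    std::vector<std::string> uses;  // variables read by the right-hand side
};

// Returns the indices (in source order) of assignments whose stored value is
// never read afterwards. `live_out` names the variables that are still read
// after the last statement (for example, the returned value).
std::vector<std::size_t> findDeadStores(const std::vector<Assignment>& stmts,
                                        const std::set<std::string>& live_out)
{
    std::set<std::string> live = live_out;
    std::vector<std::size_t> dead;

    // Walk backwards: a store is dead if its target is not live right after it.
    for (std::size_t i = stmts.size(); i-- > 0;) {
        const Assignment& s = stmts[i];
        if (live.count(s.def) == 0) {
            dead.push_back(i);
        }
        // The store overwrites the old value, then the right-hand side reads
        // its operands, so those operands are live before this statement.
        live.erase(s.def);
        live.insert(s.uses.begin(), s.uses.end());
    }

    std::reverse(dead.begin(), dead.end());
    return dead;
}
```

In this model the statement from my example is `{"variable", {"variable"}}`: it reads and writes the same variable, and it is reported as a dead store whenever `variable` is not read by any later statement.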

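I need to leverage the static analyzer's capabilities to detect problematic code constructs such as this one. I am trying to write my own checker that does not merely detect such occurrences and emit a warning, but removes them altogether. I want my checker to actually edit the source files, processing its findings and performing replacements just as the AST matcher callbacks do.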
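
What I mean by "performing replacements" is sketched below in plain C++. The `TextEdit` struct and the `applyEdits` function are hypothetical stand-ins made up for this illustration, not Clang types. They mirror the offset, length, and replacement-text idea behind the replacements my AST matcher callbacks already produce, under the assumption that the edits do not overlap.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for an offset-based text edit (not a Clang type).
struct TextEdit {
    std::size_t offset;  // offset of the first character to replace
    std::size_t length;  // number of characters to replace
    std::string text;    // replacement text; empty means "delete"
};

// Applies non-overlapping edits to `source` and returns the rewritten text.
std::string applyEdits(std::string source, std::vector<TextEdit> edits)
{
    // Apply from the highest offset down, so that the offsets of the edits
    // still to be applied are not shifted by the ones already applied.
    std::sort(edits.begin(), edits.end(),
              [](const TextEdit& a, const TextEdit& b) { return a.offset > b.offset; });

    std::size_t previous_start = source.size();
    for (const TextEdit& e : edits) {
        if (e.offset + e.length > previous_start) {
            throw std::invalid_argument("edit is out of range or overlaps another edit");
        }
        source.replace(e.offset, e.length, e.text);
        previous_start = e.offset;
    }
    return source;
}
```

Removing a dead store would then be a single edit with an empty replacement text that covers the whole statement, which is why I need the exact source range of every construct the checker finds.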

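My direct supervisor says that the entire C++ refactoring tool I am writing must be a single standalone executable that processes source code files and performs the refactorings in place. Right now, in the main function of this refactoring tool, I invoke the Clang compiler and give it the AST matchers: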

    // Run the tool
    auto result = tool.runAndSave(newFrontendActionFactory(&mf).get());
    if (result != 0) {
        errs() << "Error in the Refactoring Tool: " << result << "\n";
        return result;
    }

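Since I will also be writing custom static analysis checkers, I need to add them to the main function too, and I need some way to invoke the Clang Static Analyzer from my executable's main function. Is there a function similar to clang::tooling::RefactoringTool::runAndSave(), but for the Clang Static Analyzer, that would install my custom static analysis checker and run the Clang Static Analyzer on the legacy source files given on the tool's command line? That is what I need.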

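If there is no such ready-made function, then how does the Clang Static Analyzer work, and which code in the LLVM libraries starts the Clang Static Analyzer and runs it on the input files? If necessary, I will write such a function myself to invoke the Clang Static Analyzer from my own code. I am not afraid to dig into the LLVM source code to figure out how to replicate the Clang Static Analyzer's startup and file-processing behavior in my own code. But I think that would be the hard way of doing things. If there is an easier way to programmatically invoke the Clang Static Analyzer with a custom checker, please tell me.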

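I want my static analysis checker to actually perform replacements that remove the problematic code constructs it finds. For that, I think I need to know the SourceLocation of each detected construct. Just printing warnings to the screen is not enough for me. The C codebase is large, and making the replacements by hand would take a huge amount of time.

I watched the 2012 LLVM Developers' Meeting talk, and Anna Zaks and Jordan Rose do not really explain how to perform replacements from within a static analysis checker. They only have their custom checker print warnings to the screen. As I explained, my requirements are different.

Another requirement is that my application must perform the replacements in place. The company needs a standalone in-house tool. The whole process has to be seamless. My application needs to be easy for other engineers to use. They are electrical engineers who know nothing about any of this, which is why I am automating it all for them.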
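
The only fallback I can think of so far is to scrape the analyzer's text output for locations. Below is a rough sketch of that idea, assuming diagnostics arrive one per line in the usual `file:line:column: warning: ...` form. The `DeadStoreReport` struct and the `parseDeadStoreWarning` function are hypothetical names of mine, and the exact wording of the message (including any trailing checker name in brackets) may differ between Clang versions.

```cpp
#include <regex>
#include <string>

// Hypothetical record for one "never read" diagnostic (not a Clang type).
struct DeadStoreReport {
    std::string file;
    unsigned line = 0;    // 1-based, as printed in the diagnostic
    unsigned column = 0;  // 1-based, as printed in the diagnostic
    std::string variable;
};

// Parses a single diagnostic line. Returns true and fills `out` only when the
// line is a "Value stored to '...' is never read" warning.
bool parseDeadStoreWarning(const std::string& text, DeadStoreReport& out)
{
    // The greedy (.+) lets the file name itself contain ':' (e.g. "C:\src\a.c").
    // The trailing .* tolerates a suffix such as "[deadcode.DeadStores]".
    static const std::regex pattern(
        R"((.+):(\d{1,9}):(\d{1,9}): warning: Value stored to '([^']+)' is never read.*)");

    std::smatch m;
    if (!std::regex_match(text, m, pattern)) {
        return false;
    }
    out.file = m[1].str();
    out.line = static_cast<unsigned>(std::stoul(m[2].str()));
    out.column = static_cast<unsigned>(std::stoul(m[3].str()));
    out.variable = m[4].str();
    return true;
}
```

This only gives me a line and a column, not the extent of the statement to delete, and it means running the analyzer as a separate process, so it does not really satisfy the single-executable requirement. That is why I would much rather get the SourceLocation from inside a checker.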

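So far I have made a lot of progress with AST matchers, and now I need to extend my application with static analysis checkers to complete the work my direct supervisor has assigned. If I am wrong somewhere, please don't flame me too much. This is my first real development job and my first time using the Clang/LLVM libraries.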

0 answers:

No answers yet.