Regular expression to match free() and malloc() calls

Date: 2016-07-28 17:43:17

Tags: regex powershell powershell-v5.0

I am writing a PowerShell script that parses a file containing C code and detects whether it contains calls to the free(), malloc(), or realloc() functions.

file_one.c

int MethodOne()  
{
    return 1;
}   
int MethodTwo()    
{   
    free();
    return 1;
} 

file_two.c

int MethodOne()  
{
    //free();
    return 1;
}
int MethodTwo()    
{       
    free();
    return 1;
} 

check.ps1

$regex = "(^[^/]*free\()|(^[^/]*malloc\()|(^[^/]*realloc\()"
$file_one= "Z:\PATH\file_one.txt"
$file_two= "Z:\PATH\file_two.txt"

$contentOne = Get-Content $file_one -Raw 
$contentOne -match $regex

$contentTwo = Get-Content $file_two -Raw 
$contentTwo -match $regex

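Processing the whole file at once seems to work for contentOne: I get True (because of the free() in MethodTwo). contentTwo is not so lucky and returns False instead of True (it should also be True because of the free() in MethodTwo).
Can someone help me write a better regex that works in both cases?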
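
The behaviour described above can be reproduced outside PowerShell. The following is a minimal sketch in Python (an assumption made for illustration; the question itself uses PowerShell's -match). In both engines ^ matches only at the start of the string unless multiline mode is enabled, and [^/]* happily crosses newlines but cannot cross a /. In file_two the first / belongs to the commented-out call, so the only attempt, the one starting at position 0, never reaches the real free(). The helper name has_call and the embedded file contents are hypothetical stand-ins for the files on disk.

```python
import re

# The pattern from check.ps1, unchanged.
PATTERN = r"(^[^/]*free\()|(^[^/]*malloc\()|(^[^/]*realloc\()"

FILE_ONE = """int MethodOne()
{
    return 1;
}
int MethodTwo()
{
    free();
    return 1;
}
"""

FILE_TWO = """int MethodOne()
{
    //free();
    return 1;
}
int MethodTwo()
{
    free();
    return 1;
}
"""


def has_call(text, flags=0):
    """Return True if PATTERN matches somewhere in text."""
    return re.search(PATTERN, text, flags) is not None


one_default = has_call(FILE_ONE)                  # no '/' anywhere, so [^/]* reaches free(
two_default = has_call(FILE_TWO)                  # blocked by the '//' before the real call
two_multiline = has_call(FILE_TWO, re.MULTILINE)  # ^ may now match at every line start

print(one_default, two_default, two_multiline)
```

Multiline mode (the inline flag (?m) in .NET) is enough to make this particular file match, but it still counts calls inside /* */ comments and string literals as hits, so it is a diagnosis rather than a complete fix.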

1 answer:

Answer 0 (score: 1)

Sure, here it is.

Raw:

^(?>(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//(?:[^\\]|\\(?:\r?\n)?)*?(?:\r?\n))|(?:"[^"\\]*(?:\\[\S\s][^"\\]*)*"|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|(?!\b(?:free|malloc|realloc)\()[\S\s](?:(?!\b(?:free|malloc|realloc)\()[^/"'\\])*))*(?:(\bfree\()|(\bmalloc\()|(\brealloc\())

Stringed:

"^(?>(?:/\\*[^*]*\\*+(?:[^/*][^*]*\\*+)*/|//(?:[^\\\\]|\\\\(?:\\r?\\n)?)*?(?:\\r?\\n))|(?:\"[^\"\\\\]*(?:\\\\[\\S\\s][^\"\\\\]*)*\"|'[^'\\\\]*(?:\\\\[\\S\\s][^'\\\\]*)*'|(?!\\b(?:free|malloc|realloc)\\()[\\S\\s](?:(?!\\b(?:free|malloc|realloc)\\()[^/\"'\\\\])*))*(?:(\\bfree\\()|(\\bmalloc\\()|(\\brealloc\\())"

Verbatim:

@"^(?>(?:/\*[^*]*\*+(?:[^/*][^*]*\*+)*/|//(?:[^\\]|\\(?:\r?\n)?)*?(?:\r?\n))|(?:""[^""\\]*(?:\\[\S\s][^""\\]*)*""|'[^'\\]*(?:\\[\S\s][^'\\]*)*'|(?!\b(?:free|malloc|realloc)\()[\S\s](?:(?!\b(?:free|malloc|realloc)\()[^/""'\\])*))*(?:(\bfree\()|(\bmalloc\()|(\brealloc\())"

Explanation

 ^ 
 (?>
      (?:                              # Comments 
           /\*                              # Start /* .. */ comment
           [^*]* \*+
           (?: [^/*] [^*]* \*+ )*
           /                                # End /* .. */ comment
        |  
           //                               # Start // comment
           (?:                              # Possible line-continuation
                [^\\] 
             |  \\ 
                (?: \r? \n )?
           )*?
           (?: \r? \n )                     # End // comment
      )
   |                                 # OR,

      (?:                              # Non - comments 
           "
           [^"\\]*                          # Double quoted text
           (?: \\ [\S\s] [^"\\]* )*
           "
        |  '
           [^'\\]*                          # Single quoted text
           (?: \\ [\S\s] [^'\\]* )*
           ' 
        |                                 # OR,

           (?!                              # ASSERT: Here, cannot be free( / malloc( / realloc(
                \b 
                (?: free | malloc | realloc )
                \(
           )
           [\S\s]                           # Any char which could start a comment, string, etc..
                                            # (Technically, we're going past a C++ source code error)

           (?:                              # -------------------------
                (?!                              # ASSERT: Here, cannot be free( / malloc( / realloc(
                     \b 
                     (?: free | malloc | realloc )
                     \(
                )

                [^/"'\\]                         # Char which doesn't start a comment, string, escape,
                                                 # or line continuation (escape + newline)
           )*                               # -------------------------
      )                                # Done Non - comments 
 )*

 (?:
      ( \b free\( )                    # (1), Free()
   |  
      ( \b malloc\( )                  # (2), Malloc()
   |  
      ( \b realloc\( )                 # (3), Realloc()
 )

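Some notes:

Because of the ^ anchor, this only finds the first call, searching from the beginning of the string. To find them all, just remove the ^ from the regex. One caveat: without an anchor the engine may also start an attempt in the middle of a comment or string, so a commented-out call that comes after the last real call (or appears in a file with no real calls) can be reported. In .NET, replacing ^ with \G avoids this, because each attempt must then start where the previous match ended.

This works because it matches everything up to what you are looking for. In this case, what it found is in capture group 1, 2, or 3.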
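
The idea in the notes above (consume everything that is not of interest, and read the result from a capture group) can be sketched as follows. This is a simplified stand-in written in Python, not the answer's regex: it uses a plain alternation with a lazy block-comment branch instead of the atomic group and unrolled loop, it ignores line continuations in // comments, and the names TOKEN, find_calls, and SAMPLE are hypothetical.

```python
import re

# Comments and literals are matched and then ignored; only a real call
# fills the named group "call".
TOKEN = re.compile(r"""
    //[^\n]*                              # line comment
  | /\*.*?\*/                             # block comment (lazy, spans lines)
  | "(?:\\.|[^"\\])*"                     # double-quoted literal
  | '(?:\\.|[^'\\])*'                     # single-quoted literal
  | \b(?P<call>free|malloc|realloc)\(     # the call we are looking for
""", re.VERBOSE | re.DOTALL)


def find_calls(source):
    """Return the names of allocator calls found outside comments and literals."""
    return [m.group("call") for m in TOKEN.finditer(source) if m.group("call")]


SAMPLE = '''int MethodTwo()
{
    //free();
    /* malloc(10); */
    char *s = "realloc(";
    void *p = malloc(4);
    free(p);
    return 1;
}
'''

calls = find_calls(SAMPLE)
print(calls)  # ['malloc', 'free']
```

Because every comment and literal is consumed as a whole match, the scanner never begins an attempt inside one, provided the comment or literal is properly terminated. Matches whose group is empty are the skipped comments and literals.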

Good luck!!

What the regex contains:

----------------------------------
 * Format Metrics
----------------------------------
Atomic Groups       =   1

Cluster Groups      =   10

Capture Groups      =   3

Assertions          =   2
       ( ? !        =   2

Free Comments       =   25
Character Classes   =   12

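Edit
Per request, here is an explanation of the part of the regex that handles
/**/ comments. This -> /\*[^*]*\*+(?:[^/*][^*]*\*+)*/

This is a modified unrolled-loop regex that assumes an opening delimiter
of /* and a closing delimiter of */. Note that the opening and closing delimiter sequences share a common character, /.
To be able to do this without lookaround assertions, the asterisk of the trailing delimiter is shifted inside the loop.
With this factoring, all that remains is to check for the closing / that completes the delimited sequence.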

 /\*              # Opening delimiter /*

 [^*]*            # Optionally, consume all non-asterisks

 \*+              # Must match 1 or more asterisks here (the anchor), or FAIL.
                  # This is matched here to align the optional loop below
                  # because it is looking for the closing /.

 (?:              # The optional loop part
      [^/*]            # Specifically a single non / character (nor asterisk).
                       # Since a / will be the next closing delimiter, it must be excluded.

      [^*]*            # Optional non-asterisks.
                       # This will accept a / because it is supposed to consume ALL
                       # opening delimiters as it goes,
                       # and it will treat the very next */ as the close.

      \*+              # Must match 1 or more asterisks here (the anchor), or FAIL.
 )*               # Repeat 0 to many times.

 /                # Closing delimiter /
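
To see the breakdown above in action, here is a small sketch that runs the block-comment sub-pattern on its own. It is written in Python as an assumption for illustration; the sub-pattern uses no engine-specific syntax, so .NET should behave the same way. The names BLOCK_COMMENT, samples, and results are hypothetical.

```python
import re

# The /* .. */ part of the answer's regex, copied as-is.
BLOCK_COMMENT = re.compile(r"/\*[^*]*\*+(?:[^/*][^*]*\*+)*/")

samples = [
    "/**/",                       # empty comment
    "a /* one */ b /* two */ c",  # two separate comments
    "/* a * b / c **/ tail */",   # stray * and / inside; closes at the first */
    "/* never closed *",          # no closing /, so no match
]

results = [BLOCK_COMMENT.findall(s) for s in samples]
for sample, found in zip(samples, results):
    print(repr(sample), "->", found)
```

The third sample shows the loop at work: the lone * and the lone / inside the comment are consumed, and the match ends at the first place where a run of asterisks is immediately followed by /.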