Find and replace all variable names in a VBA module

Time: 2014-05-20 12:58:53

Tags: vba obfuscation

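Suppose we have a module that contains a single Sub and no comments. How can all of its variable names be identified? Is it possible to identify variable names that were not declared with Dim? I want to find them and replace each one with a random name (for example O0011011010100101) to obfuscate my code; the replacement part is much easier.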

List of characters that can be used in macro, function, and variable names:

ABCDEFGHIJKLMNOPQRSTUVWXYZdefghijklmnopqrstuvwxyzg€‚„…†‡‰Š‹ŚŤŽŹ‘’“”•–—™š›śťžź ˇ˘Ł¤Ą¦§¨©Ş«¬­®Ż°±˛ł´µ¶·¸ąş»Ľ˝ľżŔÁÂĂÄĹĆÇČÉĘËĚÍÎĎĐŃŇÓÔŐÖ×ŘŮÚŰÜÝŢßŕáâăäĺćçčéęëěíîďđńňóôőö÷řůúűüýţ˙ÉĘËĚÍÎĎĐŃŇÓÔŐÖ×ŘŮÚŰÜÝŢßŕáâăäĺćçčéęëěíîďđńňóôőö÷řůúűüýţ˙

Below is the function I have written so far:

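Function randomName(n As Integer) As String
    Dim y As String
    Dim i As Integer

    y = "O" 'identifiers must start with a letter, so begin with the letter O
    For i = 2 To n
        If Rnd() > 0.5 Then
            y = y & "0"
        Else
            y = y & "1"
        End If
    Next i

    randomName = y 'n characters in total
End Function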

To replace a given string inside the target string, which holds the code of the module, I use the Sub below:

Sub substituteNames()
    Dim linesCount As Long
    Dim code As String
    Dim inputStr As Variant
    Dim namesLength As Integer
    Dim outputString As String
    Dim i As Long

    'count lines in "Module1", which is part of the current workbook
    linesCount = ActiveWorkbook.VBProject.VBComponents("Module1").CodeModule.CountOfLines
    'read code from module
    code = ActiveWorkbook.VBProject.VBComponents("Module1").CodeModule.Lines(StartLine:=1, Count:=linesCount)

    inputStr = Array("name1", "name2", "name3") 'hard-coded array of strings to replace
    namesLength = 20                            'length of the new variable names

    For i = LBound(inputStr) To UBound(inputStr)
        outputString = randomName(namesLength)  'randomName(n) returns n characters
        code = Replace(code, inputStr(i), outputString)
    Next i

    Debug.Print code 'view code
End Sub
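
One caveat with Replace as used above: it also rewrites substrings, so replacing name1 would damage a longer identifier such as name10. A minimal sketch of a whole-word replacement follows. The helper name replaceWholeWord is hypothetical, and it assumes the old name consists only of ASCII letters, digits, and underscores and that the new name contains no "$" character. It still rewrites matches inside string literals.

```vba
' Hypothetical helper (not part of the original question):
' replaces oldName with newName only where oldName stands as a whole word.
Function replaceWholeWord(ByVal code As String, ByVal oldName As String, ByVal newName As String) As String
    Dim re As Object

    ' Late binding, so no reference to the regular expressions library is needed
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True        ' replace every occurrence
    re.IgnoreCase = True    ' VBA identifiers are not case-sensitive
    re.Pattern = "\b" & oldName & "\b"   ' \b marks a word boundary

    replaceWholeWord = re.Replace(code, newName)
End Function
```

Inside the loop of substituteNames, the Replace call could then become code = replaceWholeWord(code, inputStr(i), outputString).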

After that we simply replace the old code with the new code, but how can the strings that are variable names be identified?
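
The step of replacing the old code with the new code might look like the sketch below. The helper name writeCodeBack is hypothetical. It assumes that "Trust access to the VBA project object model" is enabled in the Trust Center and that the target module is not the one containing the running macro.

```vba
' Hypothetical helper (not part of the original question):
' overwrites the whole content of a module with newCode.
Sub writeCodeBack(ByVal moduleName As String, ByVal newCode As String)
    Dim codeMod As Object   ' a VBIDE.CodeModule, late bound

    Set codeMod = ActiveWorkbook.VBProject.VBComponents(moduleName).CodeModule

    ' DeleteLines fails on an empty module, so check first
    If codeMod.CountOfLines > 0 Then
        codeMod.DeleteLines 1, codeMod.CountOfLines
    End If

    codeMod.AddFromString newCode
End Sub
```

Keeping a backup copy of the module before calling it is advisable, since the deletion cannot be undone from code.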

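Using Option Explicit weakens my simple obfuscation method, because to reverse the changes you only need to follow the Dim statements and replace the ugly names with normal ones. To make that kind of substitution harder, I think it is a good idea to break the line in the middle of a variable name: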

O0O000O0OO0O0000 _
0O00000O0OO0

Another simple technique is to replace some strings with chains of Chr calls, such as chr(104)&chr(101)&chr(108)&chr(108)&chr(111):

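Sub stringIntoChrChain()
    Dim strInput As String
    Dim strOutput As String
    Dim i As Long

    strInput = "hello"
    strOutput = ""

    'append "chr(<character code>)&" for every character of the input
    For i = 1 To Len(strInput)
        strOutput = strOutput & "chr(" & Asc(Mid(strInput, i, 1)) & ")&"
    Next i

    'drop the trailing "&" before printing
    Debug.Print Mid(strOutput, 1, Len(strOutput) - 1)
End Sub

Comments like the ones below can give the user the impression that he does not have the right tool for working with macros, and so on.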

'(k=Äó¬)w}ż^¦ů‡ÜOyúm=ěËnóÚŽb W™ÄQó’ (—*-ĹTIäb
'R“ąNPÔKZMţ†üÍQ‡
'y6ű˛Š˛ŁŽ¬=iýQ|˛^˙  ‡ńb ¬ĂÇr'ń‡e˘źäžŇ/âéç;1qýěĂj$&E!V?¶ßšÍ´cĆ$Âű׺Ůî’ﲦŔ?TáÄu[nG¦•¸î»éüĽ˙xVPĚ.|
'ÖĚ/łó®Üă9Ę]ż/ĹÍT¶Mµę¶mÍ
'q[—qëýY~Pc©=jÍ8˘‡,Ú+ń8ŐűŻEüńWü1ďëDZ†ć}ęńwŠbŢ,>ó’Űçµ™Š_…qÝăt±+‡ĽČg­řÍ!·eŠP âńđ:ŶOážű?őë®ÁšńýĎáËTbž}|Ö…ăË[®™

1 Answer:

Answer 0 (score: 2)

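You can use regular expressions to find variable assignments by looking for equals signs. You will need to add references to the Microsoft VBScript Regular Expressions 5.5 and Microsoft Visual Basic for Applications Extensibility 5.3 libraries, because I have used early binding.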

Be sure to back up your work and test before using this. I could have gotten the regex wrong.

Update

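I have improved the regex so that it no longer captures the data type of strongly typed constants (previously, Const ImAConstant As String = "Oh Noes!" returned String). I have also added another regex to return those constants. The last version of the regex also mistakenly captured things like .Global = true. That has been corrected. The code below should return all variable and constant names for a given code module. The regular expressions still are not perfect; as you will notice, I was unable to stop false positives on double quotes. My array handling could also be done better.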

Sub printVars()
    Dim linesCount As Long
    Dim code As String
    Dim vbPrj As VBIDE.VBProject
    Dim codeMod As VBIDE.CodeModule
    Dim regex As VBScript_RegExp_55.RegExp
    Dim m As VBScript_RegExp_55.match
    Dim matches As VBScript_RegExp_55.MatchCollection
    Dim i As Long
    Dim j As Long
    Dim isInDatatypes As Boolean
    Dim isInVariables As Boolean
    Dim datatypes() As String
    Dim variables() As String

    Set vbPrj = VBE.ActiveVBProject
    Set codeMod = vbPrj.VBComponents("Module1").CodeModule
    code = codeMod.Lines(1, codeMod.CountOfLines)

    Set regex = New RegExp
    With regex
        .Global = True ' match all instances
        .IgnoreCase = True
        .MultiLine = True ' "code" var contains multiple lines
        .Pattern = "(\sAs\s)([\w]*)(?=\s)" ' get list of datatypes we've used
            ' match any whole word after the word " As "
        Set matches = .Execute(code)
    End With

    ReDim datatypes(matches.count - 1)
    For i = 0 To matches.count - 1
        datatypes(i) = matches(i).SubMatches(1) ' return second submatch so we don't get the word " As " in our array
    Next i

    With regex
        .Pattern = "(\s)([^\.\s][\w]*)(?=\s\=)" ' list of variables
            ' begins with a space; next character is not a period (handles "with" assignments) or space; any alphanumeric character; repeat until... space
        Set matches = .Execute(code)
    End With

    ReDim variables(matches.count - 1)
    For i = 0 To matches.count - 1
        isInDatatypes = False
        isInVariables = False
        ' check to see if current match is a datatype
        For j = LBound(datatypes) To UBound(datatypes)
            If matches(i).SubMatches(1) = datatypes(j) Then
                isInDatatypes = True
                Exit For
            End If
            'Debug.Print matches(i).SubMatches(1)
        Next j
        ' check to see if we already have this variable
        For j = LBound(variables) To i
            If matches(i).SubMatches(1) = variables(j) Then
                isInVariables = True
                Exit For
            End If
        Next j
        ' add to variables array
        If Not isInDatatypes And Not isInVariables Then
            variables(i) = matches(i).SubMatches(1)
        End If
    Next i

    With regex
        .Pattern = "(\sConst\s)(.*)(?=\sAs\s)" 'strongly typed constants
            ' match anything between the words " Const " and " As "
        Set matches = .Execute(code)
    End With

    For i = 0 To matches.count - 1
        'add one slot to end of array
        j = UBound(variables) + 1
        ReDim Preserve variables(j)
        variables(j) = matches(i).SubMatches(1) ' again, return the second submatch
    Next i

    ' print variables to immediate window
    For i = LBound(variables) To UBound(variables)
        If variables(i) <> "" And variables(i) <> Chr(34) Then ' for the life of me I just can't get the regex to not match doublequotes
            Debug.Print variables(i)
        End If
    Next i
End Sub
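
The regexes above look for assignments, so a variable that is declared but never assigned is missed. As a complement, here is a minimal sketch that collects the names declared with Dim, including comma-separated lists. The function name declaredNames is hypothetical. It assumes the module has no comments (as stated in the question), no line continuations inside Dim statements, and no multi-dimensional array bounds that start with a variable. It uses late binding, so it needs no library references.

```vba
' Hypothetical helper (not part of the original answer):
' returns a Collection of the names declared with Dim in the given code.
Function declaredNames(ByVal code As String) As Collection
    Dim lineRe As Object, nameRe As Object
    Dim lineMatch As Object, nameMatch As Object
    Dim result As New Collection

    ' Finds each Dim statement and captures everything after the keyword
    Set lineRe = CreateObject("VBScript.RegExp")
    lineRe.Global = True
    lineRe.IgnoreCase = True
    lineRe.MultiLine = True
    lineRe.Pattern = "^[ \t]*Dim[ \t]+([^\r\n]+)"

    ' Finds the identifier at the start of the list and after each comma
    Set nameRe = CreateObject("VBScript.RegExp")
    nameRe.Global = True
    nameRe.Pattern = "(?:^|,)\s*([A-Za-z]\w*)"

    For Each lineMatch In lineRe.Execute(code)
        For Each nameMatch In nameRe.Execute(lineMatch.SubMatches(0))
            ' The lower-cased name is the key, so a duplicate raises an error
            ' that is deliberately ignored
            On Error Resume Next
            result.Add nameMatch.SubMatches(0), LCase$(nameMatch.SubMatches(0))
            On Error GoTo 0
        Next nameMatch
    Next lineMatch

    Set declaredNames = result
End Function
```

The names it returns could be merged with the output of printVars before generating replacement names. Variables used without any declaration are still only found through the assignment regex.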