How do I group the contents of text files by header?

Asked: 2015-07-13 12:02:11

Tags: python

I have two text files with the following contents:

Text file 1:

Imports
KERNEL32.DLL
    0x40d1f0 LoadLibraryA
    0x40d1f4 GetProcAddress
    0x40d1f8 ExitProcess
ADVAPI32.dll
    0x40d200 RegOpenKeyA
Exports

Text file 2:

Imports
KERNEL32.DLL
    0x419128 LoadLibraryA
    0x41912c GetProcAddress
    0x419130 ExitProcess
advapi32.dll
    0x419138 RegCloseKey
oleaut32.dll
    0x419140 SysFreeString
Exports

I want a single text file with the combined output, grouped by DLL name. For example, this is the output file I want:

 KERNEL32.DLL
        0x40d1f0 LoadLibraryA
        0x40d1f4 GetProcAddress
        0x40d1f8 ExitProcess
 ADVAPI32.dll
        0x40d200 RegOpenKeyA
        0x419138 RegCloseKey
 oleaut32.dll
        0x419140 SysFreeString

I wrote a script that takes a filename and keeps appending its output to the text file final.txt. Which Python features can I use to group the values by their header?

#start
import sys
value = sys.argv[1]
print(value)                # value is the filename
with open(value) as inputd: # Parse file
    for line in inputd:
        if line.strip() == 'Imports':   # skip everything up to the Imports header
            break
    for line in inputd:
        if line.strip() == 'Exports':   # stop at the Exports header
            break
        if "none" not in line:
            print(line.rstrip())  # print the line
            with open("final.txt", "a") as outputd:
                outputd.write(line) # write output to file
#end

My current output is:

 KERNEL32.DLL
        0x40d1f0 LoadLibraryA
        0x40d1f4 GetProcAddress
        0x40d1f8 ExitProcess
 ADVAPI32.dll
        0x40d200 RegOpenKeyA
 KERNEL32.DLL
        0x419128 LoadLibraryA
        0x41912c GetProcAddress
        0x419130 ExitProcess
 advapi32.dll
        0x419138 RegCloseKey
 oleaut32.dll
        0x419140 SysFreeString

1 answer:

Answer 0 (score: 0)

Not a perfect approach, but I implemented a workaround that should do the job :)

import sys
value = sys.argv[1]
print(value)

existing_dlls = []
# final.txt must already exist (create an empty file first); otherwise open() raises an error
with open("final.txt", "r") as outputd:
    for line in outputd:
        if line.lower().find(".dll") > -1:
            existing_dlls.append(line.strip().lower())

dontwrite = False  # True while inside a DLL section that final.txt already contains

# Open the output file once instead of reopening it for every write
with open("final.txt", "a") as outputd:
    with open(value) as inputd:  # Parse file
        for line in inputd:      # single pass; the checks below are mutually exclusive
            if line.strip() == 'Imports':
                continue
            elif line.strip() == 'Exports':
                continue
            elif "none" not in line:
                if line.strip().lower().endswith(".dll"):
                    if line.strip().lower() in existing_dlls:
                        dontwrite = True
                    else:
                        dontwrite = False
                if not dontwrite:
                    outputd.write(line)
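
The workaround above skips a DLL section entirely once its name is already in final.txt, so an entry such as RegCloseKey under advapi32.dll would not be merged into the existing ADVAPI32.dll group. A possible alternative is to read all input files first, group the entries in a dictionary, and write final.txt once at the end. The following is only a minimal sketch under some assumptions that are not stated in the question: DLL names are compared case-insensitively, the first spelling seen is used for display, and an entry whose function name was already seen for that DLL is dropped (which is what the desired output for KERNEL32.DLL suggests). The helper names `parse_imports`, `merge_imports`, `format_groups` and `write_merged` are hypothetical.

```python
from collections import OrderedDict


def parse_imports(lines):
    """Yield (dll_name, entry) pairs found between 'Imports' and 'Exports'.

    A DLL header line is yielded as (dll_name, None).
    """
    in_imports = False
    dll = None
    for raw in lines:
        line = raw.strip()
        if line == 'Imports':
            in_imports = True
        elif line == 'Exports':
            in_imports = False
        elif in_imports and line and 'none' not in line:  # same filter as the question
            if line.lower().endswith('.dll'):
                dll = line
                yield dll, None
            elif dll is not None:
                yield dll, line


def merge_imports(sources):
    """Group entries by DLL name (case-insensitive), keeping first-seen order."""
    # lowercased DLL name -> (display name, OrderedDict of function name -> entry)
    groups = OrderedDict()
    for lines in sources:
        for dll, entry in parse_imports(lines):
            _, entries = groups.setdefault(dll.lower(), (dll, OrderedDict()))
            if entry is not None:
                func = entry.split()[-1]        # assumed layout: "<address> <function>"
                entries.setdefault(func, entry)  # keep the first entry per function
    return groups


def format_groups(groups):
    """Render the groups in the indented layout shown in the question."""
    out = []
    for display, entries in groups.values():
        out.append(' ' + display)
        for entry in entries.values():
            out.append('        ' + entry)
    return '\n'.join(out) + '\n'


def write_merged(paths, out_path='final.txt'):
    """Read every input file, merge the import tables, and write the result once."""
    sources = []
    for path in paths:
        with open(path) as inputd:
            sources.append(inputd.readlines())
    with open(out_path, 'w') as outputd:
        outputd.write(format_groups(merge_imports(sources)))
```

Calling `write_merged(sys.argv[1:])` with both file names would then rewrite final.txt in one go instead of appending per file. Because the output is built from a dictionary rather than by re-reading final.txt, the output file does not need to exist beforehand.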