How do I remove random newlines from a text file using Python?

Asked: 2018-04-24 14:49:27

Tags: python python-3.x

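The following code takes the contents of 'out.txt' and appends them to the lines of 'fixed_inv.txt', matched on a shared path, writing the result to a new file, 'concat.txt'.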

Sample lines from each file:

1)fixed_inv.txt

 70 abc.def.com HRShared$   \vol\cor_q_share1\HRShared  34  NULL    3   4479242 Share   1   1   CifsPerm: 0, CifsType: 0, Remark:Up-level share detected.   0   CIFS    NULL    ntap
 70 abc.def.com HRTraining$ \vol\cor_q_share1\HRTraining    35  NULL    4   4479243 Share   1   1   CifsPerm: 0, CifsType: 0, Remark:Up-level share detected.   0   CIFS    NULL    ntap
 70 abc.def.com psoft_prd$  \vol\cor_q_share1\psoft_prd 36  NULL    6   4479245 Share   1   1   CifsPerm: 0, CifsType: 0, Remark:Up-level share detected.   0   CIFS    NULL    ntap

2)out.txt

abcdef.ghi.com: \fs\FS11\cifs15\cifs-userhome-corp-prd_01: The inherited access control list (ACL) or access control entry (ACE) could not be built. (125), Access to file was denied (1)
abcdef.ghi.com: \fs\FS11\cifs17\cifs-userhome-corp-prd_03: The inherited access control list (ACL) or access control entry (ACE) could not be built. (45)
abcdef.ghi.com: \fs\FS11\cifs17\cifs-userhome-corp-prd_05: The inherited access control list (ACL) or access control entry (ACE) could not be built. (17)

3)concat.txt -> the target

In 'concat.txt', a number of lines (out of several thousand) appear to have a random newline in the middle of the line.

For example, a line should look like:

122 abc.def.com Failed to get CIFS shares with error code -2147024891.  None Non-supported share access type.   0   Unkonwn NULL    bluearc Different Security Type (1), Access is denied. (1354), Pruned. Different security type (21), The inherited access control list (ACL) or access control entry (ACE) could not be built. (3713), Could not convert the name of inner file or directory (27)

But instead, I get some that look like:

122 abc.def.com Failed to get CIFS shares with error code -2147024891. None 
Non-supported share access type.   0   Unkonwn NULL    bluearc Different Security Type (1), Access is denied. (1354), Pruned. Different security type (21), The inherited access control list (ACL) or access control entry (ACE) could not be built. (3713), Could not convert the name of inner file or directory (27)

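I tried to fix this in the code below, but for some reason the code runs without fixing the problem. The goal is either to pull the broken second half of a line back up onto the previous line, or to strip out the random newline.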

class Error:
    def __init__ (self, path, message): #self = new instance of class
        self.path = path
        self.message = message #error message
        self.matched = False #has the path from out.txt been matched to the path of fixed_inv.txt?

def open_files(file1, file2, file3):
    try:
        f1 = open(file1, 'r')
    except IOError: 
        print("Can't open {}".format(file1))
        return None, None, None #you can't just open one file you have to open all
    else:
        try:
            f2 = open(file2, 'r')
        except IOError: 
            print("Can't open {}".format(file2))
            f1.close()
            return None, None, None
        else:
            try:
                f3 = open(file3, 'w')
            except IOError: 
                print("Can't open {}".format(file3))
                f1.close()
                f2.close()
                return None, None, None
            else:
                return f1, f2, f3

def concat(file1, file2, file3):
    errors = {} #key: path, value: instance of class Error
    f1, f2, f3 = open_files(file1, file2, file3)
    prevLine = "" #NEW
    if f1 is not None: #if file one is able to open...
        with f1:
            for line_num, line in enumerate(f1): #get the line number and line
                line = line.replace("\\", "/") #account for the differences in backslashes
                tokens = line.strip().split(': ') #strip white spaces, split based on ':'
                if len(tokens) != 3: #if there aren't exactly three tokens...
                    print('Error on line {} in file {}: Expected three tokens, but found {}'.format(line_num + 1, file1, len(tokens))) #error
                else: #NEW
                    if line.startswith('Non-supported'): #NEW
                        Prevline = line
                        Prevline = line.strip('\n') #NEW
                    else:
                        errors[tokens[1]] = Error(tokens[1], tokens[2]) 
        with f2: 
            with f3:
                for line_num, line in enumerate(f2):
                    line = line.replace("\\", "/").strip() #account for the differences in backslashes
                    tokens_2 = line.strip().split('\t') #strip white spaces, split based on tab
                    if len(tokens_2) < 4: #if we are unable to obtain the path by now since the path should be on 3rd or 4th index
                        print('Error on line {} in file {}: Expected >= 4 tokens, but found {}'.format(line_num + 1, file2, len(tokens_2)))
                        f3.write('{}\n'.format(line))
                    else: #if we have enough tokens to find the path...
                        if tokens_2[3] in errors: #if path is found in our errors dictionary from out.txt...
                            line.strip('\n')
                            path = tokens_2[3] #set path to path found
                            msg = errors[path].message #set the class instance of the value to msg                    
                            errors[path].matched = True #paths have been matched
                            f3.write('{}\t{}\n'.format(line, msg)) #write the line and the error message to concat
                        else: #if path is NOT found in our errors dictionary from out.txt...
                            f3.write('{}\t{}\n'.format(line, 'None'))  
                            print('Error on line {} in file {}: Path {} not matched'.format(line_num + 1, file2, tokens_2[3])) #found in fixed_inv.txt,
                            #but not out.txt

                """for e in errors: #go through errors
                    if errors[e].matched is False: #if no paths have been matched
                        print('Path {} from {} not matched in {}'.format(errors[e].path, file1, file2)) #found in out.txt, but not in fixed_inv
                        f3.write('{}\t{}\n'.format(line, 'No error present'))
                """

def main():

    file1 = 'out.txt'
    file2 = 'fixed_inv.txt'
    file3 = 'test_concat.txt'

    concat(file1, file2, file3)

if __name__ == '__main__':
    main()
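
One possible way to repair the split lines is to post-process the output and glue any "continuation" line back onto the record before it. The sketch below is only an illustration under stated assumptions: it assumes every genuine record starts with a numeric ID followed by whitespace (as in the samples, e.g. "70 abc.def.com" or "122 abc.def.com"), and that a broken half should be rejoined with a single space. The names `merge_broken_lines`, `RECORD_START` and the output file `clean_concat.txt` are hypothetical and not part of the original code.

```python
import re

# Assumption: a real record begins with a numeric ID and then whitespace.
# Any line that does not match is treated as the tail of the previous record.
RECORD_START = re.compile(r'^\s*\d+\s')


def merge_broken_lines(lines, joiner=' '):
    """Yield complete records, joining continuation lines onto the previous one."""
    current = None
    for raw in lines:
        line = raw.rstrip('\r\n')          # drop only the line terminator
        if not line.strip():
            continue                        # ignore empty lines
        if current is None or RECORD_START.match(line):
            if current is not None:
                yield current               # previous record is complete
            current = line
        else:
            # Continuation line: glue it to the record being built.
            current = current.rstrip() + joiner + line.lstrip()
    if current is not None:
        yield current                       # flush the last record


# Possible usage (file names are placeholders):
# with open('test_concat.txt') as src, open('clean_concat.txt', 'w') as dst:
#     for record in merge_broken_lines(src):
#         dst.write(record + '\n')
```

This heuristic would misclassify a broken half that itself happens to start with a number, so the pattern may need tightening for real data. Note also that `str.strip` returns a new string, so a bare `line.strip('\n')` whose result is not assigned has no effect.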

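I'm using Windows 7; any ideas or suggestions would be greatly appreciated! I asked this before, but none of the answers was a working solution. Thank you.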
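
Since the files live on Windows, one thing that may be worth checking (this is an assumption, the question does not confirm it) is whether real records end in CRLF while the stray break is a lone `\n` or `\r` embedded in a field. By default, Python's text mode treats `\n`, `\r` and `\r\n` all as line endings when reading, which would split such a record in two. Passing `newline='\r\n'` to `open()` makes only CRLF terminate a line. The helper name `read_crlf_records` below is hypothetical.

```python
def read_crlf_records(path):
    """Return records that end in CRLF; lone CR/LF inside a record become spaces."""
    records = []
    # newline='\r\n': only CRLF ends a line, and the ending is left untranslated,
    # so a lone '\n' or '\r' stays inside the line instead of splitting it.
    with open(path, 'r', newline='\r\n') as f:
        for line in f:
            if line.endswith('\r\n'):
                line = line[:-2]            # remove the real record terminator
            # Replace any embedded lone line breaks with a space.
            line = line.replace('\r', ' ').replace('\n', ' ')
            records.append(line)
    return records
```

If the input turns out to use the same terminator for real and stray breaks, this approach will not help and a content-based heuristic is needed instead.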

0 answers:

No answers