How do I remove random newlines from a text file using Python?

Time: 2018-02-12 18:59:41

Tags: python parsing

I currently have many variations of the following in a text file:

2   /vol/vol0/home  /vol/vol0/home  CifsPerm: 0, CifsType: 0, Remark: 
Up-level share detected.    0   NFS /vol/vol0/home  ntap

But I need it on a single line, like this:

2   /vol/vol0/home  /vol/vol0/home  CifsPerm: 0, CifsType: 0, Remark: Up-level share detected.  0   NFS /vol/vol0/home  ntap

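For some reason, every unexpected new line starts with "Up-level". How can I do this? I am very new to Python and would appreciate any help with solving this.

My code (which for some reason puts everything on a single line):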

def bad_line_2(line): 
    if "Up-level" in line:
        return True
    else:
        return False

def inventory_append(in_file_2, out_fixed_inv): 
    try:
        in_fp_2 = open(in_file_2, "r")

    except IOError as e:
        print("error opening {} for reading: {}".format(in_file_2, str(e)))

    else:
        try:
            out_inv = open(out_fixed_inv, "w")

        except IOError as e:
            print("error opening {} for writing: {}".format(out_fixed_inv, str(e)))
        else:
            with open(in_file_2) as f:
                out = "".join(line.rstrip('\n') for line in f)
                out_inv.write(out)

def main():
    in_file_2 = "Inventory 2017-12-21.txt"
    out_fixed_inv = "fixed_inv.txt" 
    inventory_append(in_file_2, out_fixed_inv)

if __name__ == "__main__":
    main()

Sample of the text file:

2   usfptotnap101a  C$  \vol\vol0   232 8   0   4474560 Share   3   0   CifsPerm: 0, CifsType: 2147483648, Remark: Remote Administration    0   CIFS & NFS  /vol/vol0   ntap
2   usfptotnap101a  ETC$    \vol\vol0\etc   508 1   0   4474561 Share   1   1   CifsPerm: 0, CifsType: 2147483648, Remark: Remote Administration
Up-level share detected.    0   CIFS    NULL    ntap
2   usfptotnap101a  Varonis$    \vol\it_tot101a_181099\Varonis  7159534 44  4   4474551 Share   1   1   CifsPerm: 0, CifsType: 0, Remark: 
Up-level share detected.    0   CIFS    NULL    ntap
2   usfptotnap101a  smtest  \vol\smtest\smtree  7715986 1   0   4474559 Share   1   0   CifsPerm: 0, CifsType: 0, Remark:   0   CIFS    NULL    ntap

1 answer:

Answer 0: (score: 1)

Since you said you have variations of the given example, this might help you:

results = []

with open(YOUR_FILE, 'r') as f:
    lines = f.readlines()
    results = ['%s%s\n' % (i.strip(), j.strip()) for i, j in zip(lines[::2], lines[1::2])]

with open(OUTPUT_FILE, 'w') as f:
    f.writelines(results)

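So what does this do? It pairs line 1 with line 2, line 3 with line 4, and so on, then joins each pair into one line. This is a fine solution if your file is not very large. If it is, there are other solutions: this one is a poor fit for large files because readlines reads every line into memory at once.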
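If memory is a concern, the same pairing can be done lazily. The following is only a minimal sketch and not part of the original answer: `join_line_pairs` is a hypothetical helper name, and it keeps the same assumption that every record is split across exactly two lines. Passing the same file iterator to `zip` twice makes each step consume two consecutive lines, so only one pair is held in memory at a time.

```python
import io


def join_line_pairs(in_handle, out_handle):
    """Join line 1 with line 2, line 3 with line 4, and so on."""
    # zip() pulls from the same iterator twice per step, so each
    # iteration yields two consecutive lines without loading the file.
    for first, second in zip(in_handle, in_handle):
        out_handle.write('%s%s\n' % (first.strip(), second.strip()))


# Small in-memory demonstration; real code would pass open file objects.
source = io.StringIO("a\nb\nc\nd\n")
target = io.StringIO()
join_line_pairs(source, target)
joined = target.getvalue()
```

As with the slicing version, an unpaired trailing line is silently dropped when the file has an odd number of lines.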

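Edit 1.

The code above turns out not to fit, because the lines are not split in the pattern I assumed.

Using your file and functions as an example, here is what an updated version would look like: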

def read_lines(file_handle):
    """
    :param file file_handle:
    :return: List of parsed lines
    :rtype: list[str]
    """
    results = [] # correctly parsed lines
    buffer = ""
    for line in file_handle:
        if not line.startswith('Up-level'):
            if buffer:
                # you have loaded previous line in buffer
                # and current line doesn't start with 'Up-level'
                # this means it's just another regular line and
                # you have to 'flush' the buffer in results
                results.append(buffer + '\n')
            # whether buffer was filled and you just flushed it
            # or it was empty, you load new line in it.
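            # strip only the line ending, so a trailing space such as
            # the one after 'Remark: ' survives the join
            buffer = line.rstrip('\r\n')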
        else:
            # when line starts with 'Up-level' you add it to
            # previously parsed one which should be in buffer
            # when next line is parsed this ends up in results
            buffer += line.strip()
    # after the loop the last record is still sitting in the buffer,
    # so flush it to results as well
    if buffer:
        results.append(buffer + '\n')

    return results

def inventory_append(in_file_2, out_fixed_inv):
    # read and repair the input first, so the output file is only
    # created once the input has been read successfully
    try:
        with open(in_file_2, "r") as in_fp_2:
            lines = read_lines(in_fp_2)
    except IOError as e:
        print("error opening {} for reading: {}".format(in_file_2, str(e)))
        return

    try:
        # the with block closes the file, which also flushes the output
        with open(out_fixed_inv, "w") as out_inv:
            out_inv.writelines(lines)
    except IOError as e:
        print("error opening {} for writing: {}".format(out_fixed_inv, str(e)))
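For very large files, the same buffering idea can be written as a generator so that results are written out as they are completed instead of being collected in a list. This is a sketch under assumptions, not the original answer's code: `merge_continuations` and its `marker` parameter are hypothetical names, and the sample rows are shortened stand-ins for the real inventory lines.

```python
import io


def merge_continuations(lines, marker='Up-level'):
    """Yield complete records, gluing marker lines onto the previous line."""
    buffer = None
    for line in lines:
        # Drop only the line ending so spaces inside the record survive.
        line = line.rstrip('\r\n')
        if line.startswith(marker) and buffer is not None:
            # Continuation of the record that is currently buffered.
            buffer += line
        else:
            # A new record starts: emit the previous one, if any.
            if buffer is not None:
                yield buffer + '\n'
            buffer = line
    # The final record is still buffered when the input ends.
    if buffer is not None:
        yield buffer + '\n'


# Shortened stand-in for the inventory file, held in memory for the demo.
sample = io.StringIO(
    "2\tETC$\tRemark: \n"
    "Up-level share detected.\t0\tCIFS\n"
    "2\tsmtest\tRemark: \t0\tCIFS\n"
)
fixed = list(merge_continuations(sample))
```

With real files, the generator can be passed straight to `writelines` on the output file object, so only one record is buffered at a time. Unlike `read_lines`, this variant passes blank lines through unchanged rather than dropping them.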