Reading a .txt file and exporting selected data to .csv

Time: 2016-08-09 15:48:32

Tags: python

I'm looking for help. I have multipath output from a CentOS server in a .txt file, and it looks like this.

asm (393040300403de) dm-12 HITACHI
size=35G queue_if_no_path
  |- 1:0:0:18  sda  65:48   active ready running
  `- 3:0:0:18  sdbc 70:368  active ready running
3600300300a4c dm-120 HITACHI
size=50G queue_if_no_path
  |- 1:0:0:98  sdc 70:48   active ready running
  `- 3:0:0:98  sdca 131:368 active ready running

When exported to a .csv file, it should look like this.

DISKS_NAME  LUN             LUNID DM-NAME SIZE  MULTPATH
asm       393040300403de    03de  dm-12    35G  sda  sdbc
No_device  3600300300a4c    0a4c  dm-120   50G  sdc  sdca

This is what I have so far, but it just reads each line and starts a new column every time it finds a space.

import csv

readfile = 'multipath.txt'
writefile = 'data.csv'
with open(readfile,'r') as a, open(writefile, 'w') as b:
    o=csv.writer(b)
    for line in a:
        o.writerow(line.split())

1 answer:

Answer 0 (score: 0)

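Assuming you only have the two types of entries shown in the sample above, you can classify each line by the number of elements that line.split() produces. For example: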

disk_name = ""
...  # other parameters you need to keep track of across lines. I'd suggest creating a class for each lun/disk_name.

for line in a:  # 'a' is the open input file from the question's code
    line_data = line.split()

    if len(line_data) == 4:
        # this will match 'asm (393040300403de) dm-12 HITACHI'
        disk_name, lun, dm_name, _ = line_data
        lun = lun.strip("()")  # drop the parentheses around the LUN
        # process these variables accordingly (instantiate a new class member)
        continue  # to read the next line

    if len(line_data) == 3:
        # this will match '3600300300a4c dm-120 HITACHI'
        lun, dm_name, _ = line_data
        disk_name = "No_device"
        # process these variables accordingly
        continue

    if len(line_data) == 2:
        # this will match 'size=35G queue_if_no_path'
        size, _ = line_data  # size is still 'size=35G' at this point
        # process the size accordingly, associate with the disk_name from earlier
        continue

    if len(line_data) == 7:
        # this will match '|- 1:0:0:18  sda  65:48   active ready running' etc.
        _, _, path, _, _, _, _ = line_data
        # process the path accordingly, associate with the disk_name from earlier
        continue

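Of course, if you need to check that a line contains the right kind of data, and not just the right number of items, regular expressions will be more flexible. But this should get you started.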
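As a rough illustration of the regular-expression approach, here is a minimal sketch. The patterns and the classify helper are hypothetical; they are written only against the sample output in the question, so real multipath output may need different patterns.

```python
import re

# Hypothetical patterns, derived only from the sample lines in the question.
# Header: either 'alias (wwid) dm-N VENDOR' or 'wwid dm-N VENDOR'.
HEADER_RE = re.compile(
    r"^(?:(?P<alias>\S+)\s+\((?P<wwid>[0-9a-fA-F]+)\)|(?P<bare_wwid>[0-9a-fA-F]+))"
    r"\s+(?P<dm>dm-\d+)\s+\S+"
)
# Size line: 'size=35G queue_if_no_path'.
SIZE_RE = re.compile(r"^size=(?P<size>\S+)")
# Path line: '|- 1:0:0:18  sda  65:48   active ready running'.
PATH_RE = re.compile(r"\d+:\d+:\d+:\d+\s+(?P<dev>sd[a-z]+)\s+\d+:\d+")


def classify(line):
    """Return (kind, fields) for one line, or (None, {}) if it is not recognized."""
    m = HEADER_RE.match(line)
    if m:
        return "header", {
            "disk_name": m.group("alias") or "No_device",
            "lun": m.group("wwid") or m.group("bare_wwid"),
            "dm_name": m.group("dm"),
        }
    m = SIZE_RE.match(line)
    if m:
        return "size", {"size": m.group("size")}
    m = PATH_RE.search(line)
    if m:
        return "path", {"path": m.group("dev")}
    return None, {}
```

Unlike counting fields, this rejects a line that happens to have the right number of items but the wrong shape, at the cost of patterns that have to be kept in step with the real output format.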

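By processing the lines in this order, you will always pick up a new disk_name/lun first, and then assign the "data" lines that follow to that disk. When you hit a new disk, the lines after it are associated with the new disk, and so on.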
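Putting the pieces together, the following is one possible sketch of the whole flow, not a definitive implementation. The Lun class and the parse_multipath and write_csv helpers are hypothetical names. The column headers are copied from the question, and LUNID is assumed to be the last four characters of the LUN, which is what both sample rows show.

```python
import csv
from dataclasses import dataclass, field


@dataclass
class Lun:
    """One multipath device; filled in as its lines are read."""
    disk_name: str
    lun: str
    dm_name: str
    size: str = ""
    paths: list = field(default_factory=list)


def parse_multipath(lines):
    """Group header, size and path lines into Lun records, in input order."""
    luns = []
    current = None  # the device that the following 'data' lines belong to
    for line in lines:
        parts = line.split()
        if len(parts) == 4:  # 'asm (393040300403de) dm-12 HITACHI'
            current = Lun(parts[0], parts[1].strip("()"), parts[2])
            luns.append(current)
        elif len(parts) == 3:  # '3600300300a4c dm-120 HITACHI'
            current = Lun("No_device", parts[0], parts[1])
            luns.append(current)
        elif current is None:
            continue  # ignore blank or stray lines before the first header
        elif len(parts) == 2 and parts[0].startswith("size="):
            current.size = parts[0][len("size="):]
        elif len(parts) == 7:  # '|- 1:0:0:18  sda  65:48   active ready running'
            current.paths.append(parts[2])
    return luns


def write_csv(luns, out):
    """Write one row per device; 'out' is any writable text file object."""
    writer = csv.writer(out)
    # Header names copied from the question.
    writer.writerow(["DISKS_NAME", "LUN", "LUNID", "DM-NAME", "SIZE", "MULTPATH"])
    for d in luns:
        # Assumption: LUNID is the last four characters of the LUN.
        writer.writerow([d.disk_name, d.lun, d.lun[-4:], d.dm_name, d.size, " ".join(d.paths)])
```

With the file names from the question, this could be used as follows: open 'multipath.txt' for reading and 'data.csv' for writing with newline='' (as the csv module documentation recommends), then call write_csv(parse_multipath(a), b). Here the paths are joined into a single column; writing each path to its own column would work equally well if that suits the spreadsheet better.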