Unable to strip and store the contents of certain files in CSV format

Time: 2017-02-17 02:50:14

Tags: python python-2.7 os.walk

I have a set of files that each contain lines of the form key: value. They are located at:

~/ansible-environments/aws/random_name_1/inventory/group_vars/all
~/ansible-environments/aws/random_name_2/inventory/group_vars/all
~/ansible-environments/aws/random_name_3/inventory/group_vars/all

I only need the LMIT, isv_alias and products values from each file, written out as one CSV row per file. I wrote:

import os

rootdir = '/home/USER/ansible-environments/aws'
for root, subdirs, files in os.walk(rootdir):
    for subdir in subdirs:
        all_path = os.path.join(rootdir, subdir, "inventory", "group_vars", "all")
        if not os.path.isfile(all_path):
            continue
        try:
            with open(all_path, "r") as f:
                all_content = f.readlines()
        except (OSError, IOError):
            continue  # ignore errors
        csv_line = [""] * 3
        for line in all_content:
            # Match on the key prefix, then keep everything after it
            if line.startswith("isv_alias:"):
                csv_line[0] = line[len("isv_alias:"):].strip()
            elif line.startswith("LMIT:"):
                csv_line[1] = line[len("LMIT:"):].strip()
            elif line.startswith("products:"):
                csv_line[2] = line[len("products:"):].strip()
        if all(value != "" for value in csv_line):
            with open("/home/nsingh/nishlist.csv", "a") as csv:
                csv.write(",".join(csv_line))
                csv.write("\n")

2 answers:

Answer 0: (score: 1)

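There are three problems here:

  1. Finding all the key-value files
  2. Extracting the keys and values from each file
  3. Turning the keys and values from each file into rows of a CSV file

First, use os.listdir() to find the contents of ~/ansible-environments/aws, then use os.path.join() to build the expected path of the inventory/group_vars directory inside each entry, and see which of those paths actually exist. Then list the contents of the directories that do exist, and assume that every file inside them (such as all) is a key-value file. The example code at the end of this answer assumes that all files can be found this way; if they cannot, you may have to adapt the example code to use os.walk() or some other method of finding the files.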
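If the files cannot all be found with os.listdir(), the os.walk() variant might look roughly like the sketch below. The function name find_kv_files and the throwaway directory tree are made up for illustration; only the inventory/group_vars layout comes from the question.

```python
import os
import tempfile


def find_kv_files(outer_dir, suffix=os.path.join("inventory", "group_vars")):
    """Return sorted paths of files in any directory whose path ends in suffix."""
    found = []
    for dirpath, dirnames, filenames in os.walk(outer_dir):
        # Only look at directories whose path ends with inventory/group_vars
        if not dirpath.endswith(os.sep + suffix):
            continue
        for name in filenames:
            found.append(os.path.join(dirpath, name))
    return sorted(found)


# Build a small throwaway tree to try it on (hypothetical names)
demo_root = tempfile.mkdtemp()
for env in ("random_name_1", "random_name_2"):
    target = os.path.join(demo_root, env, "inventory", "group_vars")
    os.makedirs(target)
    with open(os.path.join(target, "all"), "w") as fp:
        fp.write("alias: demo\n")
# A directory without inventory/group_vars is simply skipped
os.makedirs(os.path.join(demo_root, "random_name_3"))

kv_files = find_kv_files(demo_root)
```

Unlike the os.listdir() approach, this also picks up inventory/group_vars directories nested more than one level below the starting directory, which may or may not be what you want.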

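Each key-value file is a series of lines, where each line is a key and a value separated by a colon (":"). Your approach of searching for a substring (the in operator) will fail if, for example, some other key contains the string "LMIT". Instead, split the line at the colon. The expression line.split(":", 1) splits the line at the first colon but not at any later ones, which matters if the value itself contains a colon. Then strip the excess whitespace from the key and the value, and build a dictionary of keys and values.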
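To make the difference concrete, here is a small sketch comparing the two approaches. The helper name parse_kv_line and the sample lines are invented for the demonstration and are not taken from your files.

```python
def parse_kv_line(line):
    """Split 'key: value' at the first colon; return None if there is no colon."""
    parts = [part.strip() for part in line.split(":", 1)]
    if len(parts) < 2:
        return None
    return parts[0], parts[1]


# Made-up sample lines; the second value itself contains colons
sample_lines = [
    "LMIT: 42\n",
    "url: http://example.com:8080/path\n",
    "note: raise LMIT later\n",
    "no colon here\n",
]

# Substring search matches the real key and also the unrelated note line
substring_hits = [line for line in sample_lines if "LMIT" in line]

# Splitting at the first colon keeps keys and values apart
parsed = dict(
    pair for pair in (parse_kv_line(line) for line in sample_lines) if pair
)
```

Here substring_hits ends up with two lines, while parsed["LMIT"] is just "42" and the URL value keeps its own colons intact.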

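Now choose which keys you want to keep. Once you have parsed a file, look up the values associated with those keys in that file's dictionary and build a list from them. Then add this file's list of values to a list of lists holding the values from all files, and use csv.writer to write the list of lists out as a CSV file.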
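One reason to prefer csv.writer over joining the values with commas by hand is that it quotes values that themselves contain a comma. The sketch below shows this on invented data; it writes to an in-memory buffer and assumes Python 3 (under Python 2.7 you would write to a file opened in "wb" mode instead). The dictionaries in parsed_files are hypothetical stand-ins for the parsed files.

```python
import csv
import io

wanted_keys = ["alias", "LMIT", "products"]

# Hypothetical parsed dictionaries, one per file
parsed_files = [
    {"alias": "acme", "LMIT": "10", "products": "db, web", "region": "us"},
    {"alias": "globex", "products": "cache"},
]

# One row per file; keys missing from a file become empty strings
rows = [[items.get(key, "") for key in wanted_keys] for items in parsed_files]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(wanted_keys)
writer.writerows(rows)
csv_text = buffer.getvalue()
```

The value "db, web" comes out wrapped in double quotes, so it still reads back as a single column.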

It might look something like this:

    #!/usr/bin/env python2
    from __future__ import with_statement, print_function, division
    import os
    import csv
    
    def read_kv_file(filename):
        items = {}
        with open(filename, "rU") as infp:
            for line in infp:
                # Split at a colon and strip leading and trailing space
                line = [x.strip() for x in line.split(":", 1)]
    
                # Add the key and value to the dictionary
                if len(line) > 1:
                    items[line[0]] = line[1]
        return items
    
    # First find all random names
    outer_dir = os.path.expanduser("~/ansible-environments/aws")
    random_names = os.listdir(outer_dir)
    inner_dirs = [
        os.path.join(outer_dir, name, "inventory/group_vars")
        for name in random_names
    ]
    
    # Now filter it to those directories that actually exist
    inner_dirs = [name for name in inner_dirs if os.path.isdir(name)]
    
    wanted_keys = ["alias", "LMIT", "products"]
    out_columns = ["alias", "LMIT", "product"]
    
    # Collect key-value pairs from all files in these folders
    rows = []
    for dirname in inner_dirs:
        for filename in os.listdir(dirname):
            path = os.path.join(dirname, filename)
    
            # Skip non-files in this directory
            if not os.path.isfile(path):
                continue
    
            # If the file has a non-blank value for any of the keys of
            # interest, add a row
            items = read_kv_file(path)
            this_file_values = [items.get(key) for key in wanted_keys]
            if any(this_file_values):
                rows.append(this_file_values)
    
    # And write them out
    with open("out.csv", "wb") as outfp:
        writer = csv.writer(outfp, "excel")
        writer.writerow(out_columns)
        writer.writerows(rows)
    

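Answer 1: (score: 0)

You did not specify how you obtain the file (the f in the first line), but assuming you have the file traversal sorted out and the files are exactly as you presented them (so no extra whitespace or anything like that), you can modify your code to: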

csv_line = [""] * 3
for line in f:
    if line[:6] == "alias:":
        csv_line[0] = line[7:].strip()
    elif line[:5] == "LMIT:":
        csv_line[1] = line[6:].strip()
    elif line[:9] == "products:":
        csv_line[2] = line[10:].strip()
with open(rootdir + '/' + 'list.csv', "a") as csv:
    csv.write(",".join(csv_line))
    csv.write("\n")

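This adds a new line with the corresponding variables to the CSV for each file loaded as f, but keep in mind that it does not check the validity of the data, so it will happily write an empty line if the opened file does not contain the right data.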

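You can prevent that by checking all(value != "" for value in csv_line) before opening the CSV file for writing. If you want to write entries that have at least one variable filled in, use any instead of all.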
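As a quick illustration of the difference between the two checks, consider the sketch below. The helper should_write and the three sample rows are hypothetical and exist only for this demonstration.

```python
def should_write(csv_line, require_all=True):
    """Decide whether a collected row is worth writing to the CSV."""
    check = all if require_all else any
    return check(value != "" for value in csv_line)


# Invented rows: fully filled, partly filled, and completely empty
complete = ["acme", "10", "db"]
partial = ["acme", "", ""]
empty = ["", "", ""]

write_complete = should_write(complete)
write_partial_strict = should_write(partial)
write_partial_loose = should_write(partial, require_all=False)
write_empty_loose = should_write(empty, require_all=False)
```

With all, only the complete row passes; with any, the partial row passes too, and the empty row is rejected either way.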

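Update: The code you just pasted has serious indentation and structural problems, but it at least makes it clearer what you are trying to do. Assuming everything else is fine, this should do it: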

for root, subdirs, files in os.walk(rootdir):
    for subdir in subdirs:
        all_path = os.path.join(rootdir, subdir, "inventory", "group_vars", "all")
        if not os.path.isfile(all_path):
            continue
        try:
            with open(all_path, "r") as f:
                all_content = f.readlines()
        except (OSError, IOError):
            continue  # ignore errors
        csv_line = [""] * 3
        for line in all_content:
            if line[:6] == "alias:":
                csv_line[0] = line[7:].strip()
            elif line[:5] == "LMIT:":
                csv_line[1] = line[6:].strip()
            elif line[:9] == "products:":
                csv_line[2] = line[10:].strip()
        if all(value != "" for value in csv_line):
            with open(os.path.join(rootdir, "list.csv"), "a") as csv:
                csv.write(",".join(csv_line))
                csv.write("\n")