How can I use Python to convert a CSV file nested by columns into a nested dictionary?

Time: 2019-10-05 09:08:12

Tags: python

I have a Google Sheet of categories.

[Google Sheet with nested categories][1]

[1]: https://i.stack.imgur.com/3OAi5.png

I exported it to a CSV file, and the result is:

Substructure,,,
,Foundations,,
,,Standard Foundations,
,,,Wall Foundations   
,,,Column Foundations   
,,,Standard Foundation Supplementary Components   
,,Special Foundations,
,,,Driven Piles   
,,,Bored Piles   
,,,Caissons   
,,,Special Foundation Walls   
,,,Foundation Anchors   
,,,Underpinning   
,,,Raft Foundations   
,,,Pile Caps   
,,,Grade Beams

Using Python, I want to convert this CSV file into a nested dictionary in the following format:

categories = [
    {
      id: 0,
      title: 'parent'
    }, {
      id: 1,
      title: 'parent',
      subs: [
        {
          id: 10,
          title: 'child'
        }, {
          id: 11,
          title: 'child'
        }, {
          id: 12,
          title: 'child'
        }
      ]
    }, {
      id: 2,
      title: 'parent'
    },
    // more data here
];

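So, to be clear, every CSV row should be added to a dictionary like this: {id: x, title: y}, and if it has children it should look like this: {id: x, title: y, subs: [comma-separated child dictionaries]}.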
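The format above is written in a JavaScript-like notation. As a rough sketch of the same shape in Python, the keys become quoted strings; the ids and titles below are just the placeholder values from the question.

```python
# Python rendering of the target shape; the ids and titles are the
# placeholder values from the question, not real data.
categories = [
    {'id': 0, 'title': 'parent'},
    {
        'id': 1,
        'title': 'parent',
        'subs': [
            {'id': 10, 'title': 'child'},
            {'id': 11, 'title': 'child'},
            {'id': 12, 'title': 'child'},
        ],
    },
    {'id': 2, 'title': 'parent'},
]

# Only entries that have children carry a 'subs' key.
print([c['id'] for c in categories if 'subs' in c])
```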

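I have spent about a day and a half on this, going through similar questions, but they are all too different for me to adapt at my current skill level. I feel terrible about it and would really appreciate your help. If possible, I would also like to use the solution in other cases with different levels of children. This example has three levels of children; some have only two or one.

Thank you very much for your help.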

2 answers:

Answer 0: (score: 0)

Recursion!

import csv
from pprint import pprint

filename = 'myfile.csv'
# The csv module expects files to be opened with newline=''.
with open(filename, newline='') as f:
    matrix = list(csv.reader(f))

current_id = -1


def next_id():
    global current_id
    current_id += 1
    return current_id


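def group(column, rows):
    # Last column: every remaining row is a leaf with no children.
    if column == len(matrix[0]) - 1:
        return [
            {'id': next_id(), 'title': row[column].strip()}
            for row in rows
        ]

    result = []
    item = None
    sub = None
    for row in rows:
        title = row[column]
        if title:
            # A new title at this level closes the previous item.
            if item:
                item['subs'] = group(column + 1, sub)
            item = {'id': next_id(), 'title': title.strip()}
            result.append(item)
            sub = []
        else:
            # An empty cell means the row belongs to a deeper level.
            sub.append(row)
    # Close the last item; the guard avoids a TypeError when rows is empty,
    # which happens for branches shallower than the deepest column.
    if item:
        item['subs'] = group(column + 1, sub)
    return result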


pprint(group(0, matrix))

Output:

[{'id': 0,
  'subs': [{'id': 1,
            'subs': [{'id': 2,
                      'subs': [{'id': 3, 'title': 'Wall Foundations'},
                               {'id': 4, 'title': 'Column Foundations'},
                               {'id': 5,
                                'title': 'Standard Foundation Supplementary Components'}],
                      'title': 'Standard Foundations'},
                     {'id': 6,
                      'subs': [{'id': 7, 'title': 'Driven Piles'},
                               {'id': 8, 'title': 'Bored Piles'},
                               {'id': 9, 'title': 'Caissons'},
                               {'id': 10,
                                'title': 'Special Foundation Walls'},
                               {'id': 11, 'title': 'Foundation Anchors'},
                               {'id': 12, 'title': 'Underpinning'},
                               {'id': 13, 'title': 'Raft Foundations'},
                               {'id': 14, 'title': 'Pile Caps'},
                               {'id': 15, 'title': 'Grade Beams'}],
                      'title': 'Special Foundations'}],
            'title': 'Foundations'}],
  'title': 'Substructure'}]
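For sheets where the branches have different depths, a non-recursive variant may be easier to reuse. The following is a minimal sketch, not the code from this answer: it assumes that the index of the first non-empty cell gives a row's nesting level, and that no row skips a level. The name `build_tree` and the shortened `sample` data are made up for illustration. Unlike the recursive version, it adds a 'subs' key only to nodes that actually have children.

```python
import csv
import io
from itertools import count


def build_tree(rows):
    """Nest rows by the index of their first non-empty cell."""
    ids = count()
    roots = []
    path = []  # path[d] is the most recent node seen at depth d
    for row in rows:
        # The first non-empty cell marks the depth; blank rows are skipped.
        depth = next((i for i, cell in enumerate(row) if cell.strip()), None)
        if depth is None:
            continue
        if depth > len(path):
            raise ValueError(f"row skips a nesting level: {row!r}")
        node = {'id': next(ids), 'title': row[depth].strip()}
        if depth == 0:
            roots.append(node)
        else:
            # Attach to the latest node one level up, creating 'subs' lazily.
            path[depth - 1].setdefault('subs', []).append(node)
        # Forget deeper nodes from the previous branch.
        path[depth:] = [node]
    return roots


# Shortened sample in the same layout as the exported sheet (illustrative).
sample = """\
Substructure,,,
,Foundations,,
,,Standard Foundations,
,,,Wall Foundations
,,Special Foundations,
,,,Driven Piles
"""

tree = build_tree(csv.reader(io.StringIO(sample)))
```

Because ids are handed out in row order, they follow the same numbering as the output above. To read from a file instead, pass `csv.reader(f)` for a file opened with `newline=''`.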

Answer 1: (score: 0)

I believe the syntax you are looking for is as follows:

import csv

with open('file.csv', mode='r') as infile:
    reader = csv.reader(infile)
    with open('file_new.csv', mode='w') as outfile:
        writer = csv.writer(outfile)
        # Builds a flat mapping from the first column to the second.
        mydict = {rows[0]: rows[1] for rows in reader}

Alternatively, for Python versions before 2.7, which have no dict comprehensions, you need:

mydict = dict((rows[0],rows[1]) for rows in reader)