Loading values from a list into nested dicts

Asked: 2018-01-11 10:53:56

Tags: python csv dictionary nested mapping

I have a CSV file with a header like this:

cpus/0/compatible   clocks/HSE/compatible   ../frequency    memories/flash/compatible   ../address  ../size [and so on...]

I am able to parse that header into a nested dictionary, like this:

{'clocks': {'HSE': {'compatible': '[1]',
                    'frequency': '[2]'}},
 'cpus': {'0': {'compatible': '[0]'}},
 'memories': {'bkpsram': {'address': '[13]',
                          'compatible': '[12]',
                          'size': '[14]'},
              'ccm': {'address': '[7]',
                      'compatible': '[6]',
                      'size': '[8]'},
              'flash': {'address': '[4]',
                        'compatible': '[3]',
                        'size': '[5]'},
              'sram': {'address': '[10]',
                       'compatible': '[9]',
                       'size': '[11]'}},
 'pin-controller': {'GPIOA': {'enabled': '[16]'},
                    'GPIOB': {'enabled': '[17]'},
                    'GPIOC': {'enabled': '[18]'},
                    'GPIOD': {'enabled': '[19]'},
                    'GPIOE': {'enabled': '[20]'},
                    'GPIOF': {'enabled': '[21]'},
                    'GPIOG': {'enabled': '[22]'},
                    'GPIOH': {'enabled': '[23]'},
                    'GPIOI': {'enabled': '[24]'},
                    'GPIOJ': {'enabled': '[25]'},
                    'GPIOK': {'enabled': '[26]'},
                    'compatible': '[15]'}}

(This is a dict object, printed with pprint().)

The values that look like '[<number>]' give the index of the CSV column from which the data should be loaded.

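Since I mostly work in C/C++, I would really like to have pointers/references in Python here: I would just put a pointer to a list element in each value, and for every row I could simply modify the list contents. As far as I know, though, there is no easy way to get that behavior in Python.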

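So for now I am planning to dump this dictionary to a string and apply the following 3 modifications in sequence:

  • replace { with {{
  • replace } with }}
  • replace '[<number>]' with {<number>}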

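After that I would be able to "load" the data with something like ast.literal_eval(dictAsStr.format(*rowFromCsv)), but converting the whole dictionary to a string and then back to a dictionary seems like a waste of time...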
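For reference, the string round trip described above might look roughly like the sketch below. The shortened template and row are made up for illustration, and keeping the quotes around each format field is an assumption (it keeps every loaded value a string, as in the desired output).

```python
import ast
import re

# Hypothetical, shortened template and row for illustration only.
template = {'clocks': {'HSE': {'compatible': '[1]', 'frequency': '[2]'}},
            'cpus': {'0': {'compatible': '[0]'}}}
row = ['["ARM,Cortex-M4", "ARM,ARMv7-M"]', '["ST,STM32-HSE", "fixed-clock"]', '0']

# Steps 1 and 2: double the literal braces so str.format() leaves them alone.
dict_as_str = repr(template).replace('{', '{{').replace('}', '}}')
# Step 3: turn '[<number>]' into a format field. The surrounding quotes are
# kept (an assumption) so that every loaded value stays a string.
dict_as_str = re.sub(r"'\[(\d+)\]'", r"'{\1}'", dict_as_str)

loaded = ast.literal_eval(dict_as_str.format(*row))
```

Besides the extra conversion work, this version breaks as soon as a cell contains a single quote, because the quote would end the string literal early.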

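Am I missing some other obvious solution here? The format of the CSV and the way I load the header are not fixed and I can easily change them, but I would really like a solution that does not boil down to "recursively visit each key and manually load the appropriate value from the current row".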

From the CSV file I load each row as a list of strings, for example:

['["ARM,Cortex-M4", "ARM,ARMv7-M"]',
 '["ST,STM32-HSE", "fixed-clock"]',
 '0',
 '["on-chip-flash"]',
 '0x8000000',
 '131072',
 '',
 '',
 '',
 '["on-chip-ram"]',
 '0x20000000',
 '65536',
 '',
 '',
 '',
 '["ST,STM32-GPIOv2-pin-controller"]',
 'False',
 'False',
 'False',
 '',
 '',
 '',
 '',
 'False',
 '',
 '',
 '']

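Now I would like to insert the values from each loaded row (a list of strings) into the corresponding keys of the nested dictionary, so for the example above I would like to get: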

{'clocks': {'HSE': {'compatible': '["ST,STM32-HSE", "fixed-clock"]',
                    'frequency': '0'}},
 'cpus': {'0': {'compatible': '["ARM,Cortex-M4", "ARM,ARMv7-M"]'}},
 'memories': {'bkpsram': {'address': '',
                          'compatible': '',
                          'size': ''},
              'ccm': {'address': '',
                      'compatible': '',
                      'size': ''},
              'flash': {'address': '0x8000000',
                        'compatible': '["on-chip-flash"]',
                        'size': '131072'},
              'sram': {'address': '0x20000000',
                       'compatible': '["on-chip-ram"]',
                       'size': '65536'}},
 'pin-controller': {'GPIOA': {'enabled': 'False'},
                    'GPIOB': {'enabled': 'False'},
                    'GPIOC': {'enabled': 'False'},
                    'GPIOD': {'enabled': ''},
                    'GPIOE': {'enabled': ''},
                    'GPIOF': {'enabled': ''},
                    'GPIOG': {'enabled': ''},
                    'GPIOH': {'enabled': 'False'},
                    'GPIOI': {'enabled': ''},
                    'GPIOJ': {'enabled': ''},
                    'GPIOK': {'enabled': ''},
                    'compatible': '["ST,STM32-GPIOv2-pin-controller"]'}}

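For completeness, here are a few lines of the CSV file that I want to load. The first column is not part of the dictionaries shown above, because it is used for indexing.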

chip,cpus/0/compatible,clocks/HSE/compatible,../frequency,memories/flash/compatible,../address,../size,memories/ccm/compatible,../address,../size,memories/sram/compatible,../address,../size,memories/bkpsram/compatible,../address,../size,pin-controller/compatible,pin-controller/GPIOA/enabled,pin-controller/GPIOB/enabled,pin-controller/GPIOC/enabled,pin-controller/GPIOD/enabled,pin-controller/GPIOE/enabled,pin-controller/GPIOF/enabled,pin-controller/GPIOG/enabled,pin-controller/GPIOH/enabled,pin-controller/GPIOI/enabled,pin-controller/GPIOJ/enabled,pin-controller/GPIOK/enabled
STM32F401CB,"[""ARM,Cortex-M4"", ""ARM,ARMv7-M""]","[""ST,STM32-HSE"", ""fixed-clock""]",0,"[""on-chip-flash""]",0x8000000,131072,,,,"[""on-chip-ram""]",0x20000000,65536,,,,"[""ST,STM32-GPIOv2-pin-controller""]",False,False,False,,,,,False,,,
STM32F401CC,"[""ARM,Cortex-M4"", ""ARM,ARMv7-M""]","[""ST,STM32-HSE"", ""fixed-clock""]",0,"[""on-chip-flash""]",0x8000000,262144,,,,"[""on-chip-ram""]",0x20000000,65536,,,,"[""ST,STM32-GPIOv2-pin-controller""]",False,False,False,,,,,False,,,
STM32F401CD,"[""ARM,Cortex-M4"", ""ARM,ARMv7-M""]","[""ST,STM32-HSE"", ""fixed-clock""]",0,"[""on-chip-flash""]",0x8000000,393216,,,,"[""on-chip-ram""]",0x20000000,98304,,,,"[""ST,STM32-GPIOv2-pin-controller""]",False,False,False,,,,,False,,,

Code used to parse the header:

import csv

with open("some-path-to-CSV-file") as csvFile:
    csvReader = csv.reader(csvFile)
    header = next(csvReader)
    previousKeyElements = header[1].split('/')
    dictionary = {}
    for index, key in enumerate(header[1:]):
        keyElements = key.split('/')
        i = 0
        while keyElements[i] == '..':
            i += 1
        keyElements[0:i] = previousKeyElements[0:-i]
        previousKeyElements = keyElements
        node = dictionary
        for keyElement in keyElements[:-1]:
            node = node.setdefault(keyElement, {})
        node[keyElements[-1]] = '[{}]'.format(index)

1 answer:

Answer 0 (score: 1)

How about using the actual index into the row (as an integer) as the value in the "parsed" header, i.e.:

{'clocks': {'HSE': {'compatible': 1,
                'frequency': 2}},
# etc

Then use recursion on a copy of the parsed header to populate it from the row's values:

import csv
import sys
import copy
import pprint

def parse_header(header):
    previousKeyElements = header[1].split('/')
    dictionary = {}
    for index, key in enumerate(header[1:]):
        keyElements = key.split('/')
        i = 0
        while keyElements[i] == '..':
            i += 1
        keyElements[0:i] = previousKeyElements[0:-i]
        previousKeyElements = keyElements
        node = dictionary
        for keyElement in keyElements[:-1]:
            node = node.setdefault(keyElement, {})
        node[keyElements[-1]] = index
    return dictionary

def _rparse(d, k, v, row):
    if isinstance(v, dict):
        for subk, subv in v.items():
            _rparse(v, subk, subv, row)
    elif isinstance(v, int):
        d[k] = row[v]
    else:
        raise ValueError("'v' should be either a dict or an int (got : %s(%s))" % (type(v), v))


def parse_row(header, row):
    struct = copy.deepcopy(header)
    for k, v in struct.items():
        _rparse(struct, k, v, row)
    return struct

def main(*args):
    path = args[0]
    with open(path) as f:
        reader = csv.reader(f)
        header = parse_header(next(reader))
        results = [parse_row(header, row[1:]) for row in reader]

    pprint.pprint(results)


if __name__ == "__main__":
    main(*sys.argv[1:])
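A possible variation on the recursive approach above, as a minimal sketch only: instead of deep-copying the parsed header and then overwriting its leaves, build a fresh dict for each row. The name `fill_template` is hypothetical, and the code assumes every leaf of the parsed header is an integer index into the row.

```python
def fill_template(template, row):
    """Return a new nested dict shaped like 'template', with each integer
    leaf replaced by the row value at that index. 'template' is not modified."""
    return {
        key: fill_template(value, row) if isinstance(value, dict) else row[value]
        for key, value in template.items()
    }

# Hypothetical, shortened parsed header and row for illustration only.
header = {'clocks': {'HSE': {'compatible': 1, 'frequency': 2}},
          'cpus': {'0': {'compatible': 0}}}
row = ['["ARM,Cortex-M4"]', '["ST,STM32-HSE"]', '0']
filled = fill_template(header, row)
```

This still visits every key recursively, but it does so in a single pass and leaves the parsed header untouched, so no `copy.deepcopy()` is needed.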

Another solution (which might actually be faster) would be to build a reverse mapping, with the row indices as keys and the dictionary "paths" as values:

{0: ("cpus", "0", "compatible"),
 1: ("clocks", "HSE", "compatible"),
 2: ("clocks", "HSE", "frequency"),
 # etc
}
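One way such a mapping could be derived from the parsed header is sketched below. `build_path_map` is a hypothetical helper, and it assumes the parsed header stores integer indices as its leaf values.

```python
def build_path_map(template, prefix=()):
    """Walk the parsed header and map each integer leaf (a row index)
    to the tuple of keys that leads to it."""
    path_map = {}
    for key, value in template.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            # Descend into the nested dict, extending the current path.
            path_map.update(build_path_map(value, path))
        else:
            path_map[value] = path
    return path_map

# Hypothetical, shortened parsed header for illustration only.
header = {'clocks': {'HSE': {'compatible': 1, 'frequency': 2}},
          'cpus': {'0': {'compatible': 0}}}
path_map = build_path_map(header)
```

The mapping only needs to be built once, right after the header has been parsed.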

Then:

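def parse_row(template, path_map, row):
    # 'template' is your parsed header dict,
    # 'path_map' is the reverse mapping {index: path}
    # (renamed from 'map' to avoid shadowing the builtin)
    struct = copy.deepcopy(template)
    for index, path in path_map.items():
        # restart from the root of the copy for every path
        target = struct
        for key in path[:-1]:
            target = target[key]
        # the last path element is the key that receives the value
        target[path[-1]] = row[index]
    return struct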

Oh, and as an added bonus, you may want to use ast.literal_eval() to convert your values to the proper Python types:

>>> import ast
>>> ast.literal_eval("False")
False
>>> ast.literal_eval('["on-chip-flash"]')
['on-chip-flash']
>>> ast.literal_eval('0x8000000')
134217728
>>> ast.literal_eval('["ARM,Cortex-M4", "ARM,ARMv7-M"]')
['ARM,Cortex-M4', 'ARM,ARMv7-M']
>>> ast.literal_eval("this should fail")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/ast.py", line 49, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "/usr/lib/python2.7/ast.py", line 37, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)
  File "<unknown>", line 1
    this should fail
              ^
SyntaxError: invalid syntax


>>> def to_python(value):
...     try:
...         return ast.literal_eval(value)
...     except Exception:
...         return value
... 
>>> to_python('["on-chip-flash"]')
['on-chip-flash']
>>> to_python('wtf')
'wtf'
>>>
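
As a rough sketch of how this conversion could be applied to a whole row before filling the nested dict: `convert_row` is a hypothetical helper, and catching only `ValueError`, `SyntaxError` and `TypeError` is an assumption about which errors are worth ignoring for cells like the ones in this CSV.

```python
import ast

def to_python(value):
    """Convert a CSV cell to a Python literal when possible,
    otherwise return the original string unchanged."""
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError, TypeError):
        return value

def convert_row(row):
    """Apply to_python() to every cell of a row (a list of strings)."""
    return [to_python(cell) for cell in row]

# Hypothetical row mixing the kinds of cells found in the CSV above.
converted = convert_row(
    ['0x8000000', '131072', '', 'False', '["on-chip-ram"]', 'STM32F401CB'])
```

Empty cells and plain words are not valid literals, so they come back as the original strings.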