I have a CSV file with a header like this:
cpus/0/compatible clocks/HSE/compatible ../frequency memories/flash/compatible ../address ../size [and so on...]
I am able to parse this header into a nested dictionary that looks like this:
{'clocks': {'HSE': {'compatible': '[1]',
                    'frequency': '[2]'}},
 'cpus': {'0': {'compatible': '[0]'}},
 'memories': {'bkpsram': {'address': '[13]',
                          'compatible': '[12]',
                          'size': '[14]'},
              'ccm': {'address': '[7]',
                      'compatible': '[6]',
                      'size': '[8]'},
              'flash': {'address': '[4]',
                        'compatible': '[3]',
                        'size': '[5]'},
              'sram': {'address': '[10]',
                       'compatible': '[9]',
                       'size': '[11]'}},
 'pin-controller': {'GPIOA': {'enabled': '[16]'},
                    'GPIOB': {'enabled': '[17]'},
                    'GPIOC': {'enabled': '[18]'},
                    'GPIOD': {'enabled': '[19]'},
                    'GPIOE': {'enabled': '[20]'},
                    'GPIOF': {'enabled': '[21]'},
                    'GPIOG': {'enabled': '[22]'},
                    'GPIOH': {'enabled': '[23]'},
                    'GPIOI': {'enabled': '[24]'},
                    'GPIOJ': {'enabled': '[25]'},
                    'GPIOK': {'enabled': '[26]'},
                    'compatible': '[15]'}}
(It is a dict object, printed with pprint().)
The values that look like '[<number>]' are the column indices in the CSV file from which the data should be loaded.
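As I mostly work in C/C++, I would really like to have pointers/references in Python: I would only need to put a pointer to a list element in each value, and for each row I could just modify the list contents. However, I don't think there is an easy way to get that behaviour in Python.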
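So now I am planning to dump this dictionary to a string and apply the following three modifications in sequence: replace { with {{, replace } with }}, and replace '[<number>]' with {<number>}. After that I will be able to "load" the data with something like ast.literal_eval(dictAsStr.format(*rowFromCsv)), but converting the whole dictionary to a string and then back to a dictionary seems like a waste of time...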
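A minimal sketch of this string-template idea, assuming a two-column template and an illustrative row (the names `template`, `as_str` and `loaded` are hypothetical). Note that in this form the cells are evaluated as Python literals rather than kept as strings, and that an empty cell would leave a gap such as `'size': ,` that ast.literal_eval() rejects with a SyntaxError.

```python
import ast
import re

# Parsed header for two columns; '[n]' marks the column to load.
template = {'clocks': {'HSE': {'compatible': '[0]', 'frequency': '[1]'}}}

# 1. Dump to a string and escape the braces for str.format().
as_str = repr(template).replace('{', '{{').replace('}', '}}')
# 2. Turn every quoted '[n]' marker into a positional {n} placeholder.
as_str = re.sub(r"'\[(\d+)\]'", r'{\1}', as_str)

# 3. Substitute one CSV row and evaluate the result back into a dict.
row = ['["ST,STM32-HSE", "fixed-clock"]', '0']
loaded = ast.literal_eval(as_str.format(*row))
print(loaded)
```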
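Am I missing some other obvious solution here? The format of the CSV and the way I load the header are not set in stone and I can easily change them, but I would really like a solution that does not boil down to "recursively visit each key and load the appropriate value from the current row manually".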
From the CSV file I load each row as a list of strings, for example:
['["ARM,Cortex-M4", "ARM,ARMv7-M"]',
'["ST,STM32-HSE", "fixed-clock"]',
'0',
'["on-chip-flash"]',
'0x8000000',
'131072',
'',
'',
'',
'["on-chip-ram"]',
'0x20000000',
'65536',
'',
'',
'',
'["ST,STM32-GPIOv2-pin-controller"]',
'False',
'False',
'False',
'',
'',
'',
'',
'False',
'',
'',
'']
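Now I would like to insert the values from each loaded row (a list of strings) into the corresponding keys of the nested dictionary, so for the example above I would like to get: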
{'clocks': {'HSE': {'compatible': '["ST,STM32-HSE", "fixed-clock"]',
                    'frequency': '0'}},
 'cpus': {'0': {'compatible': '["ARM,Cortex-M4", "ARM,ARMv7-M"]'}},
 'memories': {'bkpsram': {'address': '',
                          'compatible': '',
                          'size': ''},
              'ccm': {'address': '',
                      'compatible': '',
                      'size': ''},
              'flash': {'address': '0x8000000',
                        'compatible': '["on-chip-flash"]',
                        'size': '131072'},
              'sram': {'address': '0x20000000',
                       'compatible': '["on-chip-ram"]',
                       'size': '65536'}},
 'pin-controller': {'GPIOA': {'enabled': 'False'},
                    'GPIOB': {'enabled': 'False'},
                    'GPIOC': {'enabled': 'False'},
                    'GPIOD': {'enabled': ''},
                    'GPIOE': {'enabled': ''},
                    'GPIOF': {'enabled': ''},
                    'GPIOG': {'enabled': ''},
                    'GPIOH': {'enabled': 'False'},
                    'GPIOI': {'enabled': ''},
                    'GPIOJ': {'enabled': ''},
                    'GPIOK': {'enabled': ''},
                    'compatible': '["ST,STM32-GPIOv2-pin-controller"]'}}
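For completeness, here are a few lines from the CSV file I am loading. The first column is not part of the dictionary shown above, because it is used for indexing.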
chip,cpus/0/compatible,clocks/HSE/compatible,../frequency,memories/flash/compatible,../address,../size,memories/ccm/compatible,../address,../size,memories/sram/compatible,../address,../size,memories/bkpsram/compatible,../address,../size,pin-controller/compatible,pin-controller/GPIOA/enabled,pin-controller/GPIOB/enabled,pin-controller/GPIOC/enabled,pin-controller/GPIOD/enabled,pin-controller/GPIOE/enabled,pin-controller/GPIOF/enabled,pin-controller/GPIOG/enabled,pin-controller/GPIOH/enabled,pin-controller/GPIOI/enabled,pin-controller/GPIOJ/enabled,pin-controller/GPIOK/enabled
STM32F401CB,"[""ARM,Cortex-M4"", ""ARM,ARMv7-M""]","[""ST,STM32-HSE"", ""fixed-clock""]",0,"[""on-chip-flash""]",0x8000000,131072,,,,"[""on-chip-ram""]",0x20000000,65536,,,,"[""ST,STM32-GPIOv2-pin-controller""]",False,False,False,,,,,False,,,
STM32F401CC,"[""ARM,Cortex-M4"", ""ARM,ARMv7-M""]","[""ST,STM32-HSE"", ""fixed-clock""]",0,"[""on-chip-flash""]",0x8000000,262144,,,,"[""on-chip-ram""]",0x20000000,65536,,,,"[""ST,STM32-GPIOv2-pin-controller""]",False,False,False,,,,,False,,,
STM32F401CD,"[""ARM,Cortex-M4"", ""ARM,ARMv7-M""]","[""ST,STM32-HSE"", ""fixed-clock""]",0,"[""on-chip-flash""]",0x8000000,393216,,,,"[""on-chip-ram""]",0x20000000,98304,,,,"[""ST,STM32-GPIOv2-pin-controller""]",False,False,False,,,,,False,,,
The code used to parse the header:
import csv

with open("some-path-to-CSV-file") as csvFile:
    csvReader = csv.reader(csvFile)
    header = next(csvReader)

previousKeyElements = header[1].split('/')
dictionary = {}
for index, key in enumerate(header[1:]):
    keyElements = key.split('/')
    # count the leading '..' elements of a relative key
    i = 0
    while keyElements[i] == '..':
        i += 1
    # replace them with the matching prefix of the previous key
    keyElements[0:i] = previousKeyElements[0:-i]
    previousKeyElements = keyElements
    # walk (and create) the nested dicts down to the leaf
    node = dictionary
    for keyElement in keyElements[:-1]:
        node = node.setdefault(keyElement, {})
    node[keyElements[-1]] = '[{}]'.format(index)
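To make the handling of the relative '..' keys concrete, here is a small self-contained sketch that only expands them into full paths, without building the dictionary. The helper name `resolve_header` and the shortened header are assumptions made for illustration; the "chip" index column is assumed to be stripped already.

```python
def resolve_header(header):
    """Expand '..'-relative column names into full key paths (hypothetical helper)."""
    previous = []
    paths = []
    for key in header:
        elements = key.split('/')
        # Count the leading '..' components of this key.
        ups = 0
        while elements[ups] == '..':
            ups += 1
        if ups:
            # Each '..' drops one trailing component of the previous path.
            elements[:ups] = previous[:-ups]
        previous = elements
        paths.append(tuple(elements))
    return paths


header = ['clocks/HSE/compatible', '../frequency',
          'memories/flash/compatible', '../address', '../size']
for index, path in enumerate(resolve_header(header)):
    print(index, '/'.join(path))
```

Each relative key is resolved against the already expanded previous key, which is why a run such as '../address' followed by '../size' stays inside memories/flash.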
Answer 0 (score: 1)
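How about using the actual row indexes (as integers) as the values in your "parsed" header, i.e.:
{'clocks': {'HSE': {'compatible': 1,
                    'frequency': 2}},
 # etc
Then use recursion on a copy of the parsed header to populate it from the row values: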
import csv
import sys
import copy
import pprint

def parse_header(header):
    previousKeyElements = header[1].split('/')
    dictionary = {}
    for index, key in enumerate(header[1:]):
        keyElements = key.split('/')
        i = 0
        while keyElements[i] == '..':
            i += 1
        keyElements[0:i] = previousKeyElements[0:-i]
        previousKeyElements = keyElements
        node = dictionary
        for keyElement in keyElements[:-1]:
            node = node.setdefault(keyElement, {})
        node[keyElements[-1]] = index
    return dictionary

def _rparse(d, k, v, row):
    if isinstance(v, dict):
        for subk, subv in v.items():
            _rparse(v, subk, subv, row)
    elif isinstance(v, int):
        d[k] = row[v]
    else:
        raise ValueError("'v' should be either a dict or an int (got : %s(%s))" % (type(v), v))

def parse_row(header, row):
    struct = copy.deepcopy(header)
    for k, v in struct.items():
        _rparse(struct, k, v, row)
    return struct

def main(*args):
    path = args[0]
    with open(path) as f:
        reader = csv.reader(f)
        header = parse_header(next(reader))
        results = [parse_row(header, row[1:]) for row in reader]
    pprint.pprint(results)

if __name__ == "__main__":
    main(*sys.argv[1:])
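The same recursive fill can also be written so that it builds a fresh dict instead of deep-copying the template and mutating the copy. This is only a sketch of a variant, not the code above; the function name `fill` and the three-column sample data are assumptions.

```python
def fill(template, row):
    """Return a new nested dict with every integer leaf replaced by row[leaf]."""
    if isinstance(template, dict):
        return {key: fill(value, row) for key, value in template.items()}
    if isinstance(template, int):
        return row[template]
    raise ValueError("expected a dict or an int, got %r" % (template,))


# Parsed header with integer column indexes as leaves.
template = {'clocks': {'HSE': {'compatible': 1, 'frequency': 2}},
            'cpus': {'0': {'compatible': 0}}}
# One CSV row with the "chip" column already removed.
row = ['["ARM,Cortex-M4", "ARM,ARMv7-M"]',
       '["ST,STM32-HSE", "fixed-clock"]',
       '0']

result = fill(template, row)
print(result['clocks']['HSE']['frequency'])
```

Because `fill` never modifies `template`, the parsed header can be reused for every row without copy.deepcopy().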
Another solution (which might actually be faster) would be to build a reverse mapping with the row indexes as keys and the dictionary "paths" as values:
{0: ("cpus", "0", "compatible"),
1: ("clocks", "HSE", "compatible"),
2: ("clocks", "HSE", "frequency"),
# etc
}
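Then:
def parse_row(template, index_map, row):
    # 'template' is your parsed header dict,
    # 'index_map' is the reverse mapping {index: path}
    struct = copy.deepcopy(template)
    for index, path in index_map.items():
        # restart from the root of the structure for every path
        target = struct
        for key in path[:-1]:
            target = target[key]
        target[path[-1]] = row[index]
    return struct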
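The reverse mapping itself is not built anywhere above. One possible way to derive it from the parsed header is sketched here, assuming the header's leaves are integer indexes; `build_index_map` is a hypothetical helper name, and `parse_row` is repeated only so that the block runs on its own.

```python
import copy


def build_index_map(template, prefix=()):
    """Flatten a nested {key: ... index} dict into {index: (key, key, ...)}."""
    index_map = {}
    for key, value in template.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            index_map.update(build_index_map(value, path))
        else:
            index_map[value] = path
    return index_map


def parse_row(template, index_map, row):
    struct = copy.deepcopy(template)
    for index, path in index_map.items():
        target = struct  # restart from the root for every path
        for key in path[:-1]:
            target = target[key]
        target[path[-1]] = row[index]
    return struct


template = {'cpus': {'0': {'compatible': 0}},
            'clocks': {'HSE': {'compatible': 1, 'frequency': 2}}}
index_map = build_index_map(template)
row = ['["ARM,Cortex-M4", "ARM,ARMv7-M"]',
       '["ST,STM32-HSE", "fixed-clock"]',
       '0']
print(parse_row(template, index_map, row))
```

The map only needs to be built once per file; each row then costs one deep copy plus one short walk per column.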
Oh, and as an added bonus, you may want to use ast.literal_eval() to convert your values to proper Python types:
>>> import ast
>>> ast.literal_eval("False")
False
>>> ast.literal_eval('["on-chip-flash"]')
['on-chip-flash']
>>> ast.literal_eval('0x8000000')
134217728
>>> ast.literal_eval('["ARM,Cortex-M4", "ARM,ARMv7-M"]')
['ARM,Cortex-M4', 'ARM,ARMv7-M']
>>> ast.literal_eval("this should fail")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/ast.py", line 49, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "/usr/lib/python2.7/ast.py", line 37, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)
  File "<unknown>", line 1
    this should fail
              ^
SyntaxError: invalid syntax
>>> def to_python(value):
...     try:
...         return ast.literal_eval(value)
...     except Exception as e:
...         return value
...
>>> to_python('["on-chip-flash"]')
['on-chip-flash']
>>> to_python('wtf')
'wtf'
>>>
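Applied to a whole row, the conversion could look like the sketch below. The sample cells are taken from the question's data; catching only ValueError and SyntaxError (rather than Exception) is a choice made for this sketch, on the assumption that those are the errors ast.literal_eval() raises for ordinary CSV cells.

```python
import ast


def to_python(value):
    """Convert a CSV cell to a Python object, or return it unchanged."""
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        # Not a Python literal (e.g. an empty cell or free text).
        return value


row = ['["on-chip-flash"]', '0x8000000', '131072', '', 'False']
converted = [to_python(cell) for cell in row]
print(converted)
```

Empty cells stay as empty strings, so the caller can still tell "no value" apart from values such as 0 or False.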