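I'm trying to convert a semicolon-delimited file into a nested dictionary. I've been working on this all morning and suspect I'm overlooking something simple:
The real file is about 200 lines long; this is just a small sample.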
key;name;desc;category;type;action;range;duration;skill;strain_mod;apt_bonus
ambiencesense;Ambience Sense;This sleight provides the async with an instinctive sense about an area and any potential threats nearby. The async receives a +10 modifier to all Investigation, Perception, Scrounging, and Surprise Tests.;psi-chi;passive;automatic;self;constant;;0;
cogboost;Cognitive Boost;The async can temporarily elevate their cognitive performance.;psi-chi;active;quick;self;temp;;-1;{'COG': 5}
[['key',
'name',
'desc',
'category',
'type',
'action',
'range',
'duration',
'skill',
'strain_mod',
'apt_bonus'],
['ambiencesense',
'Ambience Sense',
'This sleight provides the async with an instinctive sense about an area and any potential threats nearby. The async receives a +10 modifier to all Investigation, Perception, Scrounging, and Surprise Tests.',
'psi-chi',
'passive',
'automatic',
'self',
'constant',
'',
'0',
''],
['cogboost',
'Cognitive Boost',
'The async can temporarily elevate their cognitive performance.',
'psi-chi',
'active',
'quick',
'self',
'temp',
'',
'-1',
"{'COG': 5}"]]
blahblah = {
    'ambiencesense': {
        'name': 'Ambience Sense',
        'desc': 'This sleight provides the async with an instinctive sense about an area and any potential threats nearby. The async receives a +10 modifier to all Investigation, Perception, Scrounging, and Surprise Tests.',
        'category': 'psi-chi',
        'type': 'passive',
        'action': 'automatic',
        'range': 'self',
        'duration': 'constant',
        'skill': '',
        'strain_mod': '0',
        'apt_bonus': '',
    },
    'cogboost': {
        'name': 'Cognitive Boost',
        'desc': 'The async can temporarily elevate their cognitive performance.',
        'category': 'psi-chi',
        'type': 'active',
        'action': 'quick',
        'range': 'self',
        'duration': 'temp',
        'skill': '',
        'strain_mod': '-1',
        'apt_bonus': {'COG': 5},
    },
    ...
#!/usr/bin/env python
# Usage: ./csvdict.py <filename to convert to dict> <file to output>
import csv
import sys
import pprint
def parse(filename):
    with open(filename, 'rb') as csvfile:
        dialect = csv.Sniffer().sniff(csvfile.read(), delimiters=';')
        csvfile.seek(0)
        reader = csv.reader(csvfile, dialect)
        dict_list = []
        for line in reader:
            dict_list.append(line)
    return dict_list
    new_dict = {}
    for item in dict_list:
        key = item.pop('key')
        new_dict[key] = item
output = parse(sys.argv[1])
with open(sys.argv[2], 'wt') as out:
    pprint.pprint(output, stream=out)
#!/usr/bin/env python
# Usage: ./csvdict.py <input filename> <output filename>
import sys
import pprint
data = {}
error = 'Incorrect number of arguments.\nUsage: ./csvdict.py <input filename> <output filename>'
if len(sys.argv) != 3:
    print(error)
else:
    # Read sys.argv only after the length check, so a missing argument
    # prints the usage message instead of raising IndexError.
    file_name = sys.argv[1]
    with open(file_name, 'r') as test_fh:
        # The first line holds the column names.
        header_line = next(test_fh)
        header_line = header_line.strip()
        headers = header_line.split(';')
        index_headers = {index:header for index, header in enumerate(headers)}
        for line in test_fh:
            line = line.strip()
            values = line.split(';')
            index_vals = {index:val for index, val in enumerate(values)}
            # Column 0 is the outer key; the remaining columns form the inner dict.
            data[index_vals[0]] = {index_headers[key]:value for key, value in index_vals.items() if key != 0}
    with open(sys.argv[2], 'wt') as out:
        pprint.pprint(data, stream=out)
The only thing this doesn't handle well is the embedded dictionary. Any ideas on how to clean this up? (See apt_bonus.)
'cogboost': {'action': 'quick',
'apt_bonus': "{'COG': 5}",
'category': 'psi-chi',
'desc': 'The async can temporarily elevate their cognitive performance.',
'duration': 'temp',
'name': 'Cognitive Boost',
'range': 'self',
'skill': '',
'strain_mod': '-1',
'type': 'active'},
Answer 0 (score: 2)
Here is another version. It is a bit abstract, but it has no dependencies.
file_name = "<path>/test.txt"
data = {}
with open(file_name, 'r') as test_fh:
    header_line = next(test_fh)
    header_line = header_line.strip()
    headers = header_line.split(';')
    index_headers = {index:header for index, header in enumerate(headers)}
    for line in test_fh:
        line = line.strip()
        values = line.split(';')
        index_vals = {index:val for index, val in enumerate(values)}
        data[index_vals[0]] = {index_headers[key]:value for key, value in index_vals.items() if key != 0}
print(data)
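For comparison, the same nested structure can probably be built with the standard library's csv.DictReader, which maps each row onto the header names and also copes with quoted fields that contain a semicolon. This is only a minimal sketch: the function name nested_from_semicolon, the key_field parameter, and the shortened inline sample are illustrative assumptions, not part of the original post.

```python
import csv
import io


def nested_from_semicolon(fh, key_field="key"):
    """Build {row[key_field]: {remaining columns}} from a ';'-delimited file object.

    Hypothetical helper for illustration; names are not from the original post.
    """
    reader = csv.DictReader(fh, delimiter=";")
    data = {}
    for row in reader:
        # Remove the key column from the row and use it as the outer key.
        key = row.pop(key_field)
        data[key] = dict(row)
    return data


# Shortened sample with the same shape as the data in the question.
sample = io.StringIO(
    "key;name;strain_mod;apt_bonus\n"
    "ambiencesense;Ambience Sense;0;\n"
    "cogboost;Cognitive Boost;-1;{'COG': 5}\n"
)
result = nested_from_semicolon(sample)
```

With a real file, the io.StringIO object would be replaced by something like open(file_name, newline=''). All values, including apt_bonus, are still plain strings at this point.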
Answer 1 (score: 1)
This is very easy with pandas:
In [7]: import pandas as pd
In [8]: pd.read_clipboard(sep=";", index_col=0).T.to_dict()
Out[8]:
{'ambiencesense': {'action': 'automatic',
'apt_bonus': nan,
'category': 'psi-chi',
'desc': 'This sleight provides the async with an instinctive sense about an area and any potential threats nearby. The async receives a +10 modifier to all Investigation, Perception, Scrounging, and Surprise Tests.',
'duration': 'constant',
'name': 'Ambience Sense',
'range': 'self',
'skill': nan,
'strain_mod': 0,
'type': 'passive'},
'cogboost': {'action': 'quick',
'apt_bonus': "{'COG': 5}",
'category': 'psi-chi',
'desc': 'The async can temporarily elevate their cognitive performance.',
'duration': 'temp',
'name': 'Cognitive Boost',
'range': 'self',
'skill': nan,
'strain_mod': -1,
'type': 'active'}}
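In your case you would use pd.read_csv() instead of .read_clipboard(), but it would look roughly the same. If you want to parse the apt_bonus column into dictionaries, you may also need to tweak things a bit.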
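One way to do that tweak, sketched here with the standard library only, is to run each apt_bonus string through ast.literal_eval, which evaluates Python literals without executing arbitrary code. The helper name parse_apt_bonus and the small rows dictionary are assumptions made for illustration. Empty fields are returned unchanged, which matches the desired output in the question. With pandas, a function like this could presumably be supplied through the converters argument of read_csv, but that wiring is not shown or tested here.

```python
import ast


def parse_apt_bonus(text):
    """Turn a string such as "{'COG': 5}" into a dict; leave empty fields as ''.

    Hypothetical helper for illustration; not part of the original answers.
    """
    text = text.strip()
    if not text:
        return text
    # literal_eval only accepts Python literals, so it is safer than eval().
    value = ast.literal_eval(text)
    if not isinstance(value, dict):
        raise ValueError("expected a dict literal, got %r" % (text,))
    return value


# Shortened stand-in for the nested dictionary produced by the code above.
rows = {
    "ambiencesense": {"strain_mod": "0", "apt_bonus": ""},
    "cogboost": {"strain_mod": "-1", "apt_bonus": "{'COG': 5}"},
}
for record in rows.values():
    record["apt_bonus"] = parse_apt_bonus(record["apt_bonus"])
```

A malformed field makes ast.literal_eval raise ValueError or SyntaxError, so a 200-line file with inconsistent entries may need a try/except around the call.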
Answer 2 (score: 1)
Try this Pythonic approach, which uses no libraries:
s = '''key;name;desc;category;type;action;range;duration;skill;strain_mod;apt_bonus
ambiencesense;Ambience Sense;This sleight provides the async with an instinctive sense about an area and any potential threats nearby. The async receives a +10 modifier to all Investigation, Perception, Scrounging, and Surprise Tests.;psi-chi;passive;automatic;self;constant;;0;
cogboost;Cognitive Boost;The async can temporarily elevate their cognitive performance.;psi-chi;active;quick;self;temp;;-1;{'COG': 5}'''
lists = [delim.split(';') for delim in s.split('\n')]
keyIndex = lists[0].index('key')
nested = {lst[keyIndex]:{lists[0][i]:lst[i] for i in range(len(lists[0])) if i != keyIndex} for lst in lists[1:]}
The result is:
{
'cogboost': {
'category': 'psi-chi',
'name': 'Cognitive Boost',
'strain_mod': '-1',
'duration': 'temp',
'range': 'self',
'apt_bonus': "{'COG': 5}",
'action': 'quick',
'skill': '',
'type': 'active',
'desc': 'The async can temporarily elevate their cognitive performance.'
},
'ambiencesense': {
'category': 'psi-chi',
'name': 'Ambience Sense',
'strain_mod': '0',
'duration': 'constant',
'range': 'self',
'apt_bonus': '',
'action': 'automatic',
'skill': '',
'type': 'passive',
'desc': 'This sleight provides the async with an instinctive sense about an area and any potential threats nearby. The async receives a +10 modifier to all Investigation, Perception, Scrounging, and Surprise Tests.'
}
}