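Situation: AHLTA is an electronic medical record system that can export its GUI templates as text. I'm building a template editor and need to import those text files. Each line represents one GUI element and begins with a number identifying its parent tab in the GUI. The order of the lines doesn't matter. I'm using Python 3.
Example (theFile):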
1,550,57,730,77,0,32770," |||||||0|0||0|0|||0|||0|0|1|0|0|0|||","F=TimesNewRoman|C=8421504|T=T","Last updated: 2017-05-18"
0,743,4,823,48,0,16384," |||||||0|0||0|0|||0|||0|0|0|0|0|0|||","F=Arial|O=5|B=T","TSWF Navigator:<formLinkInfo><version>1.1</version><templateName>TSWF-Navigator</templateName><templateId>2238487</templateId><templateOwnerName>Department of Defense</templateOwnerName><templateOwnerNcid>33962</templateOwnerNcid></formLinkInfo>"
0,828,4,907,24,0,16384," |||||||0|0||0|0|||0|||0|0|0|0|0|0|||","O=5","CORE:<formLinkInfo><version>1.1</version><templateName>TSWF-CORE</templateName><templateId>1995726</templateId><templateOwnerName>Department of Defense</templateOwnerName><templateOwnerNcid>33962</templateOwnerNcid></formLinkInfo>"
2,25,791,370,811,297285,8961," | || ||||19|80|YCN|0|0|Y|N|0|||0|0|5|0|0|0|||","F=Arial|T=T","Responds to affection~ (by 4 months)"
2,25,871,370,891,297287,8961," | || ||||19|80|YCN|0|0|Y|N|0|||0|0|5|0|0|0|||","F=Arial|T=T","Indicates pleasure and displeasure~ (by 4 months)"
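My goal: I want a dictionary of lists, where each key is a GUI tab number and each list contains all the lines that begin with that number.
Example: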
0:
0,743,4,823,48,0,16384," |||||||0|0||0|0|||0|||0|0|0|0|0|0|||","F=Arial|O=5|B=T","TSWF Navigator:<formLinkInfo><version>1.1</version><templateName>TSWF-Navigator</templateName><templateId>2238487</templateId><templateOwnerName>Department of Defense</templateOwnerName><templateOwnerNcid>33962</templateOwnerNcid></formLinkInfo>"
0,828,4,907,24,0,16384," |||||||0|0||0|0|||0|||0|0|0|0|0|0|||","O=5","CORE:<formLinkInfo><version>1.1</version><templateName>TSWF-CORE</templateName><templateId>1995726</templateId><templateOwnerName>Department of Defense</templateOwnerName><templateOwnerNcid>33962</templateOwnerNcid></formLinkInfo>"
1:
1,550,57,730,77,0,32770," |||||||0|0||0|0|||0|||0|0|1|0|0|0|||","F=TimesNewRoman|C=8421504|T=T","Last updated: 2017-05-18"
2:
2,25,791,370,811,297285,8961," | || ||||19|80|YCN|0|0|Y|N|0|||0|0|5|0|0|0|||","F=Arial|T=T","Responds to affection~ (by 4 months)"
2,25,871,370,891,297287,8961," | || ||||19|80|YCN|0|0|Y|N|0|||0|0|5|0|0|0|||","F=Arial|T=T","Indicates pleasure and displeasure~ (by 4 months)"
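Problem: I can't create the lists ahead of time, because I don't know how many tabs there are until I read the file. I tried looping over the file once per tab, collecting that tab's items into a temporary list, adding the list to the dictionary, and then moving on to the next tab. Sample data shortened for simplicity: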
theFile = ['1,550,57,730,77', '0,743,4,823,48', '0,828,4,907,24', '2,25,791,370,811', '2,25,871,370,891']
tabCount = 3  # for this example; normally pulled from file header
sortedLines = dict()
for i in range(tabCount):
    tempList = []
    for line in theFile:
        tempList.append(line)
    sortedLines.update({tabCount: tempList})
    tempList.clear()
print('Dict: ', sortedLines)
for k, v in sortedLines.items():
    print('Pair: ' + str(k) + ': ' + '[%s]' % ', '.join(map(str, v)))
This seems to loop properly, but I end up with a single empty pair:
{3: []}
3: []
Summary: How do I create a dictionary of lists when the number of lists is only known at runtime?
Answer 0 (score: 1)
def main():
    # I'm assuming you can get this far...
    lines = [
        '1,some stuff 1',
        '2,some stuff 2,more stuff',
        '2,some stuff 4,candy,bacon',
        '3,some stuff 3,this,is,horrible...'
    ]
    # Something to hold your parsed data
    data = {}
    # Iterate over each line of your file
    for line in lines:
        # Split the data apart on commas, per your example data
        parts = line.split(',')
        # The key is the first part of the split data
        key = parts[0]
        if key not in data:
            # Since there could be multiple values per key we need to keep a
            # list of mapped values
            data[key] = []
        # Put the "other data" (everything after the first comma) into the list
        index_of_sep = line.find(',')
        data[key].append(line[index_of_sep+1:])
    # You probably want to return here. I'm printing so you can see the result
    print(data)

if __name__ == '__main__':
    main()
Result:
C:\Python35\python.exe C:/Users/Frito/GitSource/sandbox/sample.py
{'3': ['some stuff 3,this,is,horrible...'], '1': ['some stuff 1'], '2': ['some stuff 2,more stuff', 'some stuff 4,candy,bacon']}
Process finished with exit code 0
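A note on why the attempt in the question ends up with `{3: []}`: every pass of the loop writes to the same key (`tabCount`, which is 3), and `tempList.clear()` then empties the very list object that was just stored in the dictionary. Also note that the code above keeps the key as a string and strips it from each stored line. If you would rather have integer keys and the whole line, as in the question's goal, something like the following might work. It is only a sketch: `group_by_tab` and `the_file` are names I made up for illustration, and it assumes every line starts with an integer followed by a comma.

```python
from collections import defaultdict


def group_by_tab(lines):
    """Group raw lines by the integer tab number before the first comma."""
    # defaultdict(list) creates the empty list the first time a key is seen,
    # so the number of tabs does not need to be known in advance.
    grouped = defaultdict(list)
    for line in lines:
        # partition splits only at the first comma; later commas are untouched
        tab, _, _ = line.partition(',')
        grouped[int(tab)].append(line)
    # Convert back to a plain dict so missing keys raise KeyError again
    return dict(grouped)


# Shortened sample data from the question
the_file = ['1,550,57,730,77', '0,743,4,823,48', '0,828,4,907,24',
            '2,25,791,370,811', '2,25,871,370,891']

sorted_lines = group_by_tab(the_file)
for tab in sorted(sorted_lines):
    print(tab, sorted_lines[tab])
```

Because only the text before the first comma is parsed, the quoted fields later in each real line are never split, so commas inside them should not matter for the grouping.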