This question has two parts:
I. I want to parse a Python string into a list of dictionaries.
**This is the Python string:**
../Data.py:92 final computing result as shown below: [historic_list {id: 'A(long) 11A' startdate: 42521 numvaluelist: 0.1065599566767107 datelist: 42521}historic_list {id: 'A(short) 11B' startdate: 42521 numvaluelist: 0.0038113334533441123 datelist: 42521 }historic_list {id: 'B(long) 11C' startdate: 42521 numvaluelist: 20.061623176440904 datelist: 42521}time_statistics {job_id: '' portfolio_id: '112341'} UrlPairList {}]
**Expected Python output:**
{
  "data": [
    {
      "id": "A(long) 11A",
      "startdate": "42521",
      "numvaluelist": "0.1065599566767107"
    },
    {
      "id": "A(short) 11B",
      "startdate": "42521",
      "numvaluelist": "0.0038113334533441123"
    },
    {
      "id": "B(long) 11C",
      "startdate": "42521",
      "numvaluelist": "20.061623176440904"
    }
  ]
}
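II. I need to further parse the values of the id and numvaluelist keys. I am not sure whether there is a better way to do this, so my plan is to convert the string into a Python dictionary, loop over it, and parse further. If I am overthinking the solution, please guide me.
Update: code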
text = "[historic_list {id: 'A(long) 11A' startdate: 42521 numvaluelist: 0.1065599566767107 datelist: 42521}historic_list {id: 'A(short) 11B' startdate: 42521 numvaluelist: 0.0038113334533441123 datelist: 42521 }historic_list {id: 'B(long) 11C' startdate: 42521 numvaluelist: 20.061623176440904 datelist: 42521}time_statistics {job_id: '' portfolio_id: '112341'} UrlPairList {}]"
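prefix = "../Data.py:92 final computing result as shown below: "
# str.strip() treats its argument as a set of characters, not as a prefix, so drop the prefix explicitly.
data = text[len(prefix):] if text.startswith(prefix) else text
print(data)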
Answer 0 (score: 1)
Your raw input text looks very predictable, so try the following approach:
>>> import re
>>> raw = "[historic_list {id: 'A(long) 11A' startdate: 42521 numvaluelist: 0.1065599566767107 datelist: 42521}historic_list {id: 'A(short) 11B' startdate: 42521 numvaluelist: 0.0038113334533441123 datelist: 42521 }historic_list {id: 'B(long) 11C' startdate: 42521 numvaluelist: 20.061623176440904 datelist: 42521}time_statistics {job_id: '' portfolio_id: '112341'} UrlPairList {}]"
>>> line_re = re.compile(r'\{[^\}]+\}')
>>> records = line_re.findall(raw)
>>> record_re = re.compile(
... r"""
... id:\s*\'(?P<id>[^']+)\'\s*
... startdate:\s*(?P<startdate>\d+)\s*
... numvaluelist:\s*(?P<numvaluelist>[\d\.]+)\s*
... datelist:\s*(?P<datelist>\d+)\s*
... """,
... re.X
... )
>>> record_parsed = record_re.search(line_re.findall(raw)[0])
>>> record_parsed.groupdict()
{'startdate': '42521', 'numvaluelist': '0.1065599566767107', 'datelist': '42521', 'id': 'A(long) 11A'}
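>>> for record in records:
...     record_parsed = record_re.search(record)
...     if record_parsed is None:
...         continue  # e.g. the time_statistics record has none of these fields
...     fields = record_parsed.groupdict()
...     # Here is where you would do whatever you need with the fields.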
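To get from these matches to the structure the question asks for, one option is to collect the `groupdict()` of each matching record and wrap the list in a dictionary. The following is a minimal sketch rather than part of the original answer: the helper name `parse_records`, its `keys` parameter, and the shortened sample input are assumptions made for illustration, and blocks that do not match (such as `time_statistics`) are simply skipped.

```python
import json
import re

# Same patterns as in the session above.
line_re = re.compile(r"\{[^\}]+\}")
record_re = re.compile(
    r"""
    id:\s*'(?P<id>[^']+)'\s*
    startdate:\s*(?P<startdate>\d+)\s*
    numvaluelist:\s*(?P<numvaluelist>[\d\.]+)\s*
    datelist:\s*(?P<datelist>\d+)\s*
    """,
    re.X,
)


def parse_records(raw, keys=("id", "startdate", "numvaluelist")):
    """Return {"data": [...]} with one dict per matching record (hypothetical helper)."""
    data = []
    for record in line_re.findall(raw):
        match = record_re.search(record)
        if match is None:
            continue  # skip blocks without the expected fields
        fields = match.groupdict()
        # Keep only the keys wanted in the output; values stay as strings.
        data.append({key: fields[key] for key in keys})
    return {"data": data}


# Shortened sample input (assumption: same layout as the string in the question).
sample = (
    "[historic_list {id: 'A(long) 11A' startdate: 42521 "
    "numvaluelist: 0.1065599566767107 datelist: 42521}"
    "time_statistics {job_id: '' portfolio_id: '112341'} UrlPairList {}]"
)
result = parse_records(sample)
print(json.dumps(result, indent=2))
```

Note that `[\d\.]+` only accepts plain decimal numbers; if the real data can contain negative values or scientific notation, that part of the pattern would need to be widened.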
To parse the sub-elements of id, for example:
>>> record_re2 = re.compile(
... r"""
... id:\s*\'
... (?P<id_letter>[A-Z]+)
... \(
... (?P<id_type>[^\)]+)
... \)\s*
... (?P<id_codenum>\d+)
... (?P<id_codeletter>[A-Z]+)
... \'\s*
... startdate:\s*(?P<startdate>\d+)\s*
... numvaluelist:\s*(?P<numvaluelist>[\d\.]+)\s*
... datelist:\s*(?P<datelist>\d+)\s*
... """,
... re.X
... )
>>> record2_parsed = record_re2.search(line_re.findall(raw)[0])
>>> record2_parsed.groupdict()
{'startdate': '42521', 'numvaluelist': '0.1065599566767107', 'id_letter': 'A', 'id_codeletter': 'A', 'datelist': '42521', 'id_type': 'long', 'id_codenum': '11'}
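If you would rather keep the record pattern simple and refine the fields afterwards, as the second part of the question suggests, a separate step can split the id and convert the numeric strings. This is only a sketch under assumptions: the helper name `refine` is hypothetical, ids are assumed to always look like `A(long) 11A`, and `startdate` and `numvaluelist` are assumed to be a plain integer and a plain decimal.

```python
import re

# Pattern for an id such as "A(long) 11A"; the group names mirror record_re2 above.
id_re = re.compile(
    r"(?P<id_letter>[A-Z]+)\((?P<id_type>[^)]+)\)\s*"
    r"(?P<id_codenum>\d+)(?P<id_codeletter>[A-Z]+)"
)


def refine(fields):
    """Split the id and convert numeric strings (hypothetical helper)."""
    refined = dict(fields)
    id_match = id_re.fullmatch(fields["id"])
    if id_match is not None:
        # Add id_letter, id_type, id_codenum and id_codeletter next to the raw id.
        refined.update(id_match.groupdict())
    refined["startdate"] = int(fields["startdate"])
    refined["numvaluelist"] = float(fields["numvaluelist"])
    return refined


record = {
    "id": "A(short) 11B",
    "startdate": "42521",
    "numvaluelist": "0.0038113334533441123",
}
refined = refine(record)
print(refined["id_type"], refined["id_codenum"], refined["numvaluelist"])
```

Keeping the two steps apart means a record whose id has an unexpected shape is still captured with its raw id, instead of being dropped because the combined pattern failed to match.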