Parsing a string into a list of Python dictionaries

Time: 2016-09-27 19:37:24

Tags: python list loops dictionary

This question has two parts:

I. I want to parse a Python string into a list of dictionaries.

**** This is the Python string ****

../Data.py:92 final computing result as shown below:  [historic_list {id: 'A(long) 11A' startdate: 42521 numvaluelist: 0.1065599566767107 datelist: 42521}historic_list {id: 'A(short) 11B' startdate: 42521 numvaluelist: 0.0038113334533441123 datelist: 42521 }historic_list {id: 'B(long) 11C' startdate: 42521 numvaluelist: 20.061623176440904 datelist: 42521}time_statistics {job_id: '' portfolio_id: '112341'} UrlPairList {}]

**** Expected Python output: ****

{
  "data" :[
    {
      "id": "A(long) 11A"
      "startdate": "42521"
      "numvaluelist": "0.1065599566767107"
    },
    {
      "id": "A(short) 11B"
      "startdate": "42521"
      "numvaluelist": "0.0038113334533441123"
    },
    {
      "id": "B(long) 11C"
      "startdate": "42521"
      "numvaluelist": "20.061623176440904"
    }
  ]
}

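II. I need to further parse the values of the id and numvaluelist keys. I am not sure whether there is a better way to do this, so my plan is to convert the string into Python dictionaries, loop over them, and parse each one further. Please tell me if I am overthinking the solution.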

Update: code

text = "[historic_list {id: 'A(long) 11A' startdate: 42521 numvaluelist: 0.1065599566767107 datelist: 42521}historic_list {id: 'A(short) 11B' startdate: 42521 numvaluelist: 0.0038113334533441123 datelist: 42521 }historic_list {id: 'B(long) 11C' startdate: 42521 numvaluelist: 20.061623176440904 datelist: 42521}time_statistics {job_id: '' portfolio_id: '112341'} UrlPairList {}]"
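prefix = "../Data.py:92 final computing result as shown below:  "
# str.strip() treats its argument as a set of characters, not as a prefix,
# so the log prefix is removed explicitly when it is present.
data = text[len(prefix):] if text.startswith(prefix) else text
print(data)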

1 answer:

Answer 0 (score: 1)

Your raw input text looks very predictable, so try the following approach:

>>> import re

>>> raw = "[historic_list {id: 'A(long) 11A' startdate: 42521 numvaluelist: 0.1065599566767107 datelist: 42521}historic_list {id: 'A(short) 11B' startdate: 42521 numvaluelist: 0.0038113334533441123 datelist: 42521 }historic_list {id: 'B(long) 11C' startdate: 42521 numvaluelist: 20.061623176440904 datelist: 42521}time_statistics {job_id: '' portfolio_id: '112341'} UrlPairList {}]"

>>> line_re = re.compile(r'\{[^\}]+\}')
>>> records = line_re.findall(raw)

>>> record_re = re.compile(
...     r"""
...             id:\s*\'(?P<id>[^']+)\'\s*
...             startdate:\s*(?P<startdate>\d+)\s*
...             numvaluelist:\s*(?P<numvaluelist>[\d\.]+)\s*
...             datelist:\s*(?P<datelist>\d+)\s*
...             """,
...     re.X
...     )

>>> record_parsed = record_re.search(line_re.findall(raw)[0])
>>> record_parsed.groupdict()
{'startdate': '42521', 'numvaluelist': '0.1065599566767107', 'datelist': '42521', 'id': 'A(long) 11A'}

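>>> for record in records:
...     record_parsed = record_re.search(record)
...     if record_parsed is None:
...         continue  # e.g. the time_statistics block, which has no startdate field
...     # Here is where you would do whatever you need with the fields.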
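
To get from that loop to the structure requested in the question, the matched fields can be collected into a list under a "data" key. The following is only a minimal sketch under a few assumptions: `parse_records` and its `keep` parameter are hypothetical names introduced here, values are left as strings as in the expected output, and any braced block that does not match the record pattern (such as time_statistics) is simply skipped.

```python
import json
import re

raw = ("[historic_list {id: 'A(long) 11A' startdate: 42521 "
       "numvaluelist: 0.1065599566767107 datelist: 42521}"
       "historic_list {id: 'A(short) 11B' startdate: 42521 "
       "numvaluelist: 0.0038113334533441123 datelist: 42521 }"
       "historic_list {id: 'B(long) 11C' startdate: 42521 "
       "numvaluelist: 20.061623176440904 datelist: 42521}"
       "time_statistics {job_id: '' portfolio_id: '112341'} UrlPairList {}]")

# Same two patterns as in the answer: one to find each {...} block,
# one to pull the named fields out of a block.
line_re = re.compile(r"\{[^}]+\}")
record_re = re.compile(
    r"""
    id:\s*'(?P<id>[^']+)'\s*
    startdate:\s*(?P<startdate>\d+)\s*
    numvaluelist:\s*(?P<numvaluelist>[\d.]+)\s*
    datelist:\s*(?P<datelist>\d+)\s*
    """,
    re.X,
)


def parse_records(text, keep=("id", "startdate", "numvaluelist")):
    """Hypothetical helper: return {"data": [...]} with one dict per matching block."""
    data = []
    for record in line_re.findall(text):
        match = record_re.search(record)
        if match is None:
            # Blocks such as time_statistics do not have these fields; skip them.
            continue
        fields = match.groupdict()
        data.append({key: fields[key] for key in keep})
    return {"data": data}


result = parse_records(raw)
print(json.dumps(result, indent=2))
```

Passing a different `keep` tuple (for example one that includes "datelist") would change which fields end up in each dictionary.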

To parse the sub-elements of id, for example:

>>> record_re2 = re.compile(
...     r"""
...             id:\s*\'
...                     (?P<id_letter>[A-Z]+)
...                     \(
...                             (?P<id_type>[^\)]+)
...                             \)\s*
...                     (?P<id_codenum>\d+)
...                     (?P<id_codeletter>[A-Z]+)
...                     \'\s*
...             startdate:\s*(?P<startdate>\d+)\s*
...             numvaluelist:\s*(?P<numvaluelist>[\d\.]+)\s*
...             datelist:\s*(?P<datelist>\d+)\s*
...             """,
...     re.X
...     )

>>> record2_parsed = record_re2.search(line_re.findall(raw)[0])
>>> record2_parsed.groupdict()
{'startdate': '42521', 'numvaluelist': '0.1065599566767107', 'id_letter': 'A', 'id_codeletter': 'A', 'datelist': '42521', 'id_type': 'long', 'id_codenum': '11'}
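
An alternative to one large pattern is the two-step approach described in part II of the question: parse each record into a dictionary first, then refine individual values. The sketch below is one possible way to do that, not the only one. `refine` and `id_re` are hypothetical names, the id layout (letters, a parenthesised type, a number, then letters) is assumed from the three sample records, and converting startdate to int and numvaluelist to float is an assumption about what "parse further" means.

```python
import re

# Pattern for the id value alone, e.g. "A(long) 11A".
id_re = re.compile(
    r"""
    (?P<id_letter>[A-Z]+)
    \( (?P<id_type>[^)]+) \) \s*
    (?P<id_codenum>\d+)
    (?P<id_codeletter>[A-Z]+)
    $
    """,
    re.X,
)


def refine(fields):
    """Hypothetical helper: take a dict of strings (like groupdict() output)
    and return a dict with numeric values converted and the id split up."""
    refined = {
        "id": fields["id"],
        "startdate": int(fields["startdate"]),
        "numvaluelist": float(fields["numvaluelist"]),
    }
    match = id_re.match(fields["id"])
    if match is not None:
        # Add id_letter, id_type, id_codenum and id_codeletter as strings.
        refined.update(match.groupdict())
    return refined


sample = {
    "id": "A(long) 11A",
    "startdate": "42521",
    "numvaluelist": "0.1065599566767107",
    "datelist": "42521",
}
refined = refine(sample)
print(refined)
```

Keeping the id pattern separate means a record with an unexpected id format still yields its other fields, instead of failing the whole match.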