Creating a dictionary from a cumbersome key-value string

Time: 2019-07-15 22:31:45

Tags: python-3.x parsing

I need to convert a list of strings similar to the following:

"ABOC000  RECORD     0 Msg-type\=0220  Bit-map\=3450G83H403894JH Xbit-map\=0000000000000010 Proc code\=312000  Tran amt\=000000000000 Tran datetime\=0613064645 Trace nbr\=000000 Local time\=02:46:37 Local date\=06/13 Exp date\=24/02 Sett date\=06/13 Merchant\=6011 Pos entry\=051 Card seq no\=000 Acqr inst id\=2349823498 Cord \=23049583049583405983045983405983405900 Retr ref\=111111111111  Resp code\=00  Crd acpt trmid\=CS61252 Crd acpt id\=ISPA/PULSE Crd acpt loc\=000 8TH AVENUE         BOREALIS     XXUS Name\=MERCHANT NAME Tran curr\=840 Natl cond code\=1010000002U Reason codes\=004 Rsn code map\=40 Advice reason\=31 Ddsi data len\=022 Ddsi data map\=B2 Pseudo term\=070792 Acqr netid\=PUL Processor id\=INT789 Proc flags\= Info text\=NI24PS20ID16              03 "

into a list of dictionaries containing the key/value pairs.

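This is with Python 3.7. I have already gone down the path of list comprehensions and regular expressions, but have not found a workable solution yet. The difficulties are:

  • Keys and values are sometimes multiple words (with spaces)
  • Some keys are not always present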
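To illustrate the first difficulty, here is a minimal sketch of what a naive whitespace split does to a fragment of the string above. The variable names are only for illustration.

```python
# A short fragment taken from the sample record above.
sample = r"Proc code\=312000  Tran amt\=000000000000"

# Splitting on whitespace breaks every multi-word key in two,
# so "Proc code" ends up as "Proc" and "code\=312000".
tokens = sample.split()
print(tokens)
```

The key "Proc code" is no longer a single token, so splitting on whitespace and then on the separator cannot recover it.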

A short example of what I intend to end up with:

[{"RECORD":"0", "Msg-type":"0220", "Bit-map":"3450G83H403894JH", "Xbit-map":"0000000000000010", "Proc code":"312000" ... }]

1 answer:

Answer 0 (score: 1)

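Assuming the values do not contain any whitespace (otherwise it is hard to tell which part belongs to the previous value and which to the next key), and after stripping the beginning of the string (I am not sure how {'RECORD': '0'} fits into the picture), you can use re.findall with the following regular expression:

import re

s = r"Msg-type\=0220  Bit-map\=3450G83H403894JH Xbit-map\=0000000000000010 Proc code\=312000  Tran amt\=000000000000 Tran datetime\=0613064645 Trace nbr\=000000 Local time\=02:46:37 Local date\=06/13 Exp date\=24/02 Sett date\=06/13 Merchant\=6011 Pos entry\=051 Card seq no\=000 Acqr inst id\=2349823498 Cord \=23049583049583405983045983405983405900 Retr ref\=111111111111  Resp code\=00  Crd acpt trmid\=CS61252 Crd acpt id\=ISPA/PULSE Crd acpt loc\=000 8TH AVENUE         BOREALIS     XXUS Name\=MERCHANT NAME Tran curr\=840 Natl cond code\=1010000002U Reason codes\=004 Rsn code map\=40 Advice reason\=31 Ddsi data len\=022 Ddsi data map\=B2 Pseudo term\=070792 Acqr netid\=PUL Processor id\=INT789 Proc flags\= Info text\=NI24PS20ID16              03 "
# Key: a letter followed by letters, spaces or hyphens; then a literal \=;
# value: a run of non-whitespace characters.
d = dict(re.findall(r'([A-Za-z][A-Za-z \-]*)\\=([^\s]+)', s))

This gives a dictionary d that maps each key to its value, for example d['Msg-type'] is '0220' and d['Proc code'] is '312000'.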
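If values can contain spaces (as "Crd acpt loc" does in the sample), the assumption above no longer holds. One possible workaround, sketched below, is to rely on a list of known field names and treat everything between one key and the next as the value. This assumes such a list is available. KNOWN_KEYS and parse_record are hypothetical names, and the key list is only a partial sample taken from the string in the question.

```python
import re

# Hypothetical, partial list of field names; extend it to cover the real records.
KNOWN_KEYS = ["Msg-type", "Crd acpt loc", "Name", "Tran curr", "Proc flags", "Info text"]


def parse_record(line, keys=KNOWN_KEYS):
    """Return a dict of key -> value for the known keys found in one record line."""
    # Try longer names first so a long key is preferred over a shorter one.
    alternation = "|".join(re.escape(k) for k in sorted(keys, key=len, reverse=True))
    # A key must start at the beginning of the line or after whitespace,
    # and be followed by optional spaces and the literal separator \=.
    pattern = re.compile(r"(?<!\S)(" + alternation + r")\s*\\=")
    matches = list(pattern.finditer(line))
    result = {}
    for current, following in zip(matches, matches[1:] + [None]):
        # The value runs up to the start of the next known key (or the end of the line).
        end = following.start() if following else len(line)
        result[current.group(1)] = line[current.end():end].strip()
    return result


# A fragment of the sample record, including a value with spaces and an empty value.
lines = [
    r"Crd acpt loc\=000 8TH AVENUE         BOREALIS     XXUS Name\=MERCHANT NAME "
    r"Tran curr\=840 Proc flags\= Info text\=NI24PS20ID16              03 "
]

# One dictionary per input string.
records = [parse_record(line) for line in lines]
print(records[0]["Name"])
```

Text before the first known key (such as the record prefix) is ignored by this sketch, and a key missing from the list is absorbed into the preceding value, so the list needs to be complete for real data.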