I have a string that lists the properties of a request event.
The string looks like this:
requestBody: {
    propertyA = 1
    propertyB = 2
    propertyC = {
        propertyC1 = 1
        propertyC2 = 2
    }
    propertyD = [
        { propertyD1 = { propertyD11 = 1}},
        { propertyD2 = [ {propertyD21 = 1, propertyD22 = 2},
                         {propertyD21 = 3, propertyD22 = 4}]}
    ]
}
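I tried replacing "=" with ":" so that I could feed the string to Python's JSON reader, but JSON also requires keys (and any string values) to be double-quoted, with "," separating each key-value pair. That gets fairly complicated to implement. What are better ways to parse this into a Python dictionary with exactly the same structure (that is, preserving the nested dictionaries as well)?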
Question: If I were to write a full parser, what are the main patterns I should handle? Pushing brackets onto a stack until each one is closed?
Answer 0 (score: 2)
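This is a good use case for pyparsing, especially since the recursive structure adds to the problem.
The short answer is the following parser (it handles everything after the leading "requestBody:"):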
import pprint
from pyparsing import (Suppress, LineEnd, Word, nums, alphas, alphanums,
                       quotedString, Forward, Group, Dict, Optional,
                       delimitedList)

LBRACE, RBRACE, LBRACK, RBRACK, EQ = map(Suppress, "{}[]=")
NL = LineEnd().setName("NL")
# define special delimiter for lists and objects, since they can be
# comma-separated or just newline-separated
list_delim = NL | ','
list_delim.leaveWhitespace()
# use a parse action to convert numeric values to ints or floats at parse time
def convert_number(t):
    try:
        return int(t[0])
    except ValueError:
        return float(t[0])
number = Word(nums, nums + '.').addParseAction(convert_number)
qs = quotedString
# forward-declare value, since it will be defined recursively
obj_value = Forward()
ident = Word(alphas, alphanums + '_')
obj_property = Group(ident + EQ + obj_value)
# use Dict wrapper to auto-define nested properties as key-values
obj = Group(LBRACE + Dict(Optional(delimitedList(obj_property, delim=list_delim))) + RBRACE)
obj_array = Group(LBRACK + Optional(delimitedList(obj, delim=list_delim)) + RBRACK)
# now assign to previously-declared obj_value, using '<<=' operator
obj_value <<= obj_array | obj | number | qs
# parse the data; `sample` is assumed to hold the input text with the
# leading "requestBody:" already removed
res = obj.parseString(sample)[0]
# convert the result to a dict
pprint.pprint(res.asDict())
which gives:
{'propertyA': 1,
 'propertyB': 2,
 'propertyC': {'propertyC1': 1, 'propertyC2': 2},
 'propertyD': {'propertyD1': {'propertyD11': 1},
               'propertyD2': {'propertyD21': 3, 'propertyD22': 4}}}
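On the other part of the question (writing the parser by hand): an explicit bracket stack is usually not needed, because a recursive-descent parser lets the Python call stack track the nesting. The following is only a minimal sketch of that idea using the standard library, not a replacement for the pyparsing grammar. The names `tokenize`, `Parser`, and `parse_request_body` are made up for illustration, and the sketch assumes that values are only numbers, objects, or arrays, and that commas and newlines between entries can both be treated as optional separators.

```python
import re

# One alternative per token kind; whitespace (including newlines) is
# matched so that it can be dropped by the tokenizer.
TOKEN_RE = re.compile(r"""
    (?P<punct>[{}\[\]=,])
  | (?P<number>-?\d+(?:\.\d+)?)
  | (?P<ident>[A-Za-z_]\w*)
  | (?P<space>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Yield (kind, value) pairs, skipping whitespace."""
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = m.end()
        if m.lastgroup != "space":
            yield m.lastgroup, m.group()


class Parser:
    """Hypothetical recursive-descent parser for the 'key = value' syntax."""

    def __init__(self, text):
        self.tokens = list(tokenize(text))
        self.pos = 0

    def peek(self):
        # (None, None) signals that the input is exhausted.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else (None, None)

    def take(self, expected=None):
        kind, value = self.peek()
        if kind is None or (expected is not None and value != expected):
            raise ValueError(f"expected {expected!r}, got {value!r}")
        self.pos += 1
        return kind, value

    def skip_commas(self):
        while self.peek()[1] == ",":
            self.pos += 1

    def parse_value(self):
        kind, value = self.peek()
        if value == "{":
            return self.parse_object()   # recursion handles the nesting
        if value == "[":
            return self.parse_array()
        if kind == "number":
            self.pos += 1
            return float(value) if "." in value else int(value)
        raise ValueError(f"unexpected token {value!r}")

    def parse_object(self):
        self.take("{")
        result = {}
        self.skip_commas()
        while self.peek()[1] != "}":
            kind, key = self.take()
            if kind != "ident":
                raise ValueError(f"expected a property name, got {key!r}")
            self.take("=")
            result[key] = self.parse_value()
            self.skip_commas()
        self.take("}")
        return result

    def parse_array(self):
        self.take("[")
        items = []
        self.skip_commas()
        while self.peek()[1] != "]":
            items.append(self.parse_value())
            self.skip_commas()
        self.take("]")
        return items


def parse_request_body(text):
    # Drop the leading "requestBody:" label if it is present.
    body = text.split("requestBody:", 1)[-1]
    parser = Parser(body)
    result = parser.parse_object()
    if parser.pos != len(parser.tokens):
        raise ValueError("unexpected trailing input")
    return result


# Sample modeled on the input in the question.
sample = """requestBody: {
propertyA = 1
propertyC = {
propertyC1 = 1
propertyC2 = 2
}
propertyD = [
{ propertyD1 = { propertyD11 = 1}},
{ propertyD2 = [ {propertyD21 = 1, propertyD22 = 2},
{propertyD21 = 3, propertyD22 = 4}]}
]
}"""
result = parse_request_body(sample)
print(result)
```

In this sketch arrays come back as Python lists, so both `{propertyD21, propertyD22}` entries under `propertyD2` are kept. Quoted strings are deliberately left out; supporting them would mean adding one more token kind and one more branch in `parse_value`.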