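I am using the suds package to query an API from a website. The data returned from their website looks like the sample below.
(1) Can anyone tell me what this format is?
(2) If it is a known format, what is the easiest way to parse the data? I have handled a lot of HTML/XML with BeautifulSoup, but before I lift a finger to write regular expressions for this format, I am curious whether this is some kind of "popular format" for which good parsers have already been written. Thanks.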
# Below are the header and tail of the response..
(DetailResult)
{
status = (Status){ message = None code = "0" }
searchArgument = (DetailSearchArgument){ reqPartNumber = "BQ" reqMfg = "T" reqCpn = None }
detailsDto[] = (DetailsDto){
summaryDto = (SummaryDto){ PartNumber = "BQ" seMfg = "T" description = "Fast" }
packageDto[] =
(PackageDto){ fetName = "a" fetValue = "b" },
(PackageDto){ fetName = "c" fetValue = "d" },
(PackageDto){ fetName = "d" fetValue = "z" },
(PackageDto){ fetName = "f" fetValue = "Sq" },
(PackageDto){ fetName = "g" fetValue = "p" },
additionalDetailsDto = (AdditionalDetailsDto){ cr = None pOptions = None inv = None pcns = None }
partImageDto = None
riskDto = (RiskDto){ life= "Low" lStage = "Mature" yteol = "10" Date = "2023"}
partOptionsDto[] = (ReplacementDto){ partNumber = "BQ2" manufacturer = "T" type = "Reel" },
inventoryDto[] =
(InventoryDto){ distributor = "V" quantity = "88" buyNowLink = "https://www..." },
(InventoryDto){ distributor = "R" quantity = "7" buyNowLink = "http://www.r." },
(InventoryDto){ distributor = "RS" quantity = "2" buyNowLink = "http://www.rs.." },
},
}
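Answer 0 (score: 2)
This looks like some kind of nested repr output, similar to JSON but carrying structure or object type names (for example, "a Status contains a message and a code"). If it is nested, as it appears to be, regular expressions alone will not do the job. Here is a rough first pass at a parser using pyparsing: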
sample = """
... given sample text ...
"""
from pyparsing import *
# punctuation
LPAR,RPAR,LBRACE,RBRACE,LBRACK,RBRACK,COMMA,EQ = map(Suppress,"(){}[],=")
identifier = Word(alphas,alphanums+"_")
# define some types that can get converted to Python types
# (parse actions will do conversion at parse time)
NONE = Keyword("None").setParseAction(replaceWith(None))
integer = Word(nums).setParseAction(lambda t:int(t[0]))
quotedString.setParseAction(removeQuotes)
# define a placeholder for a nested object definition (since objDefn
# will be referenced within its own definition)
objDefn = Forward()
objType = Combine(LPAR + identifier + RPAR)
objval = quotedString | NONE | integer | Group(objDefn)
objattr = Group(identifier + EQ + objval)
arrayattr = Group(identifier + LBRACK + RBRACK + EQ + Group(OneOrMore(Group(objDefn)+COMMA)) )
# use the '<<=' operator to assign content to the previously declared Forward
# ('<<' also works, but '<<=' avoids operator-precedence surprises)
objDefn <<= objType + LBRACE + ZeroOrMore((arrayattr | objattr) + Optional(COMMA)) + RBRACE
# parse sample text
result = objDefn.parseString(sample)
# use pprint to list out indented parsed data
import pprint
pprint.pprint(result.asList())
Prints:
['DetailResult',
['status', ['Status', ['message', None], ['code', '0']]],
['searchArgument',
['DetailSearchArgument',
['reqPartNumber', 'BQ'],
['reqMfg', 'T'],
['reqCpn', None]]],
['detailsDto',
[['DetailsDto',
['summaryDto',
['SummaryDto',
['PartNumber', 'BQ'],
['seMfg', 'T'],
['description', 'Fast']]],
['packageDto',
[['PackageDto', ['fetName', 'a'], ['fetValue', 'b']],
['PackageDto', ['fetName', 'c'], ['fetValue', 'd']],
['PackageDto', ['fetName', 'd'], ['fetValue', 'z']],
['PackageDto', ['fetName', 'f'], ['fetValue', 'Sq']],
['PackageDto', ['fetName', 'g'], ['fetValue', 'p']]]],
['additionalDetailsDto',
['AdditionalDetailsDto',
['cr', None],
['pOptions', None],
['inv', None],
['pcns', None]]],
['partImageDto', None],
['riskDto',
['RiskDto',
['life', 'Low'],
['lStage', 'Mature'],
['yteol', '10'],
['Date', '2023']]],
['partOptionsDto',
[['ReplacementDto',
['partNumber', 'BQ2'],
['manufacturer', 'T'],
['type', 'Reel']]]],
['inventoryDto',
[['InventoryDto',
['distributor', 'V'],
['quantity', '88'],
['buyNowLink', 'https://www...']],
['InventoryDto',
['distributor', 'R'],
['quantity', '7'],
['buyNowLink', 'http://www.r.']],
['InventoryDto',
['distributor', 'RS'],
['quantity', '2'],
['buyNowLink', 'http://www.rs..']]]]]]]]
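The nested lists above are faithful to the input but awkward to index into. As a minimal sketch (not part of pyparsing; `to_dict`, `convert_value`, and the `_type` key are hypothetical names chosen for illustration), the list-of-lists shape can be folded into plain dicts. It assumes the layout printed above: an object is `[TypeName, [attr, value], ...]`, and an array is a list of such objects.

```python
def convert_value(value):
    """Convert one attribute value from the parsed nested-list form."""
    if not isinstance(value, list):
        # scalar: a string, an int, or None
        return value
    if value and isinstance(value[0], list):
        # array attribute: a list of objects
        return [to_dict(item) for item in value]
    # single nested object: [TypeName, [attr, value], ...]
    return to_dict(value)


def to_dict(obj):
    """Convert [TypeName, [attr, value], ...] into a plain dict.

    The type name is kept under the hypothetical key '_type'.
    """
    type_name, *attrs = obj
    out = {"_type": type_name}
    for name, value in attrs:
        out[name] = convert_value(value)
    return out


# A small hand-written excerpt in the same shape as the printed output
parsed = ['DetailResult',
          ['status', ['Status', ['message', None], ['code', '0']]],
          ['packageDto',
           [['PackageDto', ['fetName', 'a'], ['fetValue', 'b']],
            ['PackageDto', ['fetName', 'c'], ['fetValue', 'd']]]]]

data = to_dict(parsed)
print(data['status']['code'])
print(data['packageDto'][1]['fetValue'])
```

With the real parser, the same function could be applied to `result.asList()`. Note that quoted values such as `"0"` stay strings, because the grammar only converts unquoted digits to integers.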