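I'm using eBay's API, and its JSON responses include many unnecessary arrays. I'm trying to remove these arrays with a regular expression, but I can't come up with one that matches exactly the arrays I need.
So far I have come up with \[[^\{\}]*\], which matches a pair of square brackets with no curly braces between them.
Actual: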
"childCategoryHistogram": [
    {
        "categoryId": [
            "175673"
        ],
        "categoryName": [
            "Computer Components & Parts"
        ],
        "count": [
            "21"
        ]
    },
    {
        "categoryId": [
            "175672"
        ],
        "categoryName": [
            "Laptops & Netbooks"
        ],
        "count": [
            "9"
        ]
    }
]
Expected:
"childCategoryHistogram": [
    {
        "categoryId": "175673",
        "categoryName": "Computer Components & Parts",
        "count": "21"
    },
    {
        "categoryId": "175672",
        "categoryName": "Laptops & Netbooks",
        "count": "9"
    }
]
Answer 0 (score: 3)
Regular expressions are the wrong tool for this job. Don't try to modify the JSON text; modify the data structure it parses into.
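def remove_empty_lists(item):
    # Despite the name, this unwraps single-element lists; nothing is tested for emptiness.
    if isinstance(item, list):
        if len(item) == 1:
            # A one-element list collapses to its (recursively cleaned) element.
            return remove_empty_lists(item[0])
        else:
            return [remove_empty_lists(n) for n in item]
    elif isinstance(item, dict):
        # items() works on Python 2 and 3; iteritems() exists only on Python 2.
        return {k: remove_empty_lists(v) for k, v in item.items()}
    else:
        return item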
...which, given the Python data structure created from the input you specified, Does The Right Thing:
>>> from pprint import pprint
>>> pprint(content)
{'childCategoryHistogram': [{'categoryId': ['175673'],
                             'categoryName': ['Computer Components & Parts'],
                             'count': ['21']},
                            {'categoryId': ['175672'],
                             'categoryName': ['Laptops & Netbooks'],
                             'count': ['9']}]}
>>> pprint(remove_empty_lists(content))
{'childCategoryHistogram': [{'categoryId': '175673',
                             'categoryName': 'Computer Components & Parts',
                             'count': '21'},
                            {'categoryId': '175672',
                             'categoryName': 'Laptops & Netbooks',
                             'count': '9'}]}
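For completeness, here is a minimal sketch of the whole round trip: parse the JSON text, unwrap the single-element lists, and serialize the result again. The helper name unwrap_singletons and the sample string raw are made up for illustration (the sample just mirrors the fragment in the question, wrapped in braces so that it is valid JSON); only the standard json module is assumed.

```python
import json


def unwrap_singletons(item):
    """Recursively replace every one-element list with its sole element."""
    if isinstance(item, list):
        if len(item) == 1:
            return unwrap_singletons(item[0])
        return [unwrap_singletons(n) for n in item]
    if isinstance(item, dict):
        return {k: unwrap_singletons(v) for k, v in item.items()}
    return item


# Hypothetical sample shaped like the response fragment in the question.
raw = """
{"childCategoryHistogram": [
  {"categoryId": ["175673"],
   "categoryName": ["Computer Components & Parts"],
   "count": ["21"]},
  {"categoryId": ["175672"],
   "categoryName": ["Laptops & Netbooks"],
   "count": ["9"]}
]}
"""

content = json.loads(raw)             # JSON text -> Python dicts and lists
cleaned = unwrap_singletons(content)  # transform the data, not the text
print(json.dumps(cleaned, indent=2))  # Python data -> JSON text again
```

One caveat with this approach: any list that happens to hold exactly one element is unwrapped, so a histogram with a single child category would come back as a dict rather than a one-element list. If that matters for your data, you would probably want to restrict the unwrapping to the keys you know are always single-valued.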