I have a JSON file in the following format:
{
"features": [{
"geometry": {
"coordinates": [
[
[-12.345, 26.006],
[-78.56, 24.944],
[-76.44, 24.99],
[-76.456, 26.567],
[-78.345, 26.23456]
]
],
"type": "Polygon"
},
"id": "Some_ID_01",
"properties": {
"parameters": "elevation"
},
"type": "Feature"
},
{
"geometry": {
"coordinates": [
[
[139.345, 39.2345],
[139.23456, 37.3465],
[141.678, 37.7896],
[141.2345, 39.6543],
[139.7856, 39.2345]
]
],
"type": "Polygon"
},
"id": "Some_OtherID_01",
"properties": {
"parameters": "elevation"
},
"type": "Feature"
}, {
"geometry": {
"coordinates": [
[
[143.8796, -30.243],
[143.456, -32.764],
[145.3452, -32.76],
[145.134, -30.87],
[143.123, -30.765]
]
],
"type": "Polygon"
},
"id": "Some_ID_02",
"properties": {
"parameters": "elevation"
},
"type": "Feature"
}
],
"type": "FeatureCollection"
}
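I am trying to remove duplicate/older versions of the JSON objects based on the id field (i.e. the objects with id=Some_ID_01 and id=Some_ID_02 are considered duplicates of each other).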
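So far I have managed to parse the JSON into Python and build a list of all the IDs that need to be removed. Where I am stuck is actually using that list to remove/pop the objects from the parsed JSON so that I can write the result to a new JSON file. On top of that, the code is far from optimized (my JSON file has around 20k objects).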
This is my Python code so far:
import json
json_file = open('features.json')
json_str = json_file.read()
json_data = json.loads(json_str)
dictionaryOfJsonId = {}
removalCounter = 0
keyToRemove = []
valueToRemoveFromList = []
IDList = []
removedSometing = 0
for values in json_data['features']: #This loop converts the values in the json parse into a dict of only ID
    stringToSplit = values["id"] #the id values from the json file
    IDList.append(stringToSplit) #list with all the ID
    newKey = stringToSplit[:-2] #takes the initial substring up to the last 2 spaces (version)
    newValue = stringToSplit[-2:] #grabs the last two characters of the string
    if newKey in dictionaryOfJsonId:
        dictionaryOfJsonId[newKey].append(newValue)
    else:
        dictionaryOfJsonId[newKey] = [newValue]
for key in dictionaryOfJsonId: #Remove entries that do not have duplicates
    if len(dictionaryOfJsonId[key])<2:
        valueToRemoveFromList.append(str(key + dictionaryOfJsonId[key][0]))
    else:
        valueToRemoveFromList.append(str(key +max(dictionaryOfJsonId[key])))
for string in valueToRemoveFromList: #Remove all values that don't have duplicates from the List of ID
    IDList.remove(string)
    removalCounter+=1
for i in json_data['features']:
    for x in IDList:
        if i['id'] == x:
            json_data.pop(i)
The last for loop is my most recent attempt at the removal, but I get the error:
TypeError: unhashable type: 'dict'
Answer 0 (score: 0)
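You get the error because pop expects a key or an index, not the object itself. Here json_data is a dict, so json_data.pop(i) tries to use the feature dict i as a key, and dicts are not hashable.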
However, that is somewhat beside the point, since it's a bad idea to modify a list that you're iterating over.
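To see why, here is a small sketch on a toy list (the function name remove_while_iterating is made up for illustration). Removing an element shifts the later ones left, so the loop skips whatever slides into the slot it just visited:

```python
def remove_while_iterating(items, unwanted):
    """Deliberately buggy: mutates the list it is looping over."""
    items = list(items)  # work on a copy so the caller's list is untouched
    for item in items:
        if item == unwanted:
            items.remove(item)  # shifts later elements one slot to the left
    return items

# The second 2 moves into the slot the loop has already passed, so it survives.
print(remove_while_iterating([1, 2, 2, 3], 2))  # [1, 2, 3]
```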
I would consider using a list comprehension instead, something like good_features = [i for i in json_data['features'] if i['id'] not in IDList]
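With around 20k objects, it may also help to turn IDList into a set, since "in" on a list scans it element by element. The sketch below shows that variant, plus a one-pass alternative that skips building the removal list altogether. The helper names drop_ids and keep_latest and the trimmed sample data are made up for illustration, and keep_latest assumes, as the question's code does, that the last two characters of every id are a zero-padded version number:

```python
import json

def drop_ids(json_data, ids_to_remove):
    """Return a copy of the FeatureCollection without the listed IDs."""
    remove = set(ids_to_remove)  # set membership is O(1) on average
    kept = [f for f in json_data["features"] if f["id"] not in remove]
    return {**json_data, "features": kept}

def keep_latest(json_data):
    """One pass: keep, per base ID, the feature with the highest version.

    Assumption: the last two characters of each id are a zero-padded
    version, so comparing them as strings orders them correctly.
    """
    latest = {}
    for feature in json_data["features"]:
        base, version = feature["id"][:-2], feature["id"][-2:]
        if base not in latest or version > latest[base]["id"][-2:]:
            latest[base] = feature
    return {**json_data, "features": list(latest.values())}

# Trimmed-down stand-in for the file in the question (geometry omitted).
sample = {
    "type": "FeatureCollection",
    "features": [
        {"id": "Some_ID_01", "type": "Feature"},
        {"id": "Some_OtherID_01", "type": "Feature"},
        {"id": "Some_ID_02", "type": "Feature"},
    ],
}

filtered = drop_ids(sample, ["Some_ID_01"])
deduped = keep_latest(sample)
print([f["id"] for f in filtered["features"]])  # ['Some_OtherID_01', 'Some_ID_02']
print([f["id"] for f in deduped["features"]])   # ['Some_ID_02', 'Some_OtherID_01']

serialized = json.dumps(deduped)  # ready to be written to a new file
```

Neither helper modifies the parsed data in place, so the original json_data stays available for comparison. To save the result, json.dump(deduped, out_file) on a file opened for writing should do.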