I have a JSON sample like this:
{"ratings": [{
"TERM": "movie1",
"Rating": "3.5",
"source": "15786"
},
{
"TERM": "movie2",
"Rating": "3.5",
"source": "15786"
},
{
"TERM": "Movie1",
"Rating": "3.0",
"source": "15781"
}
]}
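Now I want to create a new JSON file. The filtering logic is to ignore a JSON object if its TERM has already appeared once. So for this example, the output would be: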
{"ratings": [{
"TERM": "movie1",
"Rating": "3.5",
"source": "15786"
},
{
"TERM": "movie2",
"Rating": "3.5",
"source": "15786"
}
]}
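Since movie1 already exists at index 0, we want to ignore index 2 (the comparison is case-insensitive, so "Movie1" counts as a duplicate).
I came up with the following logic, which works fine for small samples. I have a sample whose JSON array holds 10 mm (10 million) entries, and the code below takes 2+ days to complete. I would like to know whether there is a more efficient way to do this: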
    import json
    import io

    input1 = "movies.json"
    res = []
    resTerms = []
    with io.open(input1, encoding="utf8") as json_data:
        d = json.load(json_data)
        print(len(d['ratings']))
        for x in d['ratings']:
            if x['TERM'].lower() not in resTerms:
                res.append(x)
                resTerms.append(x['TERM'].lower())

    final = {}
    final["ratings"] = res
    output = "myFileSelected.json"
    with io.open(output, 'w') as outfile:
        json.dump(final, outfile)
Answer 0 (score: 2)
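The problem here is the check for whether the term already exists (that is, the line if x['TERM'].lower() not in resTerms:). resTerms is a list, and a membership test on a list is O(n), so the whole algorithm becomes O(n^2).
The fix is to use a set instead of a list; a set has O(1) lookup complexity on average. Your loop would then look like this (you also don't need to keep the JSON file open while filtering):

    import json
    import io

    input1 = "movies.json"
    res = []
    resTerms = set()  # set instead of list: fast membership tests

    # Only the load needs the file; it is closed once the block ends.
    with io.open(input1, encoding="utf8") as json_data:
        d = json.load(json_data)
    print(len(d['ratings']))

    for x in d['ratings']:
        term = x['TERM'].lower()  # compare case-insensitively
        if term not in resTerms:
            res.append(x)
            resTerms.add(term)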
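To make the set-based approach easy to check against the sample from the question, the same idea can be wrapped in a small function. This is only a minimal sketch, not code from the answer: the name dedupe_ratings is hypothetical, and it assumes every entry has a string TERM key.

```python
import json


def dedupe_ratings(ratings):
    """Keep the first entry for each TERM, compared case-insensitively."""
    seen = set()  # lowercased TERM values that were already kept
    kept = []
    for entry in ratings:
        key = entry["TERM"].lower()
        if key not in seen:  # average O(1) membership test on a set
            seen.add(key)
            kept.append(entry)
    return kept


# The sample data from the question.
sample = {"ratings": [
    {"TERM": "movie1", "Rating": "3.5", "source": "15786"},
    {"TERM": "movie2", "Rating": "3.5", "source": "15786"},
    {"TERM": "Movie1", "Rating": "3.0", "source": "15781"},
]}

result = {"ratings": dedupe_ratings(sample["ratings"])}
print(json.dumps(result, indent=2))
```

The result dictionary can then be written out with json.dump, exactly as in the question's code.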
A handy guide to Python data structures and their time complexity can be found here: https://wiki.python.org/moin/TimeComplexity
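The difference between the two lookups can also be observed directly with the standard timeit module. The sketch below is illustrative only: the size n and the generated term names are made up for the demonstration, and the absolute timings will vary from machine to machine.

```python
import timeit

n = 20000  # hypothetical size, kept small so the demo runs quickly
terms = ["movie%d" % i for i in range(n)]
as_list = list(terms)
as_set = set(terms)

# Worst case for the list: the probe matches only the last element,
# so the whole list has to be scanned on every lookup.
probe = "movie%d" % (n - 1)

list_time = timeit.timeit(lambda: probe in as_list, number=200)
set_time = timeit.timeit(lambda: probe in as_set, number=200)
print("list: %.6f s, set: %.6f s" % (list_time, set_time))
```

In the question's loop this lookup runs once per entry, which is why the list version slows down so sharply as the array grows.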