I have a JSON file with the following format:
{
"courses": [
{
"professors": [
{
"first_name": "Zvezdelina",
"last_name": "Stankova",
"professor_url": "http://www.ratemyprofessors.com/ShowRatings.jsp?tid=375269",
"helpfullness": 4.3,
"clarity": 4.3,
"overall_rating": 4.3
}
],
"course_name": "CHEM 1",
"course_mentioned_times": 37
},
{
"professors": [
{
"first_name": "Alan",
"last_name": "Shabel",
"professor_url": "http://www.ratemyprofessors.com/ShowRatings.jsp?tid=1309831",
"helpfullness": 3.9,
"clarity": 3.5,
"overall_rating": 3.7
}
],
"course_name": "CHEMISTRY 5467",
"course_mentioned_times": 32
},
{
"professors": [
{
"first_name": "Kurt",
"last_name": "Spreyer",
"professor_url": "http://www.ratemyprofessors.com/ShowRatings.jsp?tid=706268",
"helpfullness": 3.8,
"clarity": 3.6,
"overall_rating": 3.7
}
],
"course_name": "ESPM 50",
"course_mentioned_times": 18
},
{
"professors": [
{
"first_name": "Kurt",
"last_name": "Spreyer",
"professor_url": "http://www.ratemyprofessors.com/ShowRatings.jsp?tid=706268",
"helpfullness": 3.8,
"clarity": 3.6,
"overall_rating": 3.7
}
],
"course_name": "ESPM 56",
"course_mentioned_times": 17
}
]
}
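As the output shows, there are four 'course_name' values in total: 'CHEM 1', 'CHEMISTRY 5467', 'ESPM 50', and 'ESPM 56', each with its own 'course_mentioned_times'. What I can't work out is how to go through all the course_name keys in my JSON file and extract the MOST mentioned course from each subject. In this case I only want CHEM 1 and its attributes, because it is mentioned more often (37 times) than CHEMISTRY 5467 (32 times), and I want ESPM 50, because it is mentioned 18 times while ESPM 56 is mentioned only 17 times. So my output should contain these two courses with all of their attributes. The comparison should use the leading letters of the name and skip the integers, e.g. only CHEM and CHEMISTRY, but in the output I want the full name and not just the prefix.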
Answer 0: (score: 1)
The following snippet updates the JSON file so that it contains only the most-mentioned course from each group:
import json

# Read the JSON data from the source file, data.json.
with open('data.json') as data_file:
    data = json.load(data_file)

courses = data['courses']
kept = []
for i, course in enumerate(courses):
    beaten = False
    for j, other in enumerate(courses):
        if i == j:
            continue
        # Two courses belong to the same group when the first three
        # characters of their names match (e.g. 'CHE' or 'ESP').
        if course['course_name'][:3] == other['course_name'][:3]:
            if other['course_mentioned_times'] > course['course_mentioned_times']:
                beaten = True
                break
    # Keep a course only if no course in its group is mentioned more often.
    if not beaten:
        kept.append(course)
data['courses'] = kept

# Write the new JSON data back to data.json (indent=2 keeps it readable).
with open('data.json', 'w') as f:
    json.dump(data, f, indent=2)
Running my solution on the sample data you provided gives the following output:
{
"courses": [
{
"professors": [
{
"first_name": "Zvezdelina",
"last_name": "Stankova",
"professor_url": "http://www.ratemyprofessors.com/ShowRatings.jsp?tid=375269",
"helpfullness": 4.3,
"clarity": 4.3,
"overall_rating": 4.3
}
],
"course_name": "CHEM 1",
"course_mentioned_times": 37
},
{
"professors": [
{
"first_name": "Kurt",
"last_name": "Spreyer",
"professor_url": "http://www.ratemyprofessors.com/ShowRatings.jsp?tid=706268",
"helpfullness": 3.8,
"clarity": 3.6,
"overall_rating": 3.7
}
],
"course_name": "ESPM 50",
"course_mentioned_times": 18
}
]
}
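The question asks for the comparison to use the letters before the number, so that CHEM and CHEMISTRY count as the same subject. Below is a minimal sketch of that variant. It assumes that two subjects belong together when one is a prefix of the other. The helper names `department`, `same_group`, and `most_mentioned` are hypothetical and not part of the original answer, and the sample list is a trimmed copy of the question's data with the `professors` entries left out.

```python
import re

def department(course_name):
    """Return the leading letters of a course name, e.g. 'CHEM 1' -> 'CHEM'."""
    match = re.match(r"[A-Za-z]+", course_name)
    return match.group(0).upper() if match else course_name.upper()

def same_group(a, b):
    """Assumption: subjects match when one is a prefix of the other."""
    return a.startswith(b) or b.startswith(a)

def most_mentioned(courses):
    """Keep the most-mentioned course of each subject group."""
    winners = []
    # Visit courses from most to least mentioned, so the first course
    # seen in each group is that group's winner.
    by_mentions = sorted(
        courses, key=lambda c: c["course_mentioned_times"], reverse=True
    )
    for course in by_mentions:
        dept = department(course["course_name"])
        if not any(same_group(dept, department(w["course_name"])) for w in winners):
            winners.append(course)
    return winners

sample = [
    {"course_name": "CHEM 1", "course_mentioned_times": 37},
    {"course_name": "CHEMISTRY 5467", "course_mentioned_times": 32},
    {"course_name": "ESPM 50", "course_mentioned_times": 18},
    {"course_name": "ESPM 56", "course_mentioned_times": 17},
]

print([c["course_name"] for c in most_mentioned(sample)])
# ['CHEM 1', 'ESPM 50']
```

Because the whole dictionary for each course is kept, the `professors` list and every other attribute would carry over unchanged when this is applied to `data['courses']`. Note that the result is ordered by mention count rather than by position in the file.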