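I have to iterate over a JSON file. I import it into Python as a list and print all the unique localities. However, the code I wrote only prints the first few places and does not go through all 538 elements in the list: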
import pandas as pd
import json
with open('data.json') as json_file:
    json_file = json_file.readlines()
json_file = dict(map(json.loads, json_file))
for i in range (0, len(json_file)-1):
    unique = json_file[i]['payload']['locality']
    print(unique)
Instead, it still only prints about 30 places. How can I fix this?
Here is a snippet of my file:
{ 'payload': {'existence_full': 1,
'geo_virtual': '["50.794876|-1.090893|20|within_50m|4"]',
'latitude': '50.794876',
'locality': 'Portsmouth',
'_records_touched': '{"crawl":16,"lssi":0,"polygon_centroid":0,"geocoder":0,"user_submission":0,"tdc":0,"gov":0}',
'email': 'info.centre@port.ac.uk',
'existence_ml': 0.9794948816203205,
'address': 'Winston Churchill Av',
'longitude': '-1.090893',
'domain_aggregate': '',
'name': 'University of Portsmouth',
'search_tags': ['The University of Portsmouth',
'The University of Portsmouth Students Union',
'University House'],
'admin_region': 'England',
'existence': 1,
'post_town': 'Portsmouth',
'category_labels': [['Community and Government',
'Education',
'Colleges and Universities']],
'region': 'Hampshire',
'review_count': '1',
'geocode_level': 'within_50m',
'tel': '023 9284 8484',
'placerank': 42,
'placerank_ml': 69.2774043602657,
'address_extended': 'Unit 4',
'category_ids_text_search': '',
'fax': '023 9284 3122',
'website': 'http://www.port.ac.uk',
'status': '1',
'neighborhood': ['The Waterfront'],
'geocode_confidence': '20',
'postcode': 'PO1 2UP',
'category_ids': [29],
'country': 'gb',
'_geocode_quality': '4'},
'uuid': '297fa2bf-7915-4252-9a55-96a0d44e358e'}
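Answer 0 (score: 1)
You have not imported the data into a list, but into a dictionary. If you want to import the JSON into a list, you can do it like this: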
import json
with open('data.json') as json_file:
    json_array = json.load(json_file)
for item in json_array:
    unique = item['payload']['locality']
    print(unique)
You say you want to print all the unique localities, but your code prints every locality without checking whether it is unique.
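If only distinct localities are wanted, one possible approach is to track the values already seen in a set. The following is a minimal sketch, not part of the original code: the helper name `unique_localities` and the `sample` records are hypothetical, and the records are assumed to be dicts shaped like the snippet in the question.

```python
def unique_localities(records):
    """Return each locality once, in order of first appearance."""
    seen = set()
    ordered = []
    for record in records:
        # .get() avoids a KeyError if a record lacks 'payload' or 'locality'
        locality = record.get('payload', {}).get('locality')
        if locality is not None and locality not in seen:
            seen.add(locality)
            ordered.append(locality)
    return ordered

# Hypothetical sample records, shaped like the snippet in the question
sample = [
    {'payload': {'locality': 'Portsmouth'}, 'uuid': 'a'},
    {'payload': {'locality': 'Southampton'}, 'uuid': 'b'},
    {'payload': {'locality': 'Portsmouth'}, 'uuid': 'c'},
]

for locality in unique_localities(sample):
    print(locality)
```

With a list loaded from the real file, the same function could be called on that list instead of `sample`.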
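Answer 1 (score: 0)
The problem may be that your file contains duplicate records, and when you load the data into a dict the keys collide, so some of the records are discarded. Instead of storing the data in a dict, store it in a list: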
import json
with open('data.json') as json_file:
    lines = json_file.readlines()
records = list(map(json.loads, lines))
You should then be able to iterate over records:
print('There are %d records' % len(records))
for record in records:
    unique = record['payload']['locality']
    print(unique)
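The general effect of key collisions can be seen in a small, generic sketch. The `pairs` data below is made up for illustration and is not taken from the asker's file; it only shows that a dict keeps one value per key while a list keeps every entry.

```python
# Hypothetical (key, value) pairs with one repeated key
pairs = [('Portsmouth', 1), ('Southampton', 2), ('Portsmouth', 3)]

as_dict = dict(pairs)  # the later 'Portsmouth' value overwrites the earlier one
as_list = list(pairs)  # every pair is kept, duplicates included

print('dict holds %d entries' % len(as_dict))
print('list holds %d entries' % len(as_list))
```

Comparing the two lengths is a quick way to check whether records are being lost when data is loaded into a dict.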