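I am trying to total my crimes by zip code, broken down by offense type and number of victims. I built a dictionary keyed by report number. Here is a small sample of the output when I print the dictionary: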
{'100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'], '20003319': ['64130', '13', 'VIC'], '60077156': ['64130', '18', 'VIC'], '100057708': ['99999', '17', 'VIC', 'VIC'], '40024161': ['64108', '17', 'VIC', 'VIC']}
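The dictionary is structured as follows: {Report_number: [Zipcode, offense type, victims]}, where each victim appears as one 'VIC' entry.
I am new to coding and am just learning dictionaries. How can I sort the dictionary so that the data is organized in this format?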
Zip Codes Crime totals
====================
64126 809
64127 3983
64128 1749
64129 1037
64130 4718
64131 2080
64132 2060
64133 2005
64134 2928
Any help is much appreciated. My code is below. I am reading two files of roughly 50,000 lines each, so efficiency matters.
from collections import Counter

crime_dict = dict()

# Map each report number to [zip_code, offense]
with open('incidents.csv', mode="r") as incidents_f:
    for line in incidents_f:
        line_1st = line.strip().split(",")
        if line_1st[0].upper() != "REPORT_NO":  # skip the header row
            report_no = line_1st[0]
            offense = line_1st[3]
            zip_code = line_1st[4]
            if len(zip_code) < 5:
                zip_code = "99999"  # placeholder for a missing or short zip code
            if report_no in crime_dict:
                # list.append returns None, so two appends cannot be chained
                crime_dict[report_no].extend([zip_code, offense])
            else:
                crime_dict[report_no] = [zip_code, offense]

# Append one 'VIC' entry per victim to the matching report
with open('details.csv', mode='r') as details_f:
    for line in details_f:
        line_1st = line.strip().split(",")
        if line_1st[0].upper() != "REPORT_NO":  # skip the header row
            report_no = line_1st[0]
            involvement = line_1st[1]
            if involvement.upper() == 'VIC':
                victims = "VIC"
                if report_no in crime_dict:
                    crime_dict[report_no].append(victims)
                else:
                    continue

print(crime_dict)
Answer 0 (score: 1)
This uses more code than @Alexander's solution, but here is one approach:
crime_dict = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC']
}

# Count the number of reports filed under each zip code
crimes_by_zip = {}
for record in crime_dict.values():
    zip_code = record[0]  # named zip_code to avoid shadowing the built-in zip()
    if zip_code not in crimes_by_zip:
        crimes_by_zip[zip_code] = 0
    crimes_by_zip[zip_code] += 1

# Print the totals in ascending zip code order
for zip_code in sorted(crimes_by_zip):
    print(zip_code, crimes_by_zip[zip_code])
64108 1
64130 3
99999 1
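The same tally can also be written with collections.Counter, which the question already imports. This is only a minimal sketch, assuming crime_dict has the shape shown in the sample data; the helper name count_reports_by_zip is illustrative and not part of the original answer:

```python
from collections import Counter

crime_dict = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC'],
}


def count_reports_by_zip(crimes):
    """Count reports per zip code; the zip code is the first item of each record."""
    return Counter(record[0] for record in crimes.values())


crimes_by_zip = count_reports_by_zip(crime_dict)

# Print in the layout requested in the question, sorted by zip code
print("Zip Codes Crime totals")
print("=" * 20)
for zip_code, total in sorted(crimes_by_zip.items()):
    print(zip_code, total)
```

Counter makes a single pass over the records, so it should comfortably handle files of about 50,000 lines.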
Answer 1 (score: 0)
D = {'100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'], '20003319': ['64130', '13', 'VIC'], '60077156': ['64130', '18', 'VIC'], '100057708': ['99999', '17', 'VIC', 'VIC'], '40024161': ['64108', '17', 'VIC', 'VIC']}
data_with_zip_duplicate = [(D[key][0], key) for key in sorted(D, key=lambda x: D[x][0])]
print(*data_with_zip_duplicate, sep = "\n")
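The snippet above lists (zip code, report number) pairs sorted by zip code, but it does not total anything. Since the question also mentions offense type and victim counts, one possible extension is sketched here. It assumes every record is [zip code, offense, then zero or more 'VIC' entries]; the function name victims_by_zip_and_offense is hypothetical and not from the original answer:

```python
from collections import defaultdict

D = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC'],
}


def victims_by_zip_and_offense(crimes):
    """Sum victims per (zip code, offense); each 'VIC' entry counts as one victim."""
    totals = defaultdict(int)
    # Assumption: the first two items are zip code and offense, the rest are involvements
    for zip_code, offense, *involvements in crimes.values():
        totals[(zip_code, offense)] += involvements.count('VIC')
    return dict(totals)


totals = victims_by_zip_and_offense(D)

# Tuples sort by zip code first, then by offense code
for (zip_code, offense), victims in sorted(totals.items()):
    print(zip_code, offense, victims)
```

Keying the totals on a (zip code, offense) tuple keeps the work to one pass over the dictionary, and sorting the keys afterwards groups the rows by zip code.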