Question

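I am trying to sort my crime totals by zip code, by offense type, and by number of victims. I built a dictionary keyed by report number. Here is a small sample of the output when I print the dictionary: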

{'100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'], '20003319': ['64130', '13', 'VIC'], '60077156': ['64130', '18', 'VIC'], '100057708': ['99999', '17', 'VIC', 'VIC'], '40024161': ['64108', '17', 'VIC', 'VIC']}

The dictionary is structured as follows: {Report_number: [Zipcode, offense type, victims]}, with one 'VIC' entry per victim.

I am very new to coding and am just learning about dictionaries. How can I sort the dictionary to organize the data into this format?

Zip Codes Crime totals
====================

Any help is greatly appreciated. My code is below. I am reading two files containing about 50,000 lines of data, so efficiency matters.

from collections import Counter

incidents_f =  open('incidents.csv', mode = "r")

crime_dict = dict()

for line in incidents_f:
    line_1st = line.strip().split(",")
    if line_1st[0].upper() != "REPORT_NO":
        report_no = line_1st[0]
        offense = line_1st[3]
        zip_code = line_1st[4]
        if len(zip_code) < 5:
            zip_code = "99999"

        if report_no in crime_dict:
            # list.append returns None, so the calls cannot be chained; use extend
            crime_dict[report_no].extend([zip_code, offense])
        else:
            crime_dict[report_no] = [zip_code]+[offense]

#close File (the parentheses are required to actually call close)
incidents_f.close()

details_f = open('details.csv',mode = 'r')
for line in details_f:
    line_1st = line.strip().split(",")
    if line_1st[0].upper() != "REPORT_NO":
        report_no = line_1st[0]
        involvement = line_1st[1]
        # Record one "VIC" per victim row, and only for reports seen in incidents.csv
        if involvement.upper() == 'VIC' and report_no in crime_dict:
            crime_dict[report_no].append("VIC")


#close File (the parentheses are required to actually call close)
details_f.close()



print(crime_dict)

Answer 1

This uses more code than @Alexander's solution, but here is one way to do it:

crime_dict ={
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'], 
    '20003319': ['64130', '13', 'VIC'], 
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'], 
    '40024161': ['64108', '17', 'VIC', 'VIC']
    }

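# Count one crime per report, grouped by zip code (the first list item)
crimes_by_zip = {}
for report_no, record in crime_dict.items():
    zip_code = record[0]  # named zip_code to avoid shadowing the built-in zip()
    if zip_code not in crimes_by_zip:
        crimes_by_zip[zip_code] = 0
    crimes_by_zip[zip_code] += 1

# Print the totals in ascending zip code order
for zip_code in sorted(crimes_by_zip):
    print(zip_code, crimes_by_zip[zip_code])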

64108 1
64130 3
99999 1
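
The same tally could also be sketched with collections.Counter, which the question already imports but never uses. This is only an illustration on the sample data from the question, assuming (as above) that each report counts as one crime and that the zip code is always the first item of each list:

```python
from collections import Counter

# Sample data copied from the question: {report_no: [zip_code, offense, 'VIC', ...]}
crime_dict = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC'],
}

# Counter tallies how often each zip code appears, one per report
crimes_by_zip = Counter(record[0] for record in crime_dict.values())

# Sorting the (zip_code, total) pairs orders the rows by zip code
for zip_code, total in sorted(crimes_by_zip.items()):
    print(zip_code, total)
```

Counter makes a single pass over the dictionary, so it should scale to the roughly 50,000 rows mentioned in the question without trouble.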

Answer 2

D = {'100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'], '20003319': ['64130', '13', 'VIC'], '60077156': ['64130', '18', 'VIC'], '100057708': ['99999', '17', 'VIC', 'VIC'], '40024161': ['64108', '17', 'VIC', 'VIC']}

# Pair each zip code with its report number, ordered by zip code
data_with_zip_duplicate = [(D[key][0], key) for key in sorted(D, key=lambda x: D[x][0])]
print(*data_with_zip_duplicate, sep="\n")
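
The list above sorts the reports by zip code but does not total anything. If the goal is totals per zip code and offense type together with victim counts, one possible sketch is shown below. The column layout is an assumption, since the question only shows the "Zip Codes" and "Crime totals" headers, and the name totals is introduced here purely for illustration:

```python
from collections import defaultdict

D = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC'],
}

# (zip_code, offense) -> [number of reports, number of victims]
totals = defaultdict(lambda: [0, 0])
for zip_code, offense, *involved in D.values():
    totals[(zip_code, offense)][0] += 1
    totals[(zip_code, offense)][1] += involved.count('VIC')

# Assumed table layout: one row per zip code and offense, sorted by both
print(f"{'Zip Code':<10}{'Offense':<10}{'Crimes':>8}{'Victims':>9}")
print("=" * 37)
for (zip_code, offense), (crimes, victims) in sorted(totals.items()):
    print(f"{zip_code:<10}{offense:<10}{crimes:>8}{victims:>9}")
```

Keying the dictionary on a (zip_code, offense) tuple means a plain sorted() call orders the rows by zip code first and offense type second.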

Python: How to sort and organize dictionary data

2 answers: