Python: How to sort and organize dictionary data

Time: 2015-12-02 07:11:28

Tags: python performance list csv dictionary

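I am trying to sort my crime totals by offense type, by zip code, and by number of victims. I built a dictionary keyed by report number. Here is a small sample of the output when I print the dictionary: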

{'100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'], '20003319': ['64130', '13', 'VIC'], '60077156': ['64130', '18', 'VIC'], '100057708': ['99999', '17', 'VIC', 'VIC'], '40024161': ['64108', '17', 'VIC', 'VIC']}

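The dictionary is structured as follows: {Report_number: [Zipcode, offense type, number of victims]}, where each victim appears as one 'VIC' entry.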

I am very new to coding and am just learning dictionaries. How can I sort the dictionary to organize the data into this format?

Zip Codes    Crime totals
=========================
   64126     809
   64127     3983
   64128     1749
   64129     1037
   64130     4718
   64131     2080
   64132     2060
   64133     2005
   64134     2928

Any help is greatly appreciated. My code is below. I am reading two files containing about 50,000 rows of data, so efficiency is very important.

from collections import Counter

incidents_f =  open('incidents.csv', mode = "r")

crime_dict = dict()

for line in incidents_f:
    line_1st = line.strip().split(",")
    if line_1st[0].upper() != "REPORT_NO":
        report_no = line_1st[0]
        offense = line_1st[3]
        zip_code = line_1st[4]
        if len(zip_code) < 5:
            zip_code = "99999"

        if report_no in crime_dict:
            # list.append returns None, so the calls cannot be chained; extend adds both items.
            crime_dict[report_no].extend([zip_code, offense])
        else:
            crime_dict[report_no] = [zip_code]+[offense]

#close File
incidents_f.close()

details_f = open('details.csv',mode = 'r')
for line in details_f:
    line_1st = line.strip().split(",")
    if line_1st[0].upper() != "REPORT_NO":
        report_no = line_1st[0]
        involvement = line_1st[1]
        # Record a victim only for 'VIC' rows whose report is already in the dictionary.
        if involvement.upper() == 'VIC' and report_no in crime_dict:
            crime_dict[report_no].append('VIC')


#close File
details_f.close()



print(crime_dict)

2 Answers:

Answer 0 (score: 1)

This takes more code than @Alexander's solution, but here is one way to do it:

crime_dict ={
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'], 
    '20003319': ['64130', '13', 'VIC'], 
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'], 
    '40024161': ['64108', '17', 'VIC', 'VIC']
    }

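# Count the number of reports for each zip code.
crimes_by_zip = {}
for report_no, record in crime_dict.items():
    zip_code = record[0]  # the zip code is the first item in each record
    if zip_code not in crimes_by_zip:
        crimes_by_zip[zip_code] = 0
    crimes_by_zip[zip_code] += 1

# Print the totals in ascending zip-code order.
for zip_code in sorted(crimes_by_zip):
    print(zip_code, crimes_by_zip[zip_code])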

64108 1
64130 3
99999 1
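
The same tally can probably be written more compactly with collections.Counter, which the question's code already imports. The following is only a sketch under a few assumptions: each report counts as one crime, the zip code is always the first item of a record, and the helper name format_table and the column widths are made up here to approximate the layout the question asks for.

```python
from collections import Counter

# Sample data copied from the question: {report_no: [zip_code, offense, 'VIC', ...]}
crime_dict = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC'],
}

# Tally one crime per report, keyed by the zip code in position 0 (assumption).
crimes_by_zip = Counter(record[0] for record in crime_dict.values())


def format_table(counts):
    """Return the totals as text, one zip code per line, in ascending order.

    Hypothetical helper: the header text and widths only mimic the question's layout.
    """
    lines = ["Zip Codes  Crime totals", "=" * 23]
    for zip_code in sorted(counts):
        lines.append(f"{zip_code:>8}  {counts[zip_code]}")
    return "\n".join(lines)


print(format_table(crimes_by_zip))
```

Counter makes a single pass over the dictionary, so it should cope comfortably with the roughly 50,000 rows mentioned in the question; sorting only touches the distinct zip codes.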

Answer 1 (score: 0)

D = {'100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'], '20003319': ['64130', '13', 'VIC'], '60077156': ['64130', '18', 'VIC'], '100057708': ['99999', '17', 'VIC', 'VIC'], '40024161': ['64108', '17', 'VIC', 'VIC']}

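# Sort the report numbers by their zip code (item 0 of each record),
# then pair each zip code with its report number. Zip codes repeat, one per report.
data_with_zip_duplicate = [(D[key][0], key) for key in sorted(D, key=lambda x: D[x][0])]
print(*data_with_zip_duplicate, sep="\n")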
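
Sorting alone leaves one line per report. If the goal is a breakdown by zip code and offense type with victim counts, as the question describes, a grouping step along the following lines might help. This is a sketch, not part of the original answer: it assumes every item after the first two in a record is one 'VIC' marker, and the name totals is chosen here for illustration.

```python
from collections import defaultdict

# Sample data copied from the question: {report_no: [zip_code, offense, 'VIC', ...]}
D = {
    '100065070': ['64130', '18', 'VIC', 'VIC', 'VIC'],
    '20003319': ['64130', '13', 'VIC'],
    '60077156': ['64130', '18', 'VIC'],
    '100057708': ['99999', '17', 'VIC', 'VIC'],
    '40024161': ['64108', '17', 'VIC', 'VIC'],
}

# totals[zip_code][offense] -> [number of reports, number of victims]
totals = defaultdict(lambda: defaultdict(lambda: [0, 0]))

for zip_code, offense, *victims in D.values():
    entry = totals[zip_code][offense]
    entry[0] += 1             # one more report for this zip code and offense
    entry[1] += len(victims)  # assumption: each remaining item is one 'VIC' marker

# Print in ascending order of zip code, then offense type.
for zip_code in sorted(totals):
    for offense in sorted(totals[zip_code]):
        reports, victim_count = totals[zip_code][offense]
        print(zip_code, offense, reports, victim_count)
```

For the sample data, zip code 64130 with offense 18 ends up with 2 reports and 4 victims. The nested defaultdict avoids explicit "is this key present yet" checks, at the cost of creating entries on first access.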