Ordered dictionaries and sorting

Date: 2016-09-27 21:10:11

Tags: python sorting for-loop dictionary ordereddictionary

I'm trying to solve a simple practice test problem:

  

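Parse the CSV file to:

  • Find only the rows where the user started before September 6, 2010.
  • Next, sort the values from the "words" column in ascending order (by start date).
  • Return the compiled "hidden" phrase.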

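The CSV file has 19 columns and 1,000 rows of data, most of which are irrelevant. As the problem states, we only care about sorting the start_date column in ascending order to get the associated word from the "words" column. Together, these words form the "hidden" phrase.

The dates in the source file are stored as UTC timestamps, so I have to convert them. I now think I'm selecting the correct rows, but I'm having trouble sorting the dates.

Here is my code: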
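For reference, the conversion and cutoff check can be sketched on their own. This is a minimal Python 3 sketch, assuming start_date holds Unix timestamps in seconds (which is what the `int(row['start_date'])` call in the code suggests). The helper name `started_before_cutoff` and the sample values are made up for illustration and do not come from TSE_sample_data.csv.

```python
from datetime import datetime, timezone

# Cutoff: midnight UTC on 6 September 2010 (the date named in the exercise).
CUTOFF = datetime(2010, 9, 6, tzinfo=timezone.utc)


def started_before_cutoff(raw_timestamp):
    """Return True if a Unix timestamp (seconds, as text or int) falls before CUTOFF."""
    # Passing tz=timezone.utc keeps the result independent of the machine's local time zone.
    started = datetime.fromtimestamp(int(raw_timestamp), tz=timezone.utc)
    return started < CUTOFF


# Hypothetical sample values, not taken from the real file.
print(started_before_cutoff("1283731199"))  # one second before the cutoff
print(started_before_cutoff("1283731200"))  # exactly at the cutoff
```

Comparing timezone-aware datetime objects avoids depending on the local time zone, which `datetime.fromtimestamp` uses when no `tz` argument is given.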


import csv
from collections import OrderedDict
from datetime import datetime

with open('TSE_sample_data.csv', 'rb') as csvIn:
    reader = csv.DictReader(csvIn)
    for row in reader:

        #convert from UTC to more standard date format
        startdt = datetime.fromtimestamp(int(row['start_date']))
        new_startdt = datetime.strftime(startdt, '%Y%m%d')

        # find dates before Sep 6th, 2010
        if new_startdt < '20100906':

            # add the values from the 'words' column to a list
            words = []
            words.append(row['words'])
            # add the dates to a list
            dates = []
            dates.append(new_startdt)

            # create an ordered dictionary to sort the dates... this is where I'm having issues
            dict1 = OrderedDict(zip(words, dates))
            print dict1
            #print list(dict1.items())[0][1]
            #dict2 = sorted([(y,x) for x,y in dict1.items()])
            #print dict2

I expected a single ordered dictionary containing all the words and dates as items. Instead, I get multiple ordered dictionaries, one for each key-value pair.
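The symptom can be reproduced without the CSV file. The following is a minimal Python 3 sketch using made-up (word, date) pairs rather than real rows. It contrasts lists that are created inside the loop body with lists that are created once before the loop.

```python
from collections import OrderedDict

# Hypothetical (word, date) pairs standing in for rows that passed the date filter.
rows = [("world", "20100105"), ("hello", "20100101")]

# Lists created inside the loop: each iteration starts from scratch,
# so every OrderedDict built here holds exactly one pair.
inside_loop = []
for word, date in rows:
    words = [word]
    dates = [date]
    inside_loop.append(OrderedDict(zip(words, dates)))

# Lists created once before the loop: the pairs accumulate,
# and a single OrderedDict is built after the loop finishes.
words, dates = [], []
for word, date in rows:
    words.append(word)
    dates.append(date)
single = OrderedDict(zip(words, dates))

print(len(inside_loop), [len(d) for d in inside_loop])
print(list(single.items()))
```

The first pattern yields one single-item dictionary per row, which matches the output described here. The second yields one dictionary holding every pair.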

1 Answer:

Answer 0 (score: 0)

Here is the corrected version:

import csv
from collections import OrderedDict
from datetime import datetime


with open('TSE_sample_data.csv', 'rb') as csvIn:
    reader = csv.DictReader(csvIn)
    words = []
    dates = []
    for row in reader:

        # convert the Unix timestamp to a sortable YYYYMMDD string
        startdt = datetime.fromtimestamp(int(row['start_date']))
        new_startdt = datetime.strftime(startdt, '%Y%m%d')        

        # find dates before Sep 6th, 2010
        if new_startdt < '20100906':

            # add the values from the 'words' column to a list 
            words.append(row['words'])
            # add the dates to a list
            dates.append(new_startdt)

    # This is where I was going wrong: the lines below had to move outside the for loop.
    # Inside the loop, a new OrderedDict was created for every row that met the if condition.
    # Outside the loop, a single OrderedDict is built from all the collected (word, date) pairs.
    # create an ordered dictionary, then sort its items by date
    dict1 = OrderedDict(zip(words, dates))
    dict2 = sorted([(y,x) for x,y in dict1.items()])

    # print the hidden message
    for i in dict2: 
        print i[1]
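
A few caveats about the version above may be worth keeping in mind. Because the OrderedDict is keyed on the word, a word that appears in more than one row is stored only once. Dates are compared at day resolution, so rows sharing a day end up ordered alphabetically by word. And `datetime.fromtimestamp` without a `tz` argument converts to local time. The following is a Python 3 sketch of one possible alternative, not the original poster's method: it sorts on the raw timestamp and skips the dictionary. The function name `hidden_phrase`, the inline sample data, and joining the words with spaces are all assumptions made for illustration.

```python
import csv
import io
from datetime import datetime, timezone

# Midnight UTC on 6 September 2010, the cutoff named in the exercise.
CUTOFF = datetime(2010, 9, 6, tzinfo=timezone.utc)

# Hypothetical stand-in for TSE_sample_data.csv, reduced to the two relevant columns.
SAMPLE_CSV = """start_date,words
1283731200,ignored
1262390400,world
1262304000,hello
1262476800,world
"""


def hidden_phrase(csv_file):
    """Join the 'words' of rows that start before CUTOFF, ordered by start_date."""
    selected = []  # created once, before the loop, so the pairs accumulate
    for row in csv.DictReader(csv_file):
        timestamp = int(row["start_date"])
        started = datetime.fromtimestamp(timestamp, tz=timezone.utc)
        if started < CUTOFF:
            selected.append((timestamp, row["words"]))
    # Sort on the timestamp only; repeated words are kept because no dict is involved.
    selected.sort(key=lambda pair: pair[0])
    return " ".join(word for _, word in selected)


print(hidden_phrase(io.StringIO(SAMPLE_CSV)))
```

To run it against the real file, something like `with open('TSE_sample_data.csv', newline='') as f: print(hidden_phrase(f))` should work, assuming the column names match those used above.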