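I'm trying to solve a simple practice test problem:
Parse a CSV file to:
- Find only the rows where the user started before September 6th, 2010.
- Next, order the values from the "words" column in ascending order (by start date).
- Return the compiled "hidden" phrase.
The CSV file has 19 columns and 1000 rows of data. Most of it is irrelevant. As the problem states, we only care about sorting the start_date column in ascending order to get the associated word from the "words" column. Together, these words give the "hidden" phrase.
The dates in the source file are stored as UTC timestamps, so I have to convert them. I now think I'm selecting the correct rows, but I'm having trouble sorting the dates.
Here is my code: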
    import csv
    from collections import OrderedDict
    from datetime import datetime

    with open('TSE_sample_data.csv', 'rb') as csvIn:
        reader = csv.DictReader(csvIn)
        for row in reader:
            # convert from UTC to more standard date format
            startdt = datetime.fromtimestamp(int(row['start_date']))
            new_startdt = datetime.strftime(startdt, '%Y%m%d')
            # find dates before Sep 6th, 2010
            if new_startdt < '20100906':
                # add the values from the 'words' column to a list
                words = []
                words.append(row['words'])
                # add the dates to a list
                dates = []
                dates.append(new_startdt)
                # create an ordered dictionary to sort the dates... this is where I'm having issues
                dict1 = OrderedDict(zip(words, dates))
                print dict1
                #print list(dict1.items())[0][1]
                #dict2 = sorted([(y,x) for x,y in dict1.items()])
                #print dict2
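I expected a single ordered dictionary containing all the words and dates as its items. Instead, I get multiple ordered dictionaries, one for each key-value pair.
Answer 0: (score: 0)
Here is the corrected version: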
    import csv
    from collections import OrderedDict
    from datetime import datetime

    with open('TSE_sample_data.csv', 'rb') as csvIn:
        reader = csv.DictReader(csvIn)
        words = []
        dates = []
        for row in reader:
            # convert from UTC to more standard date format
            startdt = datetime.fromtimestamp(int(row['start_date']))
            new_startdt = datetime.strftime(startdt, '%Y%m%d')
            # find dates before Sep 6th, 2010
            if new_startdt < '20100906':
                # add the values from the 'words' column to a list
                words.append(row['words'])
                # add the dates to a list
                dates.append(new_startdt)
        # This is where I was going wrong! Had to move the lines below outside of the for loop.
        # Originally, because I was still inside the for loop, I was creating a new OrderedDict
        # for each "row in reader" that met my if condition.
        # By doing this outside of the for loop, I can build one OrderedDict holding all of the
        # (word, date) pairs that were found.
        # create an ordered dictionary to sort by the dates
        dict1 = OrderedDict(zip(words, dates))
        dict2 = sorted([(y, x) for x, y in dict1.items()])
        # print the hidden message
        for i in dict2:
            print i[1]
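For anyone on Python 3, here is a possible variant, offered as a sketch rather than the definitive solution. It assumes `start_date` holds Unix epoch seconds (which the `int(...)` and `fromtimestamp` calls above suggest) and that the cutoff is meant in UTC. The function name `hidden_phrase` and the in-memory sample rows are made up for illustration. It collects `(timestamp, word)` pairs in a list instead of an `OrderedDict` keyed by word, so a word that appears more than once is not collapsed into a single entry, and it sorts on the raw timestamp rather than the `%Y%m%d` string, so rows from the same day stay in time order.

```python
import csv
import io
from datetime import datetime, timezone

# Cutoff from the exercise; treating it as midnight UTC is an assumption.
CUTOFF = datetime(2010, 9, 6, tzinfo=timezone.utc)


def hidden_phrase(csv_file, cutoff=CUTOFF):
    """Return the 'words' values of rows that start before cutoff, ordered by start_date."""
    selected = []
    for row in csv.DictReader(csv_file):
        timestamp = int(row['start_date'])  # assumed: Unix epoch seconds
        started = datetime.fromtimestamp(timestamp, tz=timezone.utc)
        if started < cutoff:
            # A list of pairs keeps repeated words, unlike a dict keyed by word.
            selected.append((timestamp, row['words']))
    # Sort on the timestamp only; the sort is stable, so ties keep file order.
    selected.sort(key=lambda pair: pair[0])
    return [word for _, word in selected]


# Hypothetical sample standing in for TSE_sample_data.csv.
sample = io.StringIO(
    "start_date,words\n"
    "1283644800,world\n"    # 2010-09-05 00:00:00 UTC
    "1283731200,ignored\n"  # 2010-09-06 00:00:00 UTC, not before the cutoff
    "1262304000,hello\n"    # 2010-01-01 00:00:00 UTC
    "1270000000,hello\n"    # repeated word, still before the cutoff
)
for word in hidden_phrase(sample):
    print(word)
```

With the real file, the same function could be called as `hidden_phrase(open('TSE_sample_data.csv', newline=''))`; in Python 3 the `csv` module expects a text-mode file rather than `'rb'`.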