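I am still very new to Python, so please forgive me if this turns out to be a simple fix or a silly mistake. In the code below I am trying to parse data from a CSV file. Specifically, I want to find the users created between two dates and print them in ascending order of creation date. The date column, row[1], is in Unix time. A word column, row[8], should be printed as well: the goal is that, with the rows in ascending date order, the words printed from row[8] form a specific phrase. The problem is that when I run the code as it stands in PyCharm, I get IndexError: list index out of range on line 15, creation_date = date.fromtimestamp(int(row[1])). I know pandas handles CSV files better, but I am trying to avoid learning pandas just for this one task.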
import csv
from datetime import datetime, date
import sys
start_date = date(2014, 6, 22)
end_date = date(2014, 7, 22)
# Read csv data into memory filtering rows by the date in column 2 (row[1]).
csv_data = []
with open('sample.csv', newline='') as f:
    reader = csv.reader(f, delimiter='\t')
    header = next(reader)
    csv_data.append(header)
    for row in reader:
        creation_date = date.fromtimestamp(int(row[1]))
        if start_date <= creation_date <= end_date:
            csv_data.append(row)
if csv_data:  # Anything found?
    # Print the results in ascending date order.
    print(" ".join(csv_data[0]))
    # Converting the timestamp to int may not be necessary (but doesn't hurt)
    for row in sorted(csv_data[1:], key=lambda r: int(r[1])):
        print(" ".join(row))
Answer 0 (score: 1)
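The value you are trying to access does not seem to exist in that row (the row apparently contains only one value).
You can wrap the failing line in try/except and print the rows that fail: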
for row in reader:
    try:
        creation_date = date.fromtimestamp(int(row[1]))
    except IndexError:
        print("Cannot get value for row: {}".format(row))
        continue
    if start_date <= creation_date <= end_date:
        csv_data.append(row)
This should give you a first idea of why it crashes here (maybe your data is not tab-delimited?).
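To check the delimiter guess without opening the file by hand, here is a minimal sketch. It assumes a small made-up sample (SAMPLE and its column names are hypothetical, not the asker's data) and uses csv.Sniffer from the standard library to guess the delimiter. It also shows why the wrong delimiter leads to the IndexError: each row comes back as a single field, so row[1] does not exist.

```python
import csv
import io

# Hypothetical sample standing in for the real file; column names are made up.
SAMPLE = "id,created,word\n1,1403913600,hello\n2,1404518400,world\n"


def detect_delimiter(text, candidates=",\t;"):
    """Guess the delimiter of CSV text with csv.Sniffer (a heuristic, not a guarantee)."""
    return csv.Sniffer().sniff(text, delimiters=candidates).delimiter


# Reading comma-separated text as tab-separated yields one field per row,
# so indexing row[1] raises IndexError.
wrong_rows = list(csv.reader(io.StringIO(SAMPLE), delimiter="\t"))

# Reading with the detected delimiter splits each row into its three fields.
right_rows = list(csv.reader(io.StringIO(SAMPLE), delimiter=detect_delimiter(SAMPLE)))

print(len(wrong_rows[1]))  # 1
print(len(right_rows[1]))  # 3
```

For a real file you would pass the first few kilobytes, for example f.read(2048), to detect_delimiter and then call f.seek(0) before creating the reader. The Sniffer can guess wrong on unusual data, so treat its result as a hint.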
Answer 1 (score: 0)
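The CSV you shared is separated by commas (,), so when you write
reader = csv.reader(f, delimiter='\t')  # returns a single column
each row is read as a single column. You should replace it with
reader = csv.reader(f, delimiter=',')
Full code: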
import csv
from datetime import datetime, date
import sys
start_date = date(2014, 6, 22)
end_date = date(2014, 7, 22)
# Read csv data into memory filtering rows by the date in column 2 (row[1]).
csv_data = []
with open('sample_data.csv', newline='') as f:
    # The file is comma-separated, so use ',' instead of '\t'.
    reader = csv.reader(f, delimiter=',')
    header = next(reader)
    csv_data.append(header)
    for row in reader:
        creation_date = date.fromtimestamp(int(row[1]))
        if start_date <= creation_date <= end_date:
            csv_data.append(row)
if csv_data:  # Anything found?
    # Print the results in ascending date order.
    print(" ".join(csv_data[0]))
    # Converting the timestamp to int may not be necessary (but doesn't hurt)
    for row in sorted(csv_data[1:], key=lambda r: int(r[1])):
        print(" ".join(row))
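If you want something easier to test, the same filter-and-sort logic could be wrapped in a function. This is only a sketch under a few assumptions: the name rows_created_between is hypothetical, the sample data is made up (the word sits in column 3 here, whereas the question uses row[8]), short rows are skipped instead of raising, and timestamps are interpreted as UTC so the result does not depend on the machine's timezone. The code above uses date.fromtimestamp, which uses local time.

```python
import csv
import io
from datetime import date, datetime, timezone


def rows_created_between(lines, start, end, delimiter=","):
    """Return (header, rows) where rows have a Unix timestamp in column 2
    that falls within [start, end], sorted by ascending timestamp."""
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader)
    selected = []
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed rows instead of raising IndexError
        # Assumption: timestamps are treated as UTC for reproducible results.
        created = datetime.fromtimestamp(int(row[1]), tz=timezone.utc).date()
        if start <= created <= end:
            selected.append(row)
    selected.sort(key=lambda r: int(r[1]))
    return header, selected


# Hypothetical data; the real file has the word in row[8], here it is row[2].
sample = io.StringIO(
    "id,created,word\n"
    "1,1404518400,world\n"    # 2014-07-05 UTC
    "2,1403913600,hello\n"    # 2014-06-28 UTC
    "3,1388534400,skipped\n"  # 2014-01-01 UTC, outside the range
)
header, rows = rows_created_between(sample, date(2014, 6, 22), date(2014, 7, 22))
print(" ".join(r[2] for r in rows))  # hello world
```

With a real file you would pass the open file object in place of the StringIO, for example inside with open('sample_data.csv', newline='') as f.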