I have a .csv file containing this data:
equipement,"144444444"
Date,"Time","measure"
16/09/2016,"07:15:00","16.47777"
16/09/2016,"07:30:00","15.44454"
16/09/2016,"07:45:00","16.21114"
I am running Python code on this file, and my goal is to get something like this as output:
"measure","20160916071500","16.47777"
"measure","20160916073000","15.44454"
"measure","20160916074500","16.21114"
Here is my code:
import csv
import sys
import os
import re
import fnmatch
import csv
from dateutil.parser import parse as parseDate
from datetime import datetime, time, timedelta
file = open("myfile.csv", 'rt')
reader = csv.reader(file)
next(reader)
rows = list(reader)
firstline = rows[0]
header = firstline[2]
print header
for row in rows:
    next(reader)
    print rows[0]
    if "".join(row).strip() != "":
        chaine = str(row[0]+row[1])
        #print chaine
        date = chaine[:10] + " " + chaine[11:]
        #print date
        index = parseDate(date)
        index = str(index).replace('-','')
        index = str(index).replace(':','')
        index = str(index).replace(' ','')
        data = row[2]
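My problem is that I need to call next(reader) to skip the first and second lines of the file, because they don't contain any dates. But I get this error: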
Traceback (most recent call last):
  File "t.py", line 19, in <module>
    next(reader)
StopIteration
Any ideas?
Answer 0: (score: 3)
By doing rows = list(reader), you have already exhausted reader and collected the results into a list named rows. Calling next(reader) again then raises StopIteration.
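The exhaustion described above can be reproduced in isolation. This is a minimal sketch that uses an in-memory io.StringIO as a stand-in for myfile.csv (the sample text and the names sample, exhausted and leftover are illustrative assumptions, not part of the original code):

```python
import csv
import io

# Hypothetical in-memory stand-in for myfile.csv.
sample = io.StringIO(
    'equipement,"144444444"\n'
    'Date,"Time","measure"\n'
    '16/09/2016,"07:15:00","16.47777"\n'
)

reader = csv.reader(sample)
rows = list(reader)  # consumes every row the reader can produce

try:
    next(reader)  # nothing is left, so this raises StopIteration
    exhausted = False
except StopIteration:
    exhausted = True

# Passing a default to next() returns it instead of raising.
leftover = next(reader, None)

print(len(rows), exhausted, leftover)
```

Once the rows are in a list, skipping lines is a matter of slicing the list (for example rows[2:]) rather than calling next(reader) again.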
However, creating the rows list isn't necessary. You can iterate over reader directly with a for loop:
reader = csv.reader(file)
next(reader) # skip first line
secondline = next(reader) # capture second line
header = secondline[2]
for row in reader: # iterate from third line to the end
    # next(reader) <-- don't do this, the for loop already does it for you
    if "".join(row).strip() != "":
        # ... your code processing row ...
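For reference, here is one way the placeholder could be filled in to produce the output requested in the question. It is a sketch under a few assumptions: Python 3, dates always in day/month/year form, and datetime.strptime swapped in for the dateutil parser used in the question. The function name convert and the in-memory sample are hypothetical.

```python
import csv
import io
from datetime import datetime


def convert(lines):
    """Return output rows as [header, 'YYYYmmddHHMMSS', value] lists."""
    reader = csv.reader(lines)
    next(reader)               # skip the first line (equipment id)
    header = next(reader)[2]   # second line holds the column names
    out = []
    for row in reader:         # the for loop advances the reader itself
        if not "".join(row).strip():
            continue           # ignore blank lines
        # Assumed input format: dd/mm/YYYY and HH:MM:SS
        stamp = datetime.strptime(row[0] + " " + row[1], "%d/%m/%Y %H:%M:%S")
        out.append([header, stamp.strftime("%Y%m%d%H%M%S"), row[2]])
    return out


# Hypothetical in-memory stand-in for myfile.csv.
sample = io.StringIO(
    'equipement,"144444444"\n'
    'Date,"Time","measure"\n'
    '16/09/2016,"07:15:00","16.47777"\n'
    '16/09/2016,"07:30:00","15.44454"\n'
    '16/09/2016,"07:45:00","16.21114"\n'
)

buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
writer.writerows(convert(sample))
result = buffer.getvalue()
print(result, end="")
```

With a real file, open("myfile.csv", newline="") can be passed to convert in place of the StringIO object.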
Answer 1: (score: 3)
If you like, you can solve it with pandas:
import pandas as pd
df = pd.read_csv('in.csv', skiprows=2, header=None, parse_dates=[[0,1]])
df['dt']=df["0_1"].apply(lambda x: x.strftime('%Y%m%d%H%M%S'))
df['mes'] = pd.Series(["measure"]*len(df), index=df.index)
df[['mes','dt',2]].to_csv('out.csv', quoting=True, index=None,header=None)
Output CSV file:
"measure","20160916071500","16.47777"
"measure","20160916073000","15.44454"
"measure","20160916074500","16.21114"
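Every field in this output is wrapped in double quotes. In the standard csv module that quoting style is named csv.QUOTE_ALL, whose integer value is 1, and the pandas documentation describes the quoting parameter of to_csv as a constant from the csv module. The sketch below shows the same style with the standard library alone; the names buffer and quoted_line are illustrative.

```python
import csv
import io

buffer = io.StringIO()
# QUOTE_ALL wraps every field in quotes, whatever its content.
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
writer.writerow(["measure", "20160916071500", "16.47777"])

quoted_line = buffer.getvalue()
print(quoted_line, end="")
```

Passing the named constant instead of a bare number or boolean makes the intent of the call easier to read.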
Answer 2: (score: 0)
You can get the same desired output using just two for loops and some string replacement, as in this example (I assume your input file is called in.csv):
data = list(k.strip("\n") for k in open("in.csv", 'r'))
mesure = data[1].split(",")[2]
m = list(k.replace('"', "").split(",") for k in data[2:])
final, d =[], ""
for k in m:
    for j in k[:-1]:
        if "/" in j:
            d = '"%s' % "".join(j.split("/")[::-1])
        if ":" in j:
            d += '%s"' % "".join(j.split(":"))
    final.append(",".join([mesure, d,'"%s"' % k[-1:][0]]))
for k in final:
    print(k)
Output:
"measure","20160916071500","16.47777"
"measure","20160916073000","15.44454"
"measure","20160916074500","16.21114"