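I'm trying to scan a CSV file and adjust it row by row. At the end, I want to drop the last row. How can I remove the last row within the same scanning loop?
My code reads from the original file, makes the adjustments, and writes the result to a new file.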
import csv

raw_data = csv.reader(open("original_data.csv", "r"), delimiter=",")
output_data = csv.writer(open("final_data.csv", "w"), delimiter=",")
lastline = ...  # placeholder: integer index of last line
for i, row in enumerate(raw_data):
    if i == 10:
        # some operations
        output_data.writerow(row)
    elif i > 10 and i < lastline:
        # some operations
        output_data.writerow(row)
    elif i == lastline:
        output_data.writerow([])
    else:
        continue
Answer 0 (score: 4)
You can write a generator that yields every element except the last one:
def remove_last_element(iterable):
    iterator = iter(iterable)
    try:
        prev = next(iterator)
        while True:
            cur = next(iterator)
            yield prev
            prev = cur
    except StopIteration:
        return
Then you just wrap raw_data in it:
for i, row in enumerate(remove_last_element(raw_data)):
    pass  # your code
The last row will be skipped automatically.
The advantage of this approach is that the file is read only once.
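To make the wrapping step concrete, here is a minimal end-to-end sketch. It assumes in-memory `io.StringIO` buffers in place of the real files, and the sample rows are made up for illustration. The generator is the same idea as above, written with a `for` loop.

```python
import csv
import io


def remove_last_element(iterable):
    """Yield every item of iterable except the final one."""
    iterator = iter(iterable)
    try:
        prev = next(iterator)
    except StopIteration:
        return  # empty input: nothing to yield
    for cur in iterator:
        yield prev
        prev = cur


# In-memory stand-ins for original_data.csv and final_data.csv (assumed data).
source = io.StringIO("id,value\n1,a\n2,b\n3,c\n")
target = io.StringIO()

reader = csv.reader(source)
writer = csv.writer(target, lineterminator="\n")

# The loop body never sees the last row, so no index check is needed.
for i, row in enumerate(remove_last_element(reader)):
    writer.writerow(row)

result = target.getvalue()
```

Because the generator holds back only one row at a time, memory use stays constant regardless of file size.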
Answer 1 (score: 2)
A variation on @Kolmar's idea:
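def all_but_last(it):
    it = iter(it)  # accept any iterable, not only iterators
    try:
        buf = next(it)
    except StopIteration:
        return  # empty input yields nothing
    for item in it:
        yield buf
        buf = item

for line in all_but_last(...):
    pass  # process line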
Here is more general code that extends islice (the two-argument version) to accept a negative stop index:
import collections
import itertools

def islice2(it, stop):
    it = iter(it)  # both branches must consume the same iterator
    if stop >= 0:
        for x in itertools.islice(it, stop):
            yield x
    else:
        # Hold back the last -stop items in a sliding buffer.
        d = collections.deque(itertools.islice(it, -stop))
        for item in it:
            yield d.popleft()
            d.append(item)

for x in islice2(range(20), -5):
    print(x, end=" ")
# 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Answer 2 (score: 1)
You can iterate with a window of size 2 and write out only the first value of each window. As a result, the last element is skipped:
from itertools import tee

def pairwise(iterable):
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)  # zip is lazy in Python 3 (izip in Python 2)

for row, _ in pairwise(raw_data):
    output_data.writerow(row)
output_data.writerow([])
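On Python 3.10 and later, `itertools.pairwise` is part of the standard library, so the helper does not have to be written by hand. The sketch below falls back to the `tee`-based version on older interpreters. The `io.StringIO` source and its sample rows are assumptions for illustration only.

```python
import csv
import io

try:
    from itertools import pairwise  # available in Python 3.10+
except ImportError:
    from itertools import tee

    def pairwise(iterable):
        """Fallback: yield overlapping pairs (s0, s1), (s1, s2), ..."""
        a, b = tee(iterable)
        next(b, None)
        return zip(a, b)


# In-memory stand-in for the input file (assumed sample data).
source = io.StringIO("1,a\n2,b\n3,c\n")

# Each pair is (row, following row); the last row never appears first in a pair.
rows_kept = [row for row, _next_row in pairwise(csv.reader(source))]
```

Note that an input with a single row produces no pairs at all, so nothing is written in that case.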
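Answer 3 (score: 0)
One idea is to keep track of the length of each line as you iterate and, once you have reached the last line, truncate the file, thereby "shortening" it. I'm not sure whether this is good practice, though...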
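A rough sketch of the truncation idea, under some assumptions: instead of summing line lengths, it records the output file's position with `tell()` before each row is written, then seeks back and calls `truncate()`. The function name `copy_without_last_row` and the temporary sample files are hypothetical.

```python
import csv
import os
import tempfile


def copy_without_last_row(src_path, dst_path):
    """Copy CSV rows from src_path to dst_path, then cut off the last row."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        last_row_start = 0  # offset in dst where the most recent row begins
        for row in csv.reader(src):
            last_row_start = dst.tell()
            writer.writerow(row)
        # Rewind to the start of the final row and drop everything after it.
        dst.seek(last_row_start)
        dst.truncate()


# Usage with throwaway files (the sample data is made up for illustration).
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "original_data.csv")
    dst_path = os.path.join(tmp, "final_data.csv")
    with open(src_path, "w", newline="") as f:
        f.write("a,b\n1,2\n3,4\n")
    copy_without_last_row(src_path, dst_path)
    with open(dst_path, newline="") as f:
        remaining_rows = list(csv.reader(f))
```

This writes the last row and then discards it, so it does slightly more I/O than the generator-based answers, but it needs no look-ahead in the loop.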
Answer 4 (score: 0)
Instead of writing the current row on each loop iteration, try writing the previously read row:
import csv

raw_data = csv.reader(open("original_data.csv", "r"), delimiter=",")
output_data = csv.writer(open("final_data.csv", "w"), delimiter=",")
last_iter = (None, None)
try:
    last_iter = (0, next(raw_data))
except StopIteration:
    # The file is empty
    pass
else:
    for new_row in raw_data:
        i, row = last_iter
        last_iter = (i + 1, new_row)
        if i == 10:
            # some operations
            output_data.writerow(row)
        elif i > 10:
            # some operations
            output_data.writerow(row)
    # Here, the last row of the file is in the `last_iter` variable.
    # It won't get written into the output file.
    output_data.writerow([])