I'm trying to parse a csv file that is organized as follows:
<data1>,<data2>
asdf,<data3>
asdf,<data4>
asdf,<data5>
<data6>,<data7>
asdf,<data8>
<data1>,<data2>
asdf,<data3>
asdf,<data4>
asdf,<data5>
<data6>,<data7>
asdf,<data8>
<data1>,<data2>
asdf,<data3>
asdf,<data4>
asdf,<data5>
<data6>,<data7>
asdf,<data8>
etc.
I'm trying to produce an output .csv that looks like this:
<data1>,<data2>,<data3>,<data4>,<data6>,<data7>,<data8>
<data1>,<data2>,<data3>,<data4>,<data6>,<data7>,<data8>
etc.
Can anyone help me with this?
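Edit: figured it out, in case anyone is interested:
import csv

# Python 2: the csv module expects files opened in binary mode.
with open(r'C:\Temp\eqtest.csv', 'rb') as inf, open(r'C:\Temp\output.csv', 'wb') as outf:
    reader = csv.reader(inf)
    writer = csv.writer(outf)
    i = -1      # position of the current row inside its 6-row block
    line = []   # output row being assembled
    for row in reader:
        # debug output
        print(line)
        print(i)
        print(row)
        # skip separator rows that contain only a comma
        while row == ['', '']:
            row = next(reader)
        i += 1
        if i == 0 or i == 4:
            # keep both columns
            line.append(row[0])
            line.append(row[1])
        elif i == 2 or i == 3:
            # keep the second column only (the row at i == 1 is not copied)
            line.append(row[1])
        elif i == 5:
            # last row of the block: write the output row and start over
            line.append(row[1])
            i = -1
            writer.writerow(line)
            line = []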
Answer 0: (score: 1)
You can use csv.reader() as an iterable, and use next() or itertools.islice() to fetch additional rows:
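import csv
from itertools import islice

# Python 2: the csv module expects binary mode; on Python 3, open the files
# in text mode with newline='' instead.
with open('input.csv', 'rb') as inf, open('output.csv', 'wb') as outf:
    reader = csv.reader(inf)
    writer = csv.writer(outf)
    for row in reader:
        if not row:
            # skip empty rows
            continue
        result = row
        # next 3 rows: keep the second column only
        for extra_row in islice(reader, 3):
            result.append(extra_row[1])
        result.extend(next(reader))      # 5th row: both columns
        result.append(next(reader)[1])   # 6th row: second column only
        writer.writerow(result)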
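This takes one row from the reader and uses all of its columns as the start of the output row. It then pulls 3 more rows from the same CSV reader and adds the second column of each to the output row. Using next(), two more rows are read; the whole of the first and only the second column of the other are added to the output.
Any empty rows before each block of 6 rows are skipped.
The output row is then written and the next iteration of the for loop can start; at that point 6 actual rows have been read, and the loop takes the 7th row from the input file. If that row is empty, the loop skips it and keeps advancing until it finds a non-empty row.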
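The steps described above can also be sketched for Python 3 as a small generator. This is only an illustration under a few assumptions: merge_records is a hypothetical helper name, a row counts as empty when it is [] or has only empty fields (such as ['', '']), and a truncated block at the end of the input is silently dropped.

```python
import csv
import io
from itertools import islice


def merge_records(rows):
    """Yield one merged row per 6-row block (hypothetical helper)."""
    rows = iter(rows)
    for first in rows:
        # Skip separators: [] for a blank line, ['', ''] for a lone comma.
        if not any(first):
            continue
        # Pull the remaining 5 rows of the block from the same iterator.
        block = [first] + list(islice(rows, 5))
        if len(block) < 6:
            break  # assumption: ignore an incomplete trailing block
        merged = list(block[0])                  # first row: both columns
        merged += [r[1] for r in block[1:4]]     # rows 2-4: second column
        merged += block[4]                       # row 5: both columns
        merged.append(block[5][1])               # row 6: second column
        yield merged


sample = (
    "<data1>,<data2>\nasdf,<data3>\nasdf,<data4>\nasdf,<data5>\n"
    "<data6>,<data7>\nasdf,<data8>\n"
    "\n,\n"
    "<data1>,<data2>\nasdf,<data3>\nasdf,<data4>\nasdf,<data5>\n"
    "<data6>,<data7>\nasdf,<data8>\n"
)
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(
    merge_records(csv.reader(io.StringIO(sample)))
)
print(out.getvalue(), end="")
```

When reading from and writing to real files on Python 3, open both in text mode with newline='' rather than 'rb' / 'wb'.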
Demo:
>>> import csv
>>> import sys
>>> from itertools import islice
>>> sample = '''\
... <data1>,<data2>
... asdf,<data3>
... asdf,<data4>
... asdf,<data5>
... <data6>,<data7>
... asdf,<data8>
...
... <data1>,<data2>
... asdf,<data3>
... asdf,<data4>
... asdf,<data5>
... <data6>,<data7>
... asdf,<data8>
...
... <data1>,<data2>
... asdf,<data3>
... asdf,<data4>
... asdf,<data5>
... <data6>,<data7>
... asdf,<data8>
... '''.splitlines()
>>> reader = csv.reader(sample)
>>> writer = csv.writer(sys.stdout)
>>> for row in reader:
...     if not row:
...         # skip empty rows
...         continue
...     result = row
...     for extra_row in islice(reader, 3):
...         result.append(extra_row[1])
...     result.extend(next(reader))
...     result.append(next(reader)[1])
...     writer.writerow(result)
...
<data1>,<data2>,<data3>,<data4>,<data5>,<data6>,<data7>,<data8>
<data1>,<data2>,<data3>,<data4>,<data5>,<data6>,<data7>,<data8>
<data1>,<data2>,<data3>,<data4>,<data5>,<data6>,<data7>,<data8>
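Note that the demo output contains <data5>, whereas the expected output shown in the question does not. If leaving it out is intended (an assumption; it may simply be a typo in the question), one way to make the column choice explicit is a small layout table. The names KEEP and merge_selected below are hypothetical, and the sketch targets Python 3.

```python
import csv
import io
from itertools import islice

# Hypothetical layout table: for each of the 6 rows in a block, the column
# indexes to copy to the output. An empty tuple drops that row entirely.
KEEP = [(0, 1), (1,), (1,), (), (0, 1), (1,)]   # () leaves out <data5>


def merge_selected(rows, keep=KEEP):
    """Yield one output row per block, copying only the listed columns."""
    rows = iter(rows)
    for first in rows:
        # Skip separators: [] for a blank line, ['', ''] for a lone comma.
        if not any(first):
            continue
        block = [first] + list(islice(rows, len(keep) - 1))
        if len(block) < len(keep):
            break  # assumption: ignore an incomplete trailing block
        yield [row[i] for row, cols in zip(block, keep) for i in cols]


sample = (
    "<data1>,<data2>\nasdf,<data3>\nasdf,<data4>\nasdf,<data5>\n"
    "<data6>,<data7>\nasdf,<data8>\n"
)
for merged in merge_selected(csv.reader(io.StringIO(sample))):
    print(",".join(merged))
```

Changing the fourth entry of KEEP back to (1,) would reproduce the 8-column rows from the demo.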