I have a very large single-column CSV file that I want to split into small chunks:
1
2
3
4
5
6
7
8
9
10
so that the output CSV looks like this:
1 | 3 | 6 | 8 |
2 | 4 | 7 | 9 |
* | 5 | * | 10 |
where * means that there is no number at that position in the column.
Can anyone help me with this? Thanks.
Answer 0 (score: 1)
An ugly but simple solution:
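import csv
from itertools import izip  # Python 2 only; Python 3 uses the built-in zip

def split_2_3(filename):
    with open(filename) as f:
        it = (line.strip() for line in f)
        while True:
            # A column of two values padded with "*", then a column of three.
            # When next() runs out of lines, StopIteration ends the generator
            # (Python 2 behaviour); an unfinished column is dropped.
            yield next(it), next(it), "*"
            yield next(it), next(it), next(it)

with open("output.csv", "w") as f:
    output = csv.writer(f, delimiter="|")
    # izip(*columns) transposes the 3-item columns into three rows.
    output.writerows(izip(*split_2_3("input.txt")))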
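The snippet above is Python 2. As a minimal sketch of the same idea for Python 3 (not the answerer's code): since Python 3.7 a StopIteration escaping a generator body is turned into a RuntimeError (PEP 479), so this version slices the input instead of calling next() repeatedly. The names columns_2_3 and write_columns are hypothetical, and padding an unfinished last column with the filler is an assumption about the desired behaviour.

```python
import csv
from itertools import cycle, islice


def columns_2_3(values, filler="*"):
    """Yield 3-item columns holding alternately 2 and 3 values from `values`."""
    it = iter(values)
    for size in cycle((2, 3)):
        chunk = list(islice(it, size))
        if not chunk:
            return  # input exhausted
        # Pad to 3 items: a 2-value column gets one filler,
        # an unfinished last column gets as many as it needs.
        yield tuple(chunk + [filler] * (3 - len(chunk)))


def write_columns(in_path, out_path):
    """Read one value per line and write the transposed columns as 3 rows."""
    with open(in_path) as src:
        values = [line.strip() for line in src]
    with open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst, delimiter="|")
        # zip(*columns) turns the list of columns into three rows.
        writer.writerows(zip(*columns_2_3(values)))
```

Calling write_columns("input.txt", "output.csv") would then produce the three rows; the file names are taken from the answer above.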
Answer 1 (score: 1)
A solution that works for any number of lines in the input file:
import csv

def split_2_3(filename, cnt=0):
    dd = {0: [], 1: [], 2: []}  # the three output rows
    ecr = []                    # trace messages, printed at the end
    with open(filename) as f:
        for i, line in enumerate(f):
            ecr.append('\nline %d : %r\n' % (i, line))
            # i + cnt is the slot index once the '*' fillers are counted;
            # every slot with (i + cnt) % 6 == 2 must hold a filler.
            if (i+cnt-2) % 6:
                ecr.append('%r put in dd[%d]\n'
                           % (line.strip(), (i+cnt) % 3))
                dd[(i+cnt) % 3].append(line.strip())
            else:
                ecr.append("'*' put in dd[%d]"
                           % ((i+cnt) % 3))
                dd[(i+cnt) % 3].append('*')
                cnt += 1
                ecr.append(' and %r put in dd[%d]\n'
                           % (line.strip(), (i+cnt) % 3))
                dd[(i+cnt) % 3].append(line.strip())
    print ''.join(ecr)  # Python 2 print statement
    yield dd[0]
    yield dd[1]
    yield dd[2]

with open("output.csv", "wb") as f:  # "wb" is what the Python 2 csv module expects
    output = csv.writer(f, delimiter="|")
    output.writerows(split_2_3("input.txt"))
For example, with an input file of 13 lines (the trace also shows a final empty line, numbered 13):
line 0 : '1\n'
'1' put in dd[0]
line 1 : '2\n'
'2' put in dd[1]
line 2 : '3\n'
'*' put in dd[2] and '3' put in dd[0]
line 3 : '4\n'
'4' put in dd[1]
line 4 : '5\n'
'5' put in dd[2]
line 5 : '6\n'
'6' put in dd[0]
line 6 : '7\n'
'7' put in dd[1]
line 7 : '8\n'
'*' put in dd[2] and '8' put in dd[0]
line 8 : '9\n'
'9' put in dd[1]
line 9 : '10\n'
'10' put in dd[2]
line 10 : '11\n'
'11' put in dd[0]
line 11 : '12\n'
'12' put in dd[1]
line 12 : '13\n'
'*' put in dd[2] and '13' put in dd[0]
line 13 : '\n'
'' put in dd[1]
and the result in the output CSV file:
1|3|6|8|11|13
2|4|7|9|12|
*|5|*|10|*
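The counter arithmetic in the code above can also be written as a direct mapping from a line index to a (row, column) position. This is only a sketch of that reading, in Python 3, not the answerer's code: slot and build_rows are hypothetical names, and unlike the code above it pads the last column with fillers. Every 5 input values fill 2 columns (6 slots), with one filler after the first 2 values of each group.

```python
def slot(i):
    """Return (row, column) for the i-th input value (0-based)."""
    group, pos = divmod(i, 5)  # 5 values per pair of columns
    # Fillers placed before this value: one per finished group,
    # plus one more if we are already past the 2-value column.
    fillers = group + (1 if pos >= 2 else 0)
    s = i + fillers            # index in the padded sequence
    return s % 3, s // 3


def build_rows(values, filler="*"):
    """Build the three output rows, using `filler` for empty cells."""
    if not values:
        return [[], [], []]
    ncols = slot(len(values) - 1)[1] + 1
    rows = [[filler] * ncols for _ in range(3)]
    for i, value in enumerate(values):
        row, col = slot(i)
        rows[row][col] = value
    return rows


rows = build_rows([str(n) for n in range(1, 14)])
print("\n".join("|".join(r) for r in rows))
# 1|3|6|8|11|13
# 2|4|7|9|12|*
# *|5|*|10|*|*
```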
In response to what the asker pointed out (note that I added ljust(5) to the code so that the contents of output.csv display more clearly):
import csv

def split_2_3(filename, cnt=0):
    dd = {0: [], 1: [], 2: []}  # the three output rows
    ecr = []                    # trace messages, printed at the end
    with open(filename) as f:
        for i, line in enumerate(f):
            ecr.append('\nline %d : %r\n' % (i, line))
            if (i+cnt-2) % 6:
                ecr.append('%r put in dd[%d]\n'
                           % (line.strip(), (i+cnt) % 3))
                dd[(i+cnt) % 3].append(line.strip().ljust(5))
            else:
                ecr.append("'*' put in dd[%d]"
                           % ((i+cnt) % 3))
                dd[(i+cnt) % 3].append('*'.ljust(5))
                cnt += 1
                ecr.append(' and %r put in dd[%d]\n'
                           % (line.strip(), (i+cnt) % 3))
                dd[(i+cnt) % 3].append(line.strip().ljust(5))
    # Fill the rest of the last column with '*' so all rows have equal length.
    while (i+cnt) % 3 != 2:
        i += 1
        dd[(i+cnt) % 3].append('*'.ljust(5))
    print ''.join(ecr)  # Python 2 print statement
    yield dd[0]
    yield dd[1]
    yield dd[2]

with open("output.csv", "wb") as f:  # "wb" is what the Python 2 csv module expects
    output = csv.writer(f, delimiter="|")
    output.writerows(split_2_3("input.txt"))
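As a small illustration of the padding mentioned above: str.ljust(5) pads a string with spaces on the right up to 5 characters, while str.rjust(5) pads on the left. The helper name pad_row is hypothetical and only serves this sketch.

```python
def pad_row(values, width=5):
    """Join values with '|' after left-justifying each one to `width` characters."""
    return "|".join(v.ljust(width) for v in values)


print(repr(pad_row(["1", "3", "11"])))  # '1    |3    |11   '
print(repr("1".rjust(5)))               # '    1'
```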
When the file input.txt contains:
'1\r\n2\r\n3\r\n4\r\n5\r\n6\r\n7\r\n8\r\n9\r\n10\r\n11'
the output.csv created is:
1    |3    |6    |8    |11
2    |4    |7    |9    |*
*    |5    |*    |10   |*
When the file input.txt contains:
'1\r\n2\r\n3\r\n4\r\n5\r\n6\r\n7\r\n8\r\n9\r\n10\r\n11\r\n'
the resulting output.csv is the same.
When the file input.txt contains:
'1\r\n2\r\n3\r\n4\r\n5\r\n6\r\n7\r\n8\r\n9\r\n10\r\n11\r\n\r\n'
the result is:
1    |3    |6    |8    |11
2    |4    |7    |9    |
*    |5    |*    |10   |*
When the file input.txt contains:
'1\r\n2\r\n3\r\n4\r\n5\r\n6\r\n7\r\n8\r\n9\r\n10\r\n11\r\n\r\n\r\n'
output.csv becomes:
1    |3    |6    |8    |11   |
2    |4    |7    |9    |     |*
*    |5    |*    |10   |*    |*
If a line contains blank spaces instead of just '\r\n', the result looks the same, but the values recorded in output.csv will be blanks.
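The differences between the cases above come down to how the input is split into lines before strip() is applied. The following Python 3 sketch uses io.StringIO in place of a real file, and stripped_lines is a hypothetical helper. It shows that a single trailing '\r\n' adds no line, while each extra '\r\n' adds one empty value.

```python
import io


def stripped_lines(text):
    """Iterate over `text` line by line, as a file would, and strip each line."""
    return [line.strip() for line in io.StringIO(text)]


print(stripped_lines("10\r\n11"))              # ['10', '11']
print(stripped_lines("10\r\n11\r\n"))          # ['10', '11']
print(stripped_lines("10\r\n11\r\n\r\n"))      # ['10', '11', '']
print(stripped_lines("10\r\n11\r\n   \r\n"))   # ['10', '11', '']
```

With this behaviour, skipping empty values before building the columns (for example with a filter on the stripped line) would be one way to ignore trailing blank lines, if that is what is wanted.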