I want to convert this file.tsv to CSV. The conversion runs without errors, but the fields are not separated correctly. This is file.tsv:
protein1 protein2 neighborhood neighborhood_transferred fusion cooccurence homology coexpression coexpression_transferred experiments experiments_transferred database database_transferred textmining textmining_transferred combined_score
9606.ENSP00000003084 9606.ENSP00000301645 0 0 0 0 0 0 0 0 0 0 0 163 129 239
These are the first rows of the resulting file.csv:
"protein1 protein2 neighborhood neighborhood_transferred fusion cooccurence homology coexpression coexpression_transferred experiments experiments_transferred database database_transferred textmining textmining_transferred combined_score"
"9606.ENSP00000003084 9606.ENSP00000301645 0 0 0 0 0 0 0 0 0 0 0 163 129 239"
This is the code:
import csv
print(csv.list_dialects())
with open('File.tsv', 'r', encoding='utf-8', newline='') as fin, \
     open('file2.csv', 'w', encoding='utf-8', newline='') as fout:
    reader = csv.reader(fin, dialect='excel-tab')
    writer = csv.writer(fout, delimiter=' ')
    for row in reader:
        writer.writerow(row)
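The problem is that the code does not split the fields on spaces; each whole line, including the header, ends up as a single quoted field.
The expected result is that the fields are separated where I have put the commas: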
protein1,protein2,neighborhood,neighborhood_transferred,fusion,cooccurence,homology,coexpression,coexpression_transferred,experiments,experiments_transferred,database,database_transferred,textmining,textmining_transferred,combined_score
9606.ENSP00000003084,9606.ENSP00000301645,0,0,0,0,0,0,0,0,0,0,0,163,129,239
Answer 0 (score: 0)
Edit: rewritten after exchanging comments with the OP.
You told the reader to expect tabs as the delimiter in the input:
reader = csv.reader(fin, dialect='excel-tab')
But the input contains no tabs, only spaces, so:
reader = csv.reader(fin, delimiter=' ')
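Note that this treats two consecutive spaces as two delimiters with an empty field between them. You cannot tell the reader to ignore repeated delimiters the way Excel does.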
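Changing the reader alone may not be enough: the writer in the question is also created with delimiter=' ', so the output would still be space-separated rather than comma-separated. A minimal sketch of one way to get the expected output follows. It assumes that no field ever contains whitespace or quotes (true for the sample rows shown, but not verified for the whole file), and it swaps csv.reader for str.split() so that runs of spaces or tabs count as one separator. The function name whitespace_to_csv and the shortened sample data are made up for illustration.

```python
import csv
import io


def whitespace_to_csv(fin, fout):
    """Copy whitespace-separated rows from fin to fout as comma-separated CSV.

    Assumption: fields never contain spaces, tabs, or quotes.
    """
    writer = csv.writer(fout)  # default 'excel' dialect: comma delimiter
    for line in fin:
        fields = line.split()  # no argument: split on any run of whitespace
        if fields:  # skip blank lines
            writer.writerow(fields)


# In-memory demonstration with a shortened version of the sample data;
# note the two consecutive spaces in the header line.
sample = (
    "protein1 protein2  combined_score\n"
    "9606.ENSP00000003084 9606.ENSP00000301645 239\n"
)
out = io.StringIO()
whitespace_to_csv(io.StringIO(sample), out)
print(out.getvalue())
```

For real files, the same function can be called with the two file objects opened as in the question (with newline='' on both), for example whitespace_to_csv(fin, fout). If some fields can legitimately contain spaces, this approach is not suitable and a properly tab-delimited input would be needed.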