How do I convert a .tsv to a .csv?

Time: 2015-04-20 22:07:44

Tags: python csv tsv

I'm trying to convert a .tsv to a .csv. This:

import csv

# read tab-delimited file
with open('DataS1_interactome.tsv','rb') as fin:
    cr = csv.reader(fin, delimiter='\t')
    filecontents = [line for line in cr]

# write comma-delimited file (comma is the default delimiter)
with open('interactome.csv','wb') as fou:
    cw = csv.writer(fou, quotechar='', quoting=csv.QUOTE_NONE)
    cw.writerows(filecontents)

gives me this error:

  File "tsv2csv.py", line 11, in <module>
    cw.writerows(filecontents)
_csv.Error: need to escape, but no escapechar set

5 answers:

Answer 0 (score: 4)

import pandas as pd 
tsv_file='name.tsv'
csv_table=pd.read_table(tsv_file,sep='\t')
csv_table.to_csv('new_name.csv',index=False)

The code above converts a .tsv file to a .csv file.

Answer 1 (score: 3)

While writing the CSV file, the writer hit a character that has to be escaped, but you have not defined an escape character.

  

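Dialect.escapechar

A one-character string used by the writer to escape the delimiter if quoting is set to QUOTE_NONE and the quotechar if doublequote is False. On reading, the escapechar removes any special meaning from the following character. It defaults to None, which disables escaping.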

Source: https://docs.python.org/2/library/csv.html#csv.Dialect.escapechar

Example code:

# write comma-delimited file (comma is the default delimiter)
with open('interactome.csv','wb') as fou:
    cw = csv.writer(fou, quotechar='', quoting=csv.QUOTE_NONE, escapechar='\\')
    cw.writerows(filecontents)
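
To see the effect in isolation, here is a minimal sketch, assuming Python 3 and an in-memory buffer instead of a real file. The sample row is made up for illustration; the only thing that matters is that one field contains a comma, which is what forces the writer to escape under QUOTE_NONE.

```python
import csv
import io

# Hypothetical row: the second field contains the output delimiter (a comma).
rows = [["P1", "binds A, B", "0.9"]]

# Without an escapechar, QUOTE_NONE has no way to represent the embedded comma.
try:
    csv.writer(io.StringIO(), quoting=csv.QUOTE_NONE).writerows(rows)
    failed = False
except csv.Error:
    failed = True

# With an escapechar, the writer puts a backslash in front of the comma.
buf = io.StringIO()
csv.writer(buf, quoting=csv.QUOTE_NONE, escapechar="\\").writerows(rows)
escaped = buf.getvalue()
```

Keep in mind that whatever reads the resulting file has to be told about the same escapechar, otherwise the backslash-escaped comma is treated as a field separator.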

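Answer 2 (score: 1)

TSV is a file type in which fields are separated by tabs. If you want to convert a TSV to a CSV (comma-separated values), you only need to find and replace each TAB with a COMMA.

Update:
As don-roby pointed out, "there might be commas in the tsv", so we use a regex to escape all the csv special characters as defined by rfc4180.

i.e.: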

import re

# read the whole tab-delimited file
with open('tsv.tsv', 'r') as tsv:
    fileContent = tsv.read()

# escape the special characters (" ' ,) with a backslash before converting
fileContent = re.sub(r"""(,|"|')""", r"\\\1", fileContent)
fileContent = re.sub("\t", ",", fileContent)  # convert from tab to comma

with open("csv.csv", "w") as csv_file:
    csv_file.write(fileContent)
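
Note that RFC 4180 handles a field containing a comma by wrapping the field in double quotes and doubling any embedded double quote, not by backslash-escaping. A minimal sketch of that style, assuming Python 3 and a made-up input string: the csv writer's default settings (QUOTE_MINIMAL) already produce it. Reading with QUOTE_NONE is my assumption that the TSV uses no quoting of its own.

```python
import csv
import io

# Hypothetical TSV content: the second field holds both quotes and a comma.
tsv_text = 'id\tnote\n1\tsays "hi", then leaves\n'

# Read tab-delimited rows, treating quote characters as ordinary text.
reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)

# Write with the defaults: fields are quoted only when they need it.
out = io.StringIO()
csv.writer(out).writerows(reader)
csv_text = out.getvalue()
```

Reading csv_text back with a plain csv.reader returns the original field values, which a tab-to-comma text replacement cannot guarantee.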

Answer 3 (score: 0)

import pandas as pd
file_path = "/DataS1_interactome.tsv"
# read the tab-delimited file, then write it back out comma-delimited
df = pd.read_csv(file_path, sep="\t")
df.to_csv("DataS1_interactome.csv", index=False)

Answer 4 (score: -1)

import sys
import csv

tabin = csv.reader(open('sample.txt'), dialect=csv.excel_tab)
commaout = csv.writer(open('sample.csv', 'wb'), dialect=csv.excel)

for row in tabin:
  commaout.writerow(row)
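
The same row-by-row approach might look like this on Python 3, where the csv module expects text-mode files opened with newline=''. This is only a sketch; tsv_to_csv is a helper name I made up, not something from the csv module.

```python
import csv


def tsv_to_csv(src_path, dst_path):
    """Copy rows from a tab-delimited file to a comma-delimited one; return the row count."""
    # newline='' lets the csv module handle line endings itself (Python 3).
    with open(src_path, newline='') as fin, open(dst_path, 'w', newline='') as fout:
        reader = csv.reader(fin, dialect=csv.excel_tab)
        writer = csv.writer(fout, dialect=csv.excel)
        count = 0
        for row in reader:  # stream row by row instead of loading the whole file
            writer.writerow(row)
            count += 1
    return count
```

Usage would be something like tsv_to_csv('sample.txt', 'sample.csv'). The with block also closes both files, which the version above leaves to the interpreter.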