I want to use python to convert multiple csv files to txt without losing the column alignment.
An example of a csv file, which is comma-separated with no spaces or tabs, is shown below:
"Products"  "Technologies"  Region1  Region2  Region3
Prod1       Tech1           16       0        12
Prod2       Tech2           0        12       22
Prod3       Tech3           22       0        36
However, with my script I end up with the following:
"Products" "Technologies" Region1 Region2 Region3
Prod1 Tech1 16 0 12
Prod2 Tech2 0 12 22
Prod3 Tech3 22 0 36
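The choice of separator is arbitrary. Is there a relatively simple way to achieve what I want, given that the dimensions of the tables in the csv files will vary and the column headers will differ in length?
I use the following python code: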
    import os
    import fileinput
    dire = "directory"
    # function for converting csv files to txt
    def csv_to_txt(names, txtfilename):
        # remove existing txt file
        if os.path.exists(dire + txtfilename + ".txt"):
            os.remove(dire + txtfilename + ".txt")
        # open the include file
        includefile = open(dire + txtfilename + ".txt", "a")
        # handle the csv files and convert to txt
        with open(names, "a+") as input_file:
            lines = [line.split(",", 2) for line in input_file.readlines()]
            print lines
            text_list = [" ".join(line) for line in lines]
            for line in text_list:
                includefile.write(line)
        includefile.close()
    csv_to_txt(dire + "01.csv", "nameofoutputfile")
    for line in fileinput.FileInput(dire + "nameofoutputfile" + ".txt",inplace=1):
        line = line.replace('"','')
        line = line.replace(',',' ')
Answer 0 (score: 3)
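CSV files carry no formatting or alignment information; they are simply data separated by commas. Rendering a csv nicely is usually the job of a table processor.
To read the file into lists or dictionaries, use the standard csv module. For the best pretty-printing results, use PrettyTable or its PTable fork, https://pypi.python.org/pypi/PTable/0.9.0. Other tools are https://pypi.python.org/pypi/tabulate, texttable (https://oneau.wordpress.com/2010/05/30/simple-formatted-tables-in-python-with-texttable) and https://pypi.python.org/pypi/beautifultable/.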
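As a minimal sketch of the reading step, assuming a file with a header row like the one in the question: the sample data is inlined through io.StringIO purely for illustration, and with a real file you would pass the opened file object instead.

```python
import csv
import io

# Sample data inlined for illustration; with a real file, use
# open("myfile.csv", newline="") instead of io.StringIO.
sample = io.StringIO(
    '"Products","Technologies",Region1,Region2,Region3\n'
    "Prod1,Tech1,16,0,12\n"
    "Prod2,Tech2,0,12,22\n"
)

# csv.reader yields each row as a list of strings; the quotes are stripped.
rows = list(csv.reader(sample))
header, body = rows[0], rows[1:]

# csv.DictReader maps each data row to the header names instead.
sample.seek(0)
records = list(csv.DictReader(sample))

print(header)
print(records[0]["Region1"])
```

Either form gives plain strings with no width information, which is why the alignment has to be added in a separate step.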
With PTable:
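    from prettytable import from_csv
    # read the csv into a table object; the file is closed automatically
    with open("myfile.csv", "r") as fp:
        mytable = from_csv(fp)
    # drop the ASCII border so that only the aligned columns remain
    mytable.border = False
    print(mytable.get_string())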
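For some simple tables, a simple snippet may do as well.
Personally, when I have had to print a table without the extra hassle of a package, I have used some ad hoc string formatting. Packages are usually more foolproof and support many options, though, so if you are going to deal with many tables they may be worth the effort.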
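The ad hoc approach could look roughly like the sketch below, which uses only the standard library. `format_table` and its `gap` parameter are hypothetical names chosen for illustration, and the sample data is inlined; it assumes every cell fits on one line.

```python
import csv
import io

def format_table(rows, gap=2):
    """Return the rows as text with left-aligned, space-padded columns."""
    rows = [list(row) for row in rows]
    if not rows:
        return ""
    ncols = max(len(row) for row in rows)
    # Pad short rows so that every row has the same number of cells.
    rows = [row + [""] * (ncols - len(row)) for row in rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in rows) for i in range(ncols)]
    sep = " " * gap
    lines = [
        sep.join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    ]
    return "\n".join(lines) + "\n"

# Sample data inlined for illustration only.
sample = io.StringIO(
    '"Products","Technologies",Region1,Region2,Region3\n'
    "Prod1,Tech1,16,0,12\n"
    "Prod2,Tech2,0,12,22\n"
    "Prod3,Tech3,22,0,36\n"
)
text = format_table(csv.reader(sample))
print(text, end="")
```

Because the widths are computed from the data, this copes with tables of different sizes and with headers of different lengths. Numeric columns could be right-aligned by swapping `ljust` for `rjust`.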
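Prettytable seems to be the most popular (great name). Tabulate claims better performance than most pretty table printers, except for asciitable (now astropy.io.ascii, so possibly a bit of overkill unless you are a rocket scientist).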
Answer 1 (score: 0)
I made a program that opens a .csv and (hopefully) does exactly what you want:
    import tkinter as tk
    from tkinter import filedialog
    import os
    import csv as csv_package
    def fileopen():
        GUI=tk.Tk()
        filepath=filedialog.askopenfilename(parent=GUI,
                                            title='Select file')
        (GUI).destroy()
        return (filepath)
    filepath = fileopen()
    filepath = os.path.normpath(filepath)
    data = []
    with open(filepath) as fp:
        reader = csv_package.reader(fp, skipinitialspace=True)
        for row in reader:
            data.append(row)
    #make spreadsheet rows consistent length, based on longest row
    max_len_row = len(max(data,key=len))
    for row in data:
        if len(row) < max_len_row:
            append_number = max_len_row - len(row)
            for i in range(append_number):
                row.append('')
    #create dictionary of number of columns
    longest = {}
    for times in range(len(data[0])):
        longest[times] = 0
    #get longest entry for each column
    for sublist_index,sublist in enumerate(data):
        for column_index,element in enumerate(sublist):
            if longest[column_index] < len(element):
                longest[column_index] = len(element)
    #make each column as long as the longest entry
    for sublist_index,sublist in enumerate(data):
        for column_index,element in enumerate(sublist):
            if len(element) < longest[column_index]:
                amount_to_append = longest[column_index] - len(element)
                data[sublist_index][column_index] += (' ' * amount_to_append)
    with open(filepath, 'w', newline='') as csvfile:
        writer = csv_package.writer(csvfile)
        for row in data:
            writer.writerow(row)
    path, ext = os.path.splitext(filepath)
    os.rename(filepath, path + '.txt')
Before:
"Products","Technologies",Region1,Region2,Region3
Prod1,Tech1,16,0,12
Prod2,Tech2,0,12,22
Prod3,Tech3,22,0,36
After:
Products,Technologies,Region1,Region2,Region3
Prod1   ,Tech1       ,16     ,0      ,12
Prod2   ,Tech2       ,0      ,12     ,22
Prod3   ,Tech3       ,22     ,0      ,36
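Since the question mentions several csv files, the same padding idea could be applied to a whole folder without the file dialog. The following is only a sketch using the standard library: `align` and `convert_folder` are hypothetical helper names, the folder is assumed to contain well-formed .csv files, and a single space is used as the separator. Unlike the program above, it writes a new .txt next to each .csv and leaves the original file untouched.

```python
import csv
from pathlib import Path

def align(rows):
    """Pad every cell to its column's widest entry and join cells with a space."""
    rows = [list(row) for row in rows]
    ncols = max((len(row) for row in rows), default=0)
    # Make all rows the same length, based on the longest row.
    rows = [row + [""] * (ncols - len(row)) for row in rows]
    widths = [max(len(row[i]) for row in rows) for i in range(ncols)]
    return [
        " ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    ]

def convert_folder(folder):
    """Write an aligned .txt beside every .csv in folder; return the new paths."""
    written = []
    for csv_path in sorted(Path(folder).glob("*.csv")):
        with csv_path.open(newline="") as fp:
            lines = align(csv.reader(fp, skipinitialspace=True))
        txt_path = csv_path.with_suffix(".txt")
        txt_path.write_text("\n".join(lines) + "\n")
        written.append(txt_path)
    return written
```

Calling `convert_folder("directory")` would then produce one .txt per .csv in that folder, each with its own column widths.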