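I have a folder of ~1000 .txt files containing data that I would like to plot in the same way. This poses three challenges for me:
1) converting .txt to .xlsx (or .xls, I don't care which)
2) doing step 1 in one go, rather than typing the file name for each file to be converted
3) once the files are converted, I need Excel to recognize the numbers as numbers rather than text (i.e. currently I have to run "Text to Columns" manually in Excel, which is annoying for 2 columns across ~1000 spreadsheets)
I have two pieces of code that come close (partly adapted from code I found online).
Code 1: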
import xlwt
import xlrd
import csv
import openpyxl
import xlsxwriter
mypath = 'C:/desktop/Text Documents/'
from os import listdir
from os.path import isfile, join
textfiles = [ join(mypath,f) for f in listdir(mypath) if isfile(join(mypath,f)) and '.txt' in f]
for textfile in textfiles:
    f = open(textfile, 'r+')
    row_list = []
    for row in f:
        row_list.append(row.split('\t'))
    column_list = zip(*row_list)
    workbook = xlwt.Workbook()
    worksheet = workbook.add_sheet('Sheet1')
    i = 0
    for column in column_list:
        for item in range(len(column)):
            worksheet.write(item, i, column[item])
        i+=1
    workbook.save(textfile.replace('.txt', '.xls'))
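The code above takes a folder of text files, converts each one to .xls, and splits the data into columns, but unfortunately I still have to use the "Text to Columns" feature afterwards.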
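A likely reason the cells end up as text is that row.split('\t') returns Python strings, and xlwt writes a str value as a text cell. Below is a minimal sketch of converting each field before it is written. The helper names to_number and parse_line are made up for illustration and are not part of xlwt or any other library. Note that float() also accepts values such as "nan" and "inf", which may or may not be what you want.

```python
def to_number(cell):
    """Return the cell as a float if it parses as one, else the stripped text."""
    text = cell.strip()
    try:
        return float(text)
    except ValueError:
        # Not numeric (e.g. a header label or an empty field): keep it as text.
        return text


def parse_line(line, delimiter="\t"):
    """Split one line of a delimited text file and convert the numeric fields."""
    fields = line.rstrip("\r\n").split(delimiter)
    return [to_number(field) for field in fields]


print(parse_line("0.5\t12.25\n"))
print(parse_line("Time (s)\tLoad (kg)\n"))
```

In Code 1 this would amount to writing to_number(column[item]) instead of column[item], so that numeric fields reach worksheet.write as floats.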
Code 2:
import csv
import openpyxl
import xlsxwriter
import xlrd
input_file = 'C:/desktop/Text Documents/thisismytextfilename.txt'
output_file = 'C:/desktop/Text Documents/thisismytextfilename.xlsx'
wb = openpyxl.Workbook()
ws = wb.worksheets[0]
with open(input_file, 'rb') as data:
    reader = csv.reader(data, delimiter='\t')
    for row in reader:
        ws.append(row)
wb.save(output_file)
file_location = output_file
workbook = xlrd.open_workbook(file_location)
sheet = workbook.sheet_by_index(0) # index 0 is the first sheet (2 would be the 3rd)
x = [sheet.cell_value(i+14, 0) for i in range(sheet.nrows-14)]
y = [sheet.cell_value(i+14, 1) for i in range(sheet.nrows-14)]
workbook = xlsxwriter.Workbook(file_location)
worksheet = workbook.add_worksheet()
bold = workbook.add_format({'bold': 1})
# Add the worksheet data that the charts will refer to.
headings = ['Time (s)', 'Load (kg)']
data = [x,y]
worksheet.write_row('A1', headings, bold)
worksheet.write_column('A2', data[0])
worksheet.write_column('B2', data[1])
chart1 = workbook.add_chart({'type': 'scatter'})
# Configure the first series.
chart1.add_series({
    'name': '=Sheet1!$B$1',
    'categories': '=Sheet1!$A$2:$A$25000',
    'values': '=Sheet1!$B$2:$B$25000'})
chart1.set_x_axis({'name': 'Time'})
chart1.set_y_axis({'name': 'Load'})
chart1.set_style(1)
# Insert the chart into the worksheet (with an offset).
worksheet.insert_chart('D2', chart1, {'x_offset': 25, 'y_offset': 10})
workbook.close()
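This extracts the 2 columns of data I want and creates the plot. But I have to copy/paste the file name for each file, and I have to go through and click "Text to Columns" every time.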
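The copy/paste step could probably be avoided by deriving both paths from the folder contents. Here is a small sketch using only the standard library. The function name conversion_pairs is hypothetical, and it assumes every output file should sit next to its input with the extension changed to .xlsx.

```python
from pathlib import Path


def conversion_pairs(folder):
    """Yield (input .txt path, output .xlsx path) for each text file in folder."""
    # glob("*.txt") only matches names that end in .txt, which is stricter
    # than the "'.txt' in f" test used in Code 1.
    for txt_path in sorted(Path(folder).glob("*.txt")):
        yield txt_path, txt_path.with_suffix(".xlsx")
```

The body of Code 2 could then run inside a loop such as "for input_file, output_file in conversion_pairs('C:/desktop/Text Documents/'):", with str() applied to the paths if a library insists on plain strings.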
Answer 0 (score: 1)
I have to go through and click "Text to Columns" every time.
You can avoid this by passing the strings_to_numbers option to the XlsxWriter Workbook constructor:
workbook = xlsxwriter.Workbook(filename, {'strings_to_numbers': True})
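With that option enabled, the worksheet write methods try to convert each string with float() and store it as a number when that succeeds. As a rough sketch of how this might fit the question's Code 2 without the intermediate openpyxl/xlrd round trip: the function names read_data_rows and convert_with_chart are invented here, the 14 skipped rows and the headings are taken from Code 2, and Python 3 is assumed (where csv expects a file opened in text mode).

```python
import csv


def read_data_rows(input_file, skip_rows=14):
    """Read a tab-delimited text file, dropping the first skip_rows lines."""
    with open(input_file, newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    return rows[skip_rows:]


def convert_with_chart(input_file, output_file, skip_rows=14):
    """Write the first two columns to .xlsx as numbers and add a scatter chart."""
    # Third-party package; imported here so read_data_rows works without it.
    import xlsxwriter

    rows = read_data_rows(input_file, skip_rows)
    workbook = xlsxwriter.Workbook(output_file, {"strings_to_numbers": True})
    worksheet = workbook.add_worksheet()  # the default name is 'Sheet1'
    bold = workbook.add_format({"bold": True})

    worksheet.write_row("A1", ["Time (s)", "Load (kg)"], bold)
    for row_index, row in enumerate(rows, start=1):
        # The values are still strings here; strings_to_numbers converts them.
        worksheet.write_row(row_index, 0, row[:2])

    chart = workbook.add_chart({"type": "scatter"})
    last_row = max(len(rows), 1)  # zero-based index of the last data row
    chart.add_series({
        "name": ["Sheet1", 0, 1],
        "categories": ["Sheet1", 1, 0, last_row, 0],
        "values": ["Sheet1", 1, 1, last_row, 1],
    })
    chart.set_x_axis({"name": "Time"})
    chart.set_y_axis({"name": "Load"})
    worksheet.insert_chart("D2", chart, {"x_offset": 25, "y_offset": 10})
    workbook.close()
```

Calling convert_with_chart once per .txt file in the folder (for example over the result of glob.glob) should cover all three points of the question, and sizing the chart ranges from the actual row count avoids the hard-coded 25000.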