我已经在读取excel文件的地方编写了代码,然后在处理了所需的功能之后,我想将其写入Excel文件。现在,我已经为一个Excel文件完成了此操作。现在我的问题是当我要为读取多个Excel文件的多个Excel文件执行此操作,然后输出也应位于多个Excel文件中时,我将如何在此处申请循环,以便为每个输入文件获得单独的输出Excel文件< / p>
以下是我的代码
from ParallelP import *
import time,json
import pandas as pd
if __name__ == '__main__':
__ip__ = "ip/"
__op__ = "op/"
__excel_file_name__ = __ip__ + '80chars.xlsx'
__prediction_op__ = __op__ + basename(__excel_file_name__) + "_processed.xlsx"
df = pd.read_excel(__excel_file_name__)
start_time = time.time()
df_preprocessed = run(df)
print("Time Needed to execute all data is {0} seconds".format((time.time() - start_time)))
print("Done...")
df_preprocessed.to_excel(__prediction_op__)
答案 0 :(得分:0)
我确实写了一些代码。也许您可以根据自己的需求进行更改。
# This is where your input file should be
in_folder = 'input/xls/file/folder'
# This will be your output folder
out_folder = 'output/xls/file/folder'
if not os.path.exists(out_folder):
os.makedirs(out_folder)
file_exist = False
dir_list = os.listdir(in_folder)
for xlfile in dir_list:
if xlfile.endswith('.xlsx') or xlfile.endswith('.xls'):
file_exist = True
str_file = os.path.join(in_folder, xlfile)
#work_book = load_workbook(filename=str_file)
#work_sheet = work_book['qa']
#Do ur work hear with excel
#out_Path = os.path.join(out_folder,)
#and output it to the out_Path
if not file_exist:
print('cannot find any valid excel file in the folder ' + in_folder)
答案 1 :(得分:0)
我尝试坚持您的示例,并按照我的意愿进行扩展。下面的示例未经测试,并不意味着这是最好的方法!
from ParallelP import *
import time,json
import pandas as pd
import os
from pathlib import Path # Handles directory paths -> less error prone than manually sticking together paths
if __name__ == '__main__':
__ip__ = "ip/"
__op__ = "op/"
# Get a list of all excel files in a given directory
excel_file_list = [f for f in os.listdir(__ip__) if f.endswith('.xlsx')]
# Loop over the list and process each excel file seperately
for excel_file in excel_file_list:
excel_file_path = Path(__ip__, excel_file) # Create the file path
df = pd.read_excel(str(excel_file)) # Read the excel file to data frame
start_time = time.time()
df_preprocessed = run(df) # Run your routine
print("Time Needed to execute all data is {0} seconds".format((time.time() - start_time)))
print("Done...")
# Create the output file name
prediction_output_file_name = '{}__processed.xlsx'.format(str(excel_file_path.resolve().stem))
# Create the output file path
prediction_output_file_path = str(Path(__op__, prediction_output_file_name))
# Write the output to the excel file
df_preprocessed.to_excel(prediction_output_file_path)
旁注:我不得不提到您的变量名感觉像是__的误用。这些'dunder'函数是特殊的,表示它们对python(see for example here)具有含义。请仅将变量input_dir
和output_dir
分别命名为__ip__
和__op__
。