How to open an Excel file, write data from a DataFrame column into a column, and save it as a new file

Time: 2019-02-11 15:49:51

Tags: python excel pandas

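I have an Excel file that contains a column with the header "Original Translation". Depending on the language I am working with, and after some processing, I also have a DataFrame with a column named "Original Translation - {language}".

My goal is to open the Excel file, overwrite the column headed "Original Translation" with all the data from my DataFrame column "Original Translation - {language}", keep the original Excel file's formatting, and save the result to a new output folder.

This is the code I currently have: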

def output_formatted_capstan_file(df, original_file, country, language):
    # this is where you generate the file:
    # https://stackoverflow.com/questions/20219254/how-to-write-to-an-existing-excel-file-without-overwriting-data-using-pandas
    try:
        print(original_file)
        book = load_workbook(original_file)
        writer = pd.ExcelWriter(original_file, engine='openpyxl')
        writer.book = book
        df.to_excel(writer, ['Original Translation - {}'.format(language)])
        writer.save()
    except:
        print('Failed')

1 answer:

Answer 0: (score: 1)

I would approach this with the following steps.

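1) Import the Excel file with a function such as pandas.read_excel, which loads the data from the Excel sheet into a DataFrame. I will call this exceldf.

2) Merge this DataFrame with the data you already have in your pandas DataFrame. I will call your existing translated DataFrame translateddf.

3) Re-order the newly merged DataFrame, newdf, and then export the data. More options for re-ordering are shown here: re-ordering data frame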

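4) Export the data to Excel. I will leave integrating this into your initial code to you. For a general answer to the question, others may want to look at the built-in pandas options at to_excel.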
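The merge and re-order steps above can be tried on small in-memory frames before touching any file. This is only a minimal sketch: the sample rows, the "Source" column, and the language name are made up for illustration, and 'key1' stands in for whatever column the two frames actually share.

```python
import pandas as pd

language = "French"  # hypothetical language name
new_col = "Original Translation - {}".format(language)

# Stand-in for the sheet loaded with pandas.read_excel (hypothetical data)
exceldf = pd.DataFrame({
    "key1": [1, 2, 3],
    "Source": ["Hello", "Thanks", "Goodbye"],
    "Original Translation": ["old 1", "old 2", "old 3"],
})

# Stand-in for the already-translated DataFrame (hypothetical data)
translateddf = pd.DataFrame({
    "key1": [1, 2, 3],
    new_col: ["Bonjour", "Merci", "Au revoir"],
})

# Merge on the shared key, drop the old column, and let the
# translated column take over the original header
newdf = (
    exceldf.merge(translateddf, on="key1", how="left")
    .drop(columns=["Original Translation"])
    .rename(columns={new_col: "Original Translation"})
)

# Restore the column order of the original sheet
newdf = newdf[list(exceldf.columns)]
print(newdf)
```

Renaming the translated column to the original header is one possible choice here; if the language-specific header should stay, skip the rename and list the columns explicitly when re-ordering.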

Example code

import pandas

# Read in the Excel file (pass the path directly; the keyword is sheet_name)
exceldf = pandas.read_excel('your_xls_xlsx_filename', sheet_name='Sheet 1')

# translateddf is your existing DataFrame of translations; it must share 'key1' with exceldf.
# Create a new dataframe with your merged data, merging on 'key1'.
# We then drop the column of the original translation, as it should no longer be needed.
# I've included the rename argument in case you need it.
newdf = (
    exceldf.merge(translateddf, on='key1')
    .rename(columns={'Original Translation {language}': 'Original Translation {language}'})
    .drop(['Original Translation'], axis=1)
)

# Re-order your data.
# Note that if you renamed anything above, you have to update it here too
newdf = newdf[['0', '1', '2', 'Original Translation {language}']]

# An example export, that uses the generic implementation, not your specific code
newdf.to_excel("output.xlsx")
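Note that to_excel writes a fresh workbook, so the cell formatting of the original file is not carried over. If the formatting has to be kept, one alternative is to change only the cell values with openpyxl and save under a new path. The following is a minimal sketch, not a tested solution for your file: overwrite_column is a hypothetical helper, and it assumes the data is on the active sheet, the header is in row 1, and the values are in the same row order as the sheet.

```python
import os
from openpyxl import load_workbook

def overwrite_column(original_file, output_file, values, header="Original Translation"):
    """Overwrite the cells under `header` with `values` and save to `output_file`."""
    book = load_workbook(original_file)
    sheet = book.active  # assumption: the data is on the active sheet

    # Find the column whose first-row cell holds the header text
    col_idx = None
    for idx, cell in enumerate(sheet[1], start=1):
        if cell.value == header:
            col_idx = idx
            break
    if col_idx is None:
        raise ValueError("Header {!r} not found in row 1".format(header))

    # Assign values only; the style already attached to each cell is left alone
    for offset, value in enumerate(values):
        sheet.cell(row=2 + offset, column=col_idx, value=value)

    # Create the output folder if needed, then save as a new file
    out_dir = os.path.dirname(output_file)
    if out_dir:
        os.makedirs(out_dir, exist_ok=True)
    book.save(output_file)
```

With the names from the question, a call could look like overwrite_column(original_file, os.path.join('output', 'translated.xlsx'), df['Original Translation - {}'.format(language)].tolist()), where the output path is only an example. openpyxl handles .xlsx files, and it does not read every kind of content (images and charts in an existing workbook are lost on save), so check the result against the original.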