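The script below reads several CSV files, merges some of them, and writes the results to two separate worksheets in one Excel file.
It also adds this formula (=IF(COUNTIFS(Meds!A:A,B2)>0,1,0)) to every cell of the Meds column, which is the last column. However, I need the row reference to increment, so the second cell would be =IF(COUNTIFS(Meds!A:A,B3)>0,1,0), and so on. I don't know how to write a loop that does this. I saw this post, but I ran into problems using openpyxl, so I'd like to avoid that library.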
from functools import reduce
import pandas as pd
# read in multiple csv files
df1 = pd.read_csv("file1.csv", encoding = 'utf-8')
df2 = pd.read_csv("file2.csv", encoding = 'utf-8')
meds = pd.read_csv("meds.csv", encoding = 'utf-8')
# create a list of dataframes (excluding meds)
dfs = [df1, df2]
# merge dataframes in list
df_final = reduce(lambda left,right: pd.merge(left,right,on='RecordKey'), dfs)
# add empty column
df_final["Meds"] = ""
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('final.xlsx', engine='xlsxwriter')
# add formula to Meds
df_final['Meds'] = '=IF(COUNTIFS(Sheet2!A:A,E2)>0,1,0)'
# write each dataframe to its own worksheet in the Excel file
df_final.to_excel(writer, sheet_name='Combined')
meds.to_excel(writer, sheet_name='Meds')
# Close the Pandas Excel writer and output the Excel file.
writer.close()
Answer 0: (score: 1)
You can use a loop and string formatting to build a list of formulas that can be inserted into df.
length_of_df = len(df)
list_of_formulas = []
# Row 1 holds the header, so the data rows start at Excel row 2
for i in range(2, length_of_df + 2):
    formula = '=IF(COUNTIFS(Sheet2!A:A,E{0})>0,1,0)'.format(i)
    list_of_formulas.append(formula)
# print(list_of_formulas)
# ['=IF(COUNTIFS(Sheet2!A:A,E2)>0,1,0)',
# '=IF(COUNTIFS(Sheet2!A:A,E3)>0,1,0)',
# '=IF(COUNTIFS(Sheet2!A:A,E4)>0,1,0)',
# '=IF(COUNTIFS(Sheet2!A:A,E5)>0,1,0)',
# '=IF(COUNTIFS(Sheet2!A:A,E6)>0,1,0)']
# Assign list of formulas to df
df.loc[:, "Meds"] = list_of_formulas
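As a variation on the loop above, the same list can be built with a list comprehension wrapped in a small helper. This is only a sketch: build_formulas and its parameters are names made up for illustration, and the default sheet name "Meds" and key column "E" are assumptions based on the question's script (which writes the lookup data with sheet_name='Meds'), so adjust them to match your workbook.

```python
def build_formulas(n_rows, lookup_sheet="Meds", key_column="E", first_row=2):
    """Return one COUNTIFS formula per data row.

    Hypothetical helper: the defaults are assumptions, not values
    confirmed by the original post.
    """
    return [
        # The row number is the only part that changes from cell to cell
        f"=IF(COUNTIFS({lookup_sheet}!A:A,{key_column}{row})>0,1,0)"
        for row in range(first_row, first_row + n_rows)
    ]


# Example with three data rows (Excel rows 2 to 4)
formulas = build_formulas(3)
```

With the question's dataframe, the result could then be assigned as df_final["Meds"] = build_formulas(len(df_final)) before calling to_excel.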