Excel1
Column1 Column2 Column3
1 A [xyz-5536]A
2 B [xyz-5537]B
3 C [xyz-5538]C
4 D [xyz-5539]D
这是一种具有不同项目的可追溯性报告
Excel2
Column1 Result1 Result2
[xyz-5536]A Pass fail
[xyz-5537]B fail pass
[xyz-5538]C pass fail
[xyz-5539]D fail pass
我有两个以上的excel,我想创建一个新的表格,其结果是excel1将通过在方括号中映射ID追加数据并追加result1和result2
我通过宏尝试了
Column1 Column2 Column3 Result1 Result2
1 A [xyz-5536]A pass fail
2 B [xyz-5537]B fail pass
3 C [xyz-5538]C pass fail
4 D [xyz-5539]D fail pass
它应该像上表一样。它一次只能引用column3
中的一项。我是Python的新手。
答案 0 :(得分:0)
一种简单的方法是使用pandas
模块。
您可以:
read_excel
方法(doc) merge
方法(doc) drop
方法(doc) to_excel
方法(doc) 以下代码假设您有两个Excel文件:
# import modules
import pandas as pd
# filename
filename_1 = "Excel1.xlsx"
filename_2 = "Excel2.xlsx"
# Pandas files
df_1 = pd.read_excel(filename_1)
df_2 = pd.read_excel(filename_2)
print(df_1)
# Column1 Column2 Column3
# 0 1 A [xyz-5536]A
# 1 2 B [xyz-5537]B
# 2 3 C [xyz-5538]C
# 3 4 D [xyz-5539]D
print(df_2)
# Column1 Result1 Result2
# 0 [xyz-5536]A Pass fail
# 1 [xyz-5537]B fail pass
# 2 [xyz-5538]C pass fail
# 3 [xyz-5539]D fail pass
# Merge the two dataframe
df_new = df_1.merge(df_2, left_on="Column3",
right_on="Column1", suffixes=('', '_df2'))
print(df_new)
# Column1 Column2 Column3 Column1_df2 Result1 Result2
# 0 1 A [xyz-5536] A[xyz-5536]A Pass fail
# 1 2 B [xyz-5537] B[xyz-5537]B fail pass
# 2 3 C [xyz-5538] C[xyz-5538]C pass fail
# 3 4 D [xyz-5539] D[xyz-5539]D fail pass
# Remove the duplicated columns
df_new = df_new.drop(["Column1_df2"], axis=1)
print(df_new)
# Column1 Column2 Column3 Result1 Result2
# 0 1 A[xyz-5536]A Pass fail
# 1 2 B[xyz-5537]B fail pass
# 2 3 C[xyz-5538]C pass fail
# 3 4 D[xyz-5539]D fail pass
# Export to a new xlsx file
df_new.to_excel("new_file.xlsx", index=False)