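I have a simple data frame (df1) in which I replace values with the replace function (see below). Instead of changing the names of the items to replace in the code, I would like to maintain them in an Excel sheet, where the columns or rows list the different names that should be replaced. I would import the Excel sheet as a data frame (df2). All I am missing is the script that turns the information in df2 into input for the replace function.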
import pandas as pd

df1 = pd.DataFrame({'Product': ['Tart', 'Cookie', 'Black'],
                    'Quantity': [1234, 4, 333]})
print(df1)
  Product  Quantity
0    Tart      1234
1  Cookie         4
2   Black       333
This is what I have used so far:
df1 = df1.replace(["Tart", "Tart2", "Cookie", "Cookie2"], "Tartlet")
df1 = df1.replace(["Ham and cheese Sandwich", "Chicken focaccia"], "Sandwich")
After the replacement:
print(df1)
   Product  Quantity
0  Tartlet      1234
1  Tartlet         4
2    Black       333
This is what data frame 2 looks like after importing it from the Excel file (I am flexible about how it is laid out):
df2 = pd.read_excel(setup_folder / "Product Replacements.xlsx", index_col=0)
print(df2)
   Tartlet                 Sandwich
0     Tart  Ham and cheese Sandwich
1    Tart2         Chicken Focaccia
2  Cookie2                      NaN
Answer 0 (score: 1)
Use:
df2 = pd.DataFrame({'Tartlet': ['Tart', 'Tart2', 'Cookie'],
                    'Sandwich': ['Ham and Cheese Sandwich', 'Chicken Focaccia', 'another']})
# swap keys and values in the dict
# http://stackoverflow.com/a/31674731/2901002
d1 = {k: oldk for oldk, oldv in df2.items() for k in oldv}
print(d1)
{'Tart': 'Tartlet', 'Tart2': 'Tartlet', 'Cookie': 'Tartlet',
 'Ham and Cheese Sandwich': 'Sandwich', 'Chicken Focaccia': 'Sandwich', 'another': 'Sandwich'}
df1['Product'] = df1['Product'].replace(d1)
# alternative for better performance
# df1['Product'] = df1['Product'].map(d1).fillna(df1['Product'])
print(df1)
   Product  Quantity
0  Tartlet      1234
1  Tartlet         4
2    Black       333
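The Excel sheet in the question has columns of unequal length, so the imported df2 contains NaN in the empty cells. A minimal sketch of the same approach that skips those cells could look like the following. It assumes a stand-in df2 built in code instead of read from Excel (None marks an empty cell), and the names `mapping`, `replaced` and `mapped` are only illustrative:

```python
import pandas as pd

df1 = pd.DataFrame({"Product": ["Tart", "Cookie", "Black"],
                    "Quantity": [1234, 4, 333]})

# Stand-in for the Excel import (assumption): one column per new name,
# with an empty cell where a column has fewer old names.
df2 = pd.DataFrame({
    "Tartlet": ["Tart", "Tart2", "Cookie"],
    "Sandwich": ["Ham and cheese Sandwich", "Chicken Focaccia", None],
})

# Build {old_name: new_name}; dropna() keeps empty cells out of the keys.
mapping = {old: new for new, col in df2.items() for old in col.dropna()}

# Variant 1: replace leaves values without a mapping untouched.
replaced = df1["Product"].replace(mapping)

# Variant 2: map returns NaN for unmapped values, so fill them back
# from the original column.
mapped = df1["Product"].map(mapping).fillna(df1["Product"])

print(replaced.tolist())
print(mapped.tolist())
```

Both variants should give the same result here. Note that matching is exact and case-sensitive, so the spelling in the Excel sheet has to match the values in df1.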