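I have two DataFrames. The second is derived from the first. I update a column in the second DataFrame and then want to write the updated values back into the first. I have tried `merge`, but it gives me two columns with the suffixes "_x" and "_y".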
import pandas
lotQtyQueryForDF = pandas.read_sql_query(refreshQuery,conForInfo)
dataFrameOfLots = pandas.DataFrame(lotQtyQueryForDF,columns=['Customer','Stage','ProdType','Brand','ProdName','Size','Strength','Lot','PackedOn','Qty','Available'])
dataFrameOfLots['Available']=dataFrameOfLots["Available"].fillna(dataFrameOfLots['Qty'])
#inserting columns
dataFrameOfLots['QtyInTransaction']=0
dataFrameOfLots['IndexCol'] = range(1, len(dataFrameOfLots) + 1)
dataFrameFiltered=dataFrameOfLots.query('Brand=="XYZ" & Customer=="ABC"')
dataFrameFiltered.loc[:,'QtyInTransaction']=34
dataFrameFiltered2=dataFrameFiltered[['QtyInTransaction','IndexCol']].copy()
dataFrameOfLots.merge(dataFrameFiltered2,on='IndexCol',how='outer')
Input dataset:
Customer Stage ProdType Brand ProdName Size Strength Lot PackedOn Qty Available
DEF A Bulk YYY Test Test Weak 1 20200101 10 5
ABC A Bulk XYZ Test Test Weak 1 20200101 10 5
GHI A Bulk YTY Test Test Weak 1 20200101 10 5
ABC B RAW XYZ Test Test Weak 1 20200101 10 5
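For anyone who wants to reproduce this without the database, here is a minimal sketch that builds the input dataset by hand and repeats the steps above. The literal DataFrame stands in for the `read_sql_query` result (an assumption: the dtypes returned by the real query may differ), and the column is named `QtyInTransaction` throughout.

```python
import pandas as pd

# Hand-built stand-in for the read_sql_query result (assumption: the real
# query may return different dtypes).
columns = ['Customer', 'Stage', 'ProdType', 'Brand', 'ProdName', 'Size',
           'Strength', 'Lot', 'PackedOn', 'Qty', 'Available']
rows = [
    ['DEF', 'A', 'Bulk', 'YYY', 'Test', 'Test', 'Weak', 1, 20200101, 10, 5],
    ['ABC', 'A', 'Bulk', 'XYZ', 'Test', 'Test', 'Weak', 1, 20200101, 10, 5],
    ['GHI', 'A', 'Bulk', 'YTY', 'Test', 'Test', 'Weak', 1, 20200101, 10, 5],
    ['ABC', 'B', 'RAW', 'XYZ', 'Test', 'Test', 'Weak', 1, 20200101, 10, 5],
]
dataFrameOfLots = pd.DataFrame(rows, columns=columns)
dataFrameOfLots['QtyInTransaction'] = 0
dataFrameOfLots['IndexCol'] = range(1, len(dataFrameOfLots) + 1)

# Filter, update the filtered copy, and merge back on IndexCol.
dataFrameFiltered = dataFrameOfLots.query('Brand=="XYZ" & Customer=="ABC"').copy()
dataFrameFiltered['QtyInTransaction'] = 34
dataFrameFiltered2 = dataFrameFiltered[['QtyInTransaction', 'IndexCol']]
merged = dataFrameOfLots.merge(dataFrameFiltered2, on='IndexCol', how='outer')

# Both frames carry a non-key column named QtyInTransaction, so merge keeps
# both and tells them apart with its default suffixes.
print(merged.columns.tolist())
```

The duplicated column appears because `QtyInTransaction` exists on both sides of the merge but is not part of the `on` key.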
Actual output:
Customer Stage ProdType Brand ProdName Size Strength Lot PackedOn Qty Available QtyInTransaction_x IndexCol QtyInTransaction_y
DEF A Bulk YYY Test Test Weak 1 20200101 10 5 0 1 0
ABC A Bulk XYZ Test Test Weak 1 20200101 10 5 0 2 34
GHI A Bulk YTY Test Test Weak 1 20200101 10 5 0 3 0
ABC B RAW XYZ Test Test Weak 1 20200101 10 5 0 4 34
Expected output:
Customer Stage ProdType Brand ProdName Size Strength Lot PackedOn Qty Available IndexCol QtyInTransaction
DEF A Bulk YYY Test Test Weak 1 20200101 10 5 1 0
ABC A Bulk XYZ Test Test Weak 1 20200101 10 5 2 34
GHI A Bulk YTY Test Test Weak 1 20200101 10 5 3 0
ABC B RAW XYZ Test Test Weak 1 20200101 10 5 4 34
Is the query approach correct? How should I merge so that only one column shows up?
Thanks
Answer 0 (score: 1)
After applying the filter, try an outer merge and then drop the unwanted columns. Code below.
import pandas as pd  # the question imports the library as "pandas"; the alias makes pd.merge resolve
# IndexCol is part of the key, so only QtyInTransaction is duplicated and suffixed
result=pd.merge(dataFrameOfLots, dataFrameFiltered, how='outer', on=['Customer', 'Stage', 'ProdType', 'Brand', 'ProdName', 'Size',
'Strength', 'Lot', 'PackedOn', 'Qty', 'Available', 'IndexCol'],suffixes=('_x', '')).fillna(0)
result=result.loc[:,~result.columns.str.endswith('_x')]#drop unwanted columns
Or
result.drop(columns=['QtyInTransaction_x'], inplace=True)#drop unwanted columns
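If the goal is only to write the updated values back, a merge may not be needed at all. Below is a minimal sketch of two alternatives, using a cut-down hypothetical frame named `lots` with just three columns. It assumes the filtered frame keeps the original index labels, which `query` does by default.

```python
import pandas as pd

# Cut-down, hypothetical stand-in for dataFrameOfLots.
lots = pd.DataFrame({
    'Customer': ['DEF', 'ABC', 'GHI', 'ABC'],
    'Brand': ['YYY', 'XYZ', 'YTY', 'XYZ'],
    'QtyInTransaction': 0,
})

# Option 1: write straight into the frame through a boolean mask.
by_mask = lots.copy()
mask = (by_mask['Brand'] == 'XYZ') & (by_mask['Customer'] == 'ABC')
by_mask.loc[mask, 'QtyInTransaction'] = 34

# Option 2: edit a filtered copy, then copy the column back by index label.
by_index = lots.copy()
filtered = by_index.query('Brand=="XYZ" & Customer=="ABC"').copy()
filtered['QtyInTransaction'] = 34
by_index.loc[filtered.index, 'QtyInTransaction'] = filtered['QtyInTransaction']

print(by_mask)
print(by_index)
```

Both options leave a single `QtyInTransaction` column, so there are no suffixed columns to clean up afterwards. Option 2 relies on the index labels being unique.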