I have a Receipt Journal DataFrame that contains about 400,000 rows of the following: Snippet1
The row which contains a value for the "Tender" column (e.g., Cash, CrCd) is the receipt total. The rows that follow are the items in that transaction. I effectively want to match each of these items to the number from the receipt total row in a new column, yielding the following: Snippet2
I was able to achieve this in Excel by setting cell O2 to =IF(N2="",O1,C2) and dragging. Ideally I would like to avoid any use of Excel to manipulate the data.
Is there a way to do this in Pandas without using iterrows() or itertuples()? Both of these took far too long to complete.
UPDATE: Here is the comma-delimited text of the dataframe for testing:
Company Name,Str,Rcpt#,Rcpt Date,Time,Ext O P$,Disc %,Ext D$,Ext P$,Rcpt T$,Shipping w/T,Fee $ w/T,Rcpt Total,Tender
,2,32381,4/5/2015,5:51p,1.96,0,0,1.96,0.04,0,0,2,Cash
,2683,18924,VC,,Item_Desc,1,1.5,0,0.25,,,,
,2713,505101,VC1,C12A,Item_desc,1,0.46,0,0.12,,,,
,,32382,4/5/2015,6:01p,18.3,0,0,18.3,1.7,0,0,20,CrCd
,3034,502201,AC,,Item_desc,1,9.15,0,3.36,,,,
,3034,502201,AC5,,Item_desc,1,9.15,0,3.36,,,,
,,32383,4/5/2015,6:08p,9.15,0,0,9.15,0.85,0,0,10,Cash
,3034,502201,AC5,,Item_Desc,1,9.15,0,3.36,,,,
,,32384,4/5/2015,6:13p,18.3,0,0,18.3,1.7,0,0,20,CrCd
,2212,505201,GV,J25A,Item_desc,1,9.15,0,1.56,,,,
,2212,505201,GV,J25A,Item_desc,1,9.15,0,1.56,,,,
,,32385,4/5/2015,6:15p,4.5,0,0,4.5,0,0,0,4.5,Cash
,4619,18924,VC,,Item_desc,1,4.5,0,0.5,,,,
,,32386,4/5/2015,6:15p,4.5,0,0,4.5,0,0,0,4.5,Cash
,4619,18924,VC,,Item_desc,1,4.5,0,0.5,,,,
Answer 0 (score: 3)
UPDATE:
In [11]: df['ReceiptNumber'] = (df.assign(ReceiptNumber=np.where(pd.notnull(df.Tender),
   ....:                                                         df['Rcpt#'],
   ....:                                                         np.nan))['ReceiptNumber']
   ....:                          .ffill()   # forward-fill; replaces the deprecated fillna(method='pad')
   ....:                          .astype(int))
In [12]: df[['Rcpt#','Tender','ReceiptNumber']]
Out[12]:
     Rcpt# Tender  ReceiptNumber
0    32381   Cash          32381
1    18924    NaN          32381
2   505101    NaN          32381
3    32382   CrCd          32382
4   502201    NaN          32382
5   502201    NaN          32382
6    32383   Cash          32383
7   502201    NaN          32383
8    32384   CrCd          32384
9   505201    NaN          32384
10  505201    NaN          32384
11   32385   Cash          32385
12   18924    NaN          32385
13   32386   Cash          32386
14   18924    NaN          32386
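For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch of the same idea: blank out `Rcpt#` on the item rows, then forward-fill. It uses `Series.where` instead of `np.where`, which should be equivalent here. The helper name `add_receipt_number` is made up for illustration, the CSV text is just the first three receipts from the question, and it assumes the first row of the frame is a receipt total row (otherwise the leading NaN values would make `astype(int)` fail).

```python
import io

import pandas as pd

# First three receipts of the sample data posted in the question.
CSV_TEXT = """Company Name,Str,Rcpt#,Rcpt Date,Time,Ext O P$,Disc %,Ext D$,Ext P$,Rcpt T$,Shipping w/T,Fee $ w/T,Rcpt Total,Tender
,2,32381,4/5/2015,5:51p,1.96,0,0,1.96,0.04,0,0,2,Cash
,2683,18924,VC,,Item_Desc,1,1.5,0,0.25,,,,
,2713,505101,VC1,C12A,Item_desc,1,0.46,0,0.12,,,,
,,32382,4/5/2015,6:01p,18.3,0,0,18.3,1.7,0,0,20,CrCd
,3034,502201,AC,,Item_desc,1,9.15,0,3.36,,,,
,3034,502201,AC5,,Item_desc,1,9.15,0,3.36,,,,
,,32383,4/5/2015,6:08p,9.15,0,0,9.15,0.85,0,0,10,Cash
,3034,502201,AC5,,Item_Desc,1,9.15,0,3.36,,,,
"""


def add_receipt_number(df):
    """Hypothetical helper: tag every row with its receipt total's Rcpt#.

    Assumes the first row is a receipt total row (Tender is not null).
    """
    is_total = df["Tender"].notna()  # True only on receipt total rows
    out = df.copy()
    # Keep Rcpt# on total rows, NaN elsewhere, then carry the last
    # seen receipt number down over the item rows that follow it.
    out["ReceiptNumber"] = df["Rcpt#"].where(is_total).ffill().astype(int)
    return out


df = pd.read_csv(io.StringIO(CSV_TEXT))
result = add_receipt_number(df)
print(result[["Rcpt#", "Tender", "ReceiptNumber"]])
```

Everything here is a vectorized column operation, so there is no Python-level loop over rows as there is with iterrows() or itertuples().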
OLD answer:
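import numpy as np
import pandas as pd

# Parenthesized so the chained call can span several lines;
# .ffill() replaces the deprecated fillna(method='pad').
(df.assign(ReceiptNumber=np.where(pd.notnull(df.Tender),
                                  df['Rcpt#'],
                                  np.nan))['ReceiptNumber']
   .ffill())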
PS: This snippet wasn't tested because you didn't provide your data set in text form, so I couldn't copy and paste it.