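I want to get last year's SALES_AMOUNT value for each transaction month, where some months have no transactions.
Here are my transactions.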
STORE TXN_YM SALES_AMOUNT
A 201303 16793.14
A 201305 42901.61
A 201306 63059.72
A 201310 168471.43
A 201311 58570.72
A 201312 67526.71
A 201402 50649.07
A 201406 48819.97
A 201407 97100.77
A 201409 67778.40
A 201410 90327.52
A 201411 75703.12
A 201412 26098.50
A 201501 81429.36
A 201502 19539.85
A 201503 71727.66
A 201504 20117.79
A 201506 44252.19
A 201507 68578.82
A 201508 91483.39
A 201510 39220.87
A 201511 12224.11
A 201601 55425.74
A 201604 82550.66
A 201605 95772.93
A 201606 43794.49
A 201607 158287.16
A 201608 92568.03
A 201609 43136.43
Expected output
STORE TXN_YM SALES_AMOUNT LY
A 201303 16793.14 NaN
A 201305 42901.61 NaN
A 201306 63059.72 NaN
A 201310 168471.43 NaN
A 201311 58570.72 NaN
A 201312 67526.71 NaN
A 201402 50649.07 NaN
A 201406 48819.97 63059.72
A 201407 97100.77 NaN
A 201409 67778.40 NaN
A 201410 90327.52 168471.43
A 201411 75703.12 58570.72
A 201412 26098.50 67526.71
A 201501 81429.36 NaN
A 201502 19539.85 50649.07
A 201503 71727.66 NaN <-- a plain shift() would give 16793.14 from 201303, which is wrong
A 201504 20117.79 NaN
A 201506 44252.19 48819.97
A 201507 68578.82 97100.77
A 201508 91483.39 NaN
A 201510 39220.87 90327.52
A 201511 12224.11 75703.12
A 201601 55425.74 19539.85
A 201604 82550.66 20117.79
A 201605 95772.93 NaN <-- a plain shift() would give 42901.61 from 201305, which is wrong
A 201606 43794.49 44252.19
A 201607 158287.16 68578.82
A 201608 92568.03 91483.39
A 201609 43136.43 NaN
I tried something similar to my earlier question, "pandas DataFrame: Get previous month value where there are missing transaction and cannot shift()", but it does not work :(
I tried splitting TXN_YM into TXN_YEAR and TXN_MONTH, like this
STORE TXN_YM TXN_YEAR TXN_MONTH SALES_AMOUNT
A 201303 2013 3 16793.14
A 201305 2013 5 42901.61
A 201306 2013 6 63059.72
A 201310 2013 10 168471.43
A 201311 2013 11 58570.72
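A minimal sketch of how such a split could be done, assuming TXN_YM is stored as an integer column in YYYYMM form (a string column would need `astype(int)` first). The small frame here only reproduces the first rows of the data above.

```python
import pandas as pd

df = pd.DataFrame({
    "STORE": ["A"] * 5,
    "TXN_YM": [201303, 201305, 201306, 201310, 201311],
    "SALES_AMOUNT": [16793.14, 42901.61, 63059.72, 168471.43, 58570.72],
})

# For an integer YYYYMM, floor division by 100 gives the year
# and the remainder gives the month.
df["TXN_YEAR"] = df["TXN_YM"] // 100
df["TXN_MONTH"] = df["TXN_YM"] % 100
```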
This is my best attempt so far.
It is wrong: 201503 takes the value of 201303 instead of NaN.
df["LY1"] = df.groupby(["STORE", "TXN_MONTH"])["SALES_AMOUNT"].shift()
I believed the following would work, but it does not; it produces seemingly arbitrary numbers that I cannot make sense of.
def get_value_ly(x):
    y = x["SALES_AMOUNT"].shift() * x["TXN_YEAR"].diff().eq(1)
    return y

df["LY"] = df.groupby(["STORE", "TXN_MONTH"]).apply(lambda x: get_value_ly(x))
Result
STORE TXN_YM SALES_AMOUNT LY
A 201303 16793.14 NaN
A 201305 42901.61 81429.36
A 201306 63059.72 NaN
A 201310 168471.43 50649.07
A 201311 58570.72 NaN
A 201312 67526.71 NaN
A 201402 50649.07 NaN
A 201406 48819.97 20117.79
A 201407 97100.77 NaN
A 201409 67778.40 NaN
A 201410 90327.52 NaN
A 201411 75703.12 63059.72
A 201412 26098.50 48819.97
I don't know why it isn't working :( Please help me solve this.
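One direction that seems to give the expected output is sketched below. It avoids shift() entirely: a copy of the data is re-keyed to the month one year later (TXN_YM + 100) and left-merged back on STORE and TXN_YM, so only an exact year-earlier month can match. This is only a sketch under assumptions: TXN_YM is an integer in YYYYMM form, there is at most one row per STORE and TXN_YM, and the names `ly` and `out` are made up for illustration. The frame is a small subset of the data above.

```python
import pandas as pd

df = pd.DataFrame({
    "STORE": ["A"] * 6,
    "TXN_YM": [201303, 201306, 201402, 201406, 201502, 201503],
    "SALES_AMOUNT": [16793.14, 63059.72, 50649.07, 48819.97, 19539.85, 71727.66],
})

# Copy of the data, re-keyed to the same month one year later (YYYYMM + 100).
ly = df[["STORE", "TXN_YM", "SALES_AMOUNT"]].rename(columns={"SALES_AMOUNT": "LY"})
ly["TXN_YM"] = ly["TXN_YM"] + 100

# Left merge keeps every original row in order; rows with no exact
# year-earlier month (for example 201503, since 201403 is missing) get NaN.
out = df.merge(ly, on=["STORE", "TXN_YM"], how="left")
```

In this subset, 201406 picks up 63059.72 from 201306 and 201502 picks up 50649.07 from 201402, while 201503 stays NaN because there is no 201403 row.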