I have a MySQL table that looks like this:
ID    | article | price   | promo_price | delivery_days | stock | received_on
17591 | 03D/6H  | 3082.00 | 1716.21     | 30            | 0     | 2019-03-20
29315 | 03D/6H  | 3082.00 | 1716.21     | 26            | 0     | 2019-03-24
47796 | 03D/6H  | 3082.00 | 1716.21     | 24            | 0     | 2019-03-25
22016 | L1620S  | 685.00  | 384.81      | 0             | 3     | 2019-03-20
35043 | L1620S  | 685.00  | 384.81      | 0             | 2     | 2019-03-24
53731 | L1620S  | 685.00  | 384.81      | 0             | 2     | 2019-03-25
I created a pivot table to monitor the stock data (df is the table above loaded into a pandas DataFrame).
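import numpy as np
import pandas as pd

md = df.pivot_table(
    values='stock',
    index=['article', 'price', 'promo_price', 'delivery_days'],
    columns='received_on',
    aggfunc=np.sum)
# reorder the date columns so the most recent date comes first
dates = md.columns.tolist()
dates.sort(reverse=True)
md = md[dates]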
This is the result:
+---------------------------------+--------------+--------------+--------------+
|                                 | 2019-03-25   | 2019-03-24   | 2019-03-20   |
|---------------------------------+--------------+--------------+--------------|
| ('03D/6H', 3082.0, 1716.21, 24) | 0            | nan          | nan          |
| ('03D/6H', 3082.0, 1716.21, 26) | nan          | 0            | nan          |
| ('03D/6H', 3082.0, 1716.21, 30) | nan          | nan          | 0            |
| ('L1620S', 685.0, 384.81, 0)    | 2            | 2            | 3            |
+---------------------------------+--------------+--------------+--------------+
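How can I filter the rows by the most recent stock received date, so that each article shows the price, promo price and delivery days from that date?
For example: I want the stock information for all days, but the price, promo price and delivery days only as of 2019-03-25, like this: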
+---------------------------------+--------------+--------------+--------------+
|                                 | 2019-03-25   | 2019-03-24   | 2019-03-20   |
|---------------------------------+--------------+--------------+--------------|
| ('03D/6H', 3082.0, 1716.21, 24) | 0            | nan          | nan          |
| ('L1620S', 685.0, 384.81, 0)    | 2            | 2            | 3            |
+---------------------------------+--------------+--------------+--------------+
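EDIT:
If the price, promo price and delivery days do not change, I get the expected result. But if any of those values changes, I get multiple rows for the same article.
Article L1620S comes out as expected, but article 03D/6H produces three rows.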
Answer 0 (score: 1)
You can use:
df['received_on'] = pd.to_datetime(df['received_on'])
md = df.pivot_table(
    values='stock',
    index=['article', 'price', 'promo_price', 'delivery_days'],
    columns='received_on',
    aggfunc=np.sum)
# sort the date columns in descending order (most recent first)
md = md.sort_index(axis=1, ascending=False)
# drop rows that have no value in the first (most recent) column
md = md.dropna(subset=[md.columns[0]])
# alternative solution
# md = md[md.iloc[:, 0].notna()]
print(md)
received_on                               2019-03-25  2019-03-24  2019-03-20
article price  promo_price delivery_days
03D/6H  3082.0 1716.21     24                    0.0         NaN         NaN
L1620S  685.0  384.81      0                     2.0         2.0         3.0
EDIT: To do this for a single article, first filter by the first index level, then select by position (the first row):
md = md.sort_index(axis=1, ascending=False)
idx = pd.IndexSlice
md1 = md.loc[idx['03D/6H',:,:],:].iloc[[0]]
print (md1)
received_on                               2019-03-25  2019-03-24  2019-03-20
article price  promo_price delivery_days
03D/6H  3082.0 1716.21     24                    0.0         NaN         NaN
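
One caveat with the dropna approach: it only keeps index combinations that have a value in the most recent column, so an article with no record on the latest date would disappear entirely. If that matters, a possible alternative (a rough sketch, not part of the original answer) is to take price, promo_price and delivery_days from each article's latest received_on row, pivot stock by article alone, and join the two. The sample frame below is rebuilt by hand from the table in the question, and the names attrs, latest, stock and result are only illustrative.

```python
import pandas as pd

# Sample data copied by hand from the question's table (assumption: in
# practice df would be loaded from MySQL instead).
df = pd.DataFrame({
    "ID": [17591, 29315, 47796, 22016, 35043, 53731],
    "article": ["03D/6H"] * 3 + ["L1620S"] * 3,
    "price": [3082.00] * 3 + [685.00] * 3,
    "promo_price": [1716.21] * 3 + [384.81] * 3,
    "delivery_days": [30, 26, 24, 0, 0, 0],
    "stock": [0, 0, 0, 3, 2, 2],
    "received_on": pd.to_datetime(["2019-03-20", "2019-03-24", "2019-03-25"] * 2),
})

attrs = ["price", "promo_price", "delivery_days"]

# For each article, keep the attribute values from its most recent row.
latest = (
    df.sort_values("received_on")
      .drop_duplicates(subset="article", keep="last")
      .set_index("article")[attrs]
)

# Pivot stock by article only, so changes in price or delivery days
# no longer split one article into several rows.
stock = df.pivot_table(
    values="stock", index="article", columns="received_on", aggfunc="sum"
)
stock = stock.sort_index(axis=1, ascending=False)  # most recent date first

# One row per article: latest attributes followed by stock per date.
result = latest.join(stock)
print(result)
```

Because the pivot is indexed by article only, 03D/6H gets a single row that carries its stock for all three dates (instead of NaN for the older ones), together with the delivery_days value from 2019-03-25.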