Pandas: fill the first row of each partition from another column?

Asked: 2019-09-11 15:01:52

Tags: python pandas dataframe

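I have the following DataFrame. Grouping by the Product column, I want to fill the first row of each group in the Inventory column, when that cell is empty, with the value from the Stock column in the same row.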

   Year  Week Product  Stock  Inventory
0  2019    21       A     10        NaN
1  2019    22       A     10       34.0
2  2019    23       A     10        NaN
3  2019    24       A     10       28.0
4  2019    25       C     20        NaN
5  2019    26       C     20       39.0
6  2019    27       C     20        NaN
7  2019    28       B     35        NaN
8  2019    29       B     35        NaN
9  2019    30       B     35       94.0

The final output should look like this:

   Year  Week Product  Stock  Inventory
0  2019    21       A     10       10.0
1  2019    22       A     10       34.0
2  2019    23       A     10        NaN
3  2019    24       A     10       28.0
4  2019    25       C     20       20.0
5  2019    26       C     20       39.0
6  2019    27       C     20        NaN
7  2019    28       B     35       35.0
8  2019    29       B     35        NaN
9  2019    30       B     35       94.0

Data

import pandas as pd
import numpy as np

data = {
    "Year": [2019, 2019, 2019, 2019, 2019, 2019, 2019, 2019, 2019, 2019],
    "Week": [21, 22, 23, 24, 25, 26, 27, 28, 29, 30],
    "Product": ["A", "A", "A", "A", "C", "C", "C", "B", "B", "B"],
    "Stock": [10, 10, 10, 10, 20, 20, 20, 35, 35, 35],
    "Inventory": [np.nan, 34, np.nan, 28, np.nan, 39, np.nan, np.nan, np.nan, 94]
}

df = pd.DataFrame(data)

print(df)

2 answers:

Answer 0: (score: 3)

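Here is one way, using drop_duplicates followed by combine_first. drop_duplicates keeps only the first row of each Product, and combine_first fills the NaN values in Inventory at those index positions only: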

df['Inventory'] = df['Inventory'].combine_first(df.drop_duplicates(['Product'])['Stock'])
df
Out[193]: 
   Year  Week Product  Stock  Inventory
0  2019    21       A     10       10.0
1  2019    22       A     10       34.0
2  2019    23       A     10        NaN
3  2019    24       A     10       28.0
4  2019    25       C     20       20.0
5  2019    26       C     20       39.0
6  2019    27       C     20        NaN
7  2019    28       B     35       35.0
8  2019    29       B     35        NaN
9  2019    30       B     35       94.0
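
To see why the one-liner works, it may help to look at the intermediate Series. The sketch below rebuilds the sample frame from the question and splits the expression into two steps; the names `first_stock` and `filled` are illustrative only and do not appear in the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Year": [2019] * 10,
    "Week": list(range(21, 31)),
    "Product": ["A", "A", "A", "A", "C", "C", "C", "B", "B", "B"],
    "Stock": [10, 10, 10, 10, 20, 20, 20, 35, 35, 35],
    "Inventory": [np.nan, 34, np.nan, 28, np.nan, 39, np.nan, np.nan, np.nan, 94],
})

# Stock values for the first row of each Product only.
# The resulting Series keeps the original index labels: 0, 4 and 7.
first_stock = df.drop_duplicates(["Product"])["Stock"]

# combine_first aligns on the index and fills NaN in Inventory only where
# first_stock has a value, so the NaN cells in rows 2, 6 and 8 are untouched.
filled = df["Inventory"].combine_first(first_stock)

print(filled)
```

Because `drop_duplicates` keeps the first occurrence of each Product wherever it sits, this approach should not depend on the products being stored in contiguous blocks.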

Answer 1: (score: 1)

Given that the rows for each product are grouped together, you can use a boolean mask to update Inventory:

first_with_na = (df.Product.ne(df.Product.shift()) # first product row
                 & df.Inventory.isna()             # Inventory is na
                )

df.loc[first_with_na, 'Inventory'] = df.Stock

Output:

   Year  Week Product  Stock  Inventory
0  2019    21       A     10       10.0
1  2019    22       A     10       34.0
2  2019    23       A     10        NaN
3  2019    24       A     10       28.0
4  2019    25       C     20       20.0
5  2019    26       C     20       39.0
6  2019    27       C     20        NaN
7  2019    28       B     35       35.0
8  2019    29       B     35        NaN
9  2019    30       B     35       94.0
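
The shift comparison above relies on each product occupying one contiguous block of rows. If that cannot be guaranteed, one possible variant (not part of the original answer) is to mark the first row per product with `groupby(...).cumcount()` instead. The small frame below is hypothetical and was chosen so that product A appears in two separate runs.

```python
import numpy as np
import pandas as pd

# Hypothetical data: product A is interleaved with product B.
df = pd.DataFrame({
    "Product": ["A", "B", "A", "B"],
    "Stock": [10, 35, 10, 35],
    "Inventory": [np.nan, np.nan, np.nan, 94.0],
})

# cumcount() numbers the rows within each Product group starting at 0,
# so eq(0) marks the first row per product regardless of row order.
first_with_na = df.groupby("Product").cumcount().eq(0) & df["Inventory"].isna()

# Copy Stock into Inventory only for the masked rows.
df.loc[first_with_na, "Inventory"] = df.loc[first_with_na, "Stock"]

print(df)
```

Here row 2 (the second A row) keeps its NaN, whereas the shift-based mask would treat it as the start of a new block and fill it.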