Pandas shift inside an apply function

Date: 2018-08-17 20:30:38

Tags: python pandas

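I am trying to fill some NaN values using values taken from another column in an adjacent row.

For example, I have a DataFrame that looks like this: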

    Distance  Down  firstDownYards  secondDownYards
1       10.0   1.0             NaN              NaN
2        8.0   2.0             2.0              NaN
3        8.0   3.0             2.0              0.0
4       19.0   3.0            -9.0            -11.0
5       19.0   4.0            -9.0            -11.0
6       10.0   1.0             NaN              NaN
7        5.0   2.0             5.0              NaN
8        5.0   3.0             5.0              0.0
9       10.0   1.0             NaN              NaN
10       9.0   2.0             1.0              NaN
11      11.0   3.0            -1.0             -2.0
12      12.0   4.0            -2.0             -3.0
13      10.0   1.0             NaN              NaN
14       5.0   2.0             5.0              NaN
15      10.0   1.0             NaN              NaN
16       8.0   2.0             2.0              NaN
17       8.0   3.0             2.0              0.0
18      10.0   1.0             NaN              NaN
19      10.0   2.0             0.0              NaN
20       6.0   3.0             4.0              4.0

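In secondDownYards, wherever the Down column is less than 2, I want to fill the NaN with the opposite (negated value) of the next row of the firstDownYards column. Here is an example of the resulting column: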

    Distance  Down  firstDownYards  secondDownYards
1       10.0   1.0             NaN              -2    # Change here
2        8.0   2.0             2.0              NaN
3        8.0   3.0             2.0              0.0
4       19.0   3.0            -9.0            -11.0
5       19.0   4.0            -9.0            -11.0
6       10.0   1.0             NaN              -5    # Change here
7        5.0   2.0             5.0              NaN
8        5.0   3.0             5.0              0.0
9       10.0   1.0             NaN              -1    # Change here
10       9.0   2.0             1.0              NaN
11      11.0   3.0            -1.0             -2.0
12      12.0   4.0            -2.0             -3.0
13      10.0   1.0             NaN              -5    # Change here
14       5.0   2.0             5.0              NaN
15      10.0   1.0             NaN              -2    # Change here
16       8.0   2.0             2.0              NaN
17       8.0   3.0             2.0              0.0
18      10.0   1.0             NaN              0    # Change here
19      10.0   2.0             0.0              NaN
20       6.0   3.0             4.0              4.0

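I tried to write a function like the one below, but when I print x.shift() it just prints the same thing as x. I would then call it with df.apply(getLastCol,args=(....),axis=1). downNb is the condition, which is 2 in this example. currentCol and lastCol are the names of the current column and the previous column.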

def getLastCol(x,downNb,currentCol,lastCol):
    if x['Down'] < downNb:
        print(x.shift())
        value = x.shift(-1)[lastCol]
    else:
        value = x[currentCol]
    return value

1 Answer:

Answer 0 (score: 2)
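
Before the fixes, a minimal sketch of one likely reason the attempt in the question cannot work: with axis=1, apply hands the function one row at a time as a Series indexed by column name, so shift moves values between columns of that row and never reaches the neighbouring row. The frame `demo` and the helper `show_row_shift` are hypothetical names used only for this illustration.

```python
import pandas as pd

# Hypothetical three-row frame, modelled on the first rows of the question's data.
demo = pd.DataFrame({
    "Down": [1.0, 2.0, 3.0],
    "firstDownYards": [float("nan"), 2.0, 2.0],
})

def show_row_shift(x):
    # x is a single row: a Series whose index is the column names.
    # shift(-1) therefore slides values across columns inside this row only.
    return x.shift(-1)

# Each row's firstDownYards value ends up under "Down"; no other row is involved.
row_wise = demo.apply(show_row_shift, axis=1)

# Shifting the column itself is what brings the next row's value up.
column_wise = demo["firstDownYards"].shift(-1)

print(row_wise)
print(column_wise)
```

This is why the approaches below shift the whole column first and only then select the rows to overwrite.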

Using shift and loc, in place

df.loc[df.Down.lt(2), 'secondDownYards'] = df.firstDownYards.shift(-1).mul(-1)

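You also mentioned the condition that secondDownYards must be NaN as well. In your example this is always the case. If that is not guaranteed and you only want to replace NaN values, you can add that check too: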

df.loc[df.Down.lt(2) & df.secondDownYards.isnull(), 'secondDownYards'] # ...
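
A runnable sketch of the guarded assignment, assuming the truncated right-hand side is the same shifted, negated Series used in the first snippet. The frame `demo` is hypothetical: its row at index 3 has Down equal to 1 but already holds a value, so it shows what the extra isnull check protects.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: index 3 has Down == 1 but secondDownYards is already set.
demo = pd.DataFrame({
    "Down": [1.0, 2.0, 3.0, 1.0, 2.0],
    "firstDownYards": [np.nan, 2.0, 2.0, np.nan, 5.0],
    "secondDownYards": [np.nan, np.nan, 0.0, 7.0, np.nan],
})

# Negated firstDownYards of the following row, still aligned on the index.
next_first_negated = demo["firstDownYards"].shift(-1).mul(-1)

# Select only rows with Down < 2 whose secondDownYards is still NaN.
mask = demo["Down"].lt(2) & demo["secondDownYards"].isnull()

# loc aligns the right-hand Series on the index, so only masked rows change.
demo.loc[mask, "secondDownYards"] = next_first_negated

print(demo)
```

Without the isnull part of the mask, the existing value at index 3 would be overwritten as well.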

Using np.where and assign

The benefit of this option is that it does not modify the DataFrame in place:

import numpy as np

# Returns a new DataFrame; df itself is left unchanged.
df.assign(
    secondDownYards=np.where(
        df.Down.lt(2), df.firstDownYards.shift(-1).mul(-1), df.secondDownYards
    )
)
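
To make the "not in place" point concrete, here is a small self-contained sketch on a hypothetical three-row frame named `demo`. It applies the same np.where expression and then checks that the original frame still holds its NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical frame modelled on the first three rows of the question's data.
demo = pd.DataFrame({
    "Down": [1.0, 2.0, 3.0],
    "firstDownYards": [np.nan, 2.0, 2.0],
    "secondDownYards": [np.nan, np.nan, 0.0],
})

# np.where picks the shifted, negated value where Down < 2,
# and keeps the existing secondDownYards value everywhere else.
result = demo.assign(
    secondDownYards=np.where(
        demo["Down"].lt(2),
        demo["firstDownYards"].shift(-1).mul(-1),
        demo["secondDownYards"],
    )
)

print(result)
print(demo)  # the original frame is untouched
```

Because assign returns a copy, the result has to be bound to a name (or reassigned to the original variable) to be kept.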

Both options produce the desired output:

    Distance  Down  firstDownYards  secondDownYards
1       10.0   1.0             NaN             -2.0
2        8.0   2.0             2.0              NaN
3        8.0   3.0             2.0              0.0
4       19.0   3.0            -9.0            -11.0
5       19.0   4.0            -9.0            -11.0
6       10.0   1.0             NaN             -5.0
7        5.0   2.0             5.0              NaN
8        5.0   3.0             5.0              0.0
9       10.0   1.0             NaN             -1.0
10       9.0   2.0             1.0              NaN
11      11.0   3.0            -1.0             -2.0
12      12.0   4.0            -2.0             -3.0
13      10.0   1.0             NaN             -5.0
14       5.0   2.0             5.0              NaN
15      10.0   1.0             NaN             -2.0
16       8.0   2.0             2.0              NaN
17       8.0   3.0             2.0              0.0
18      10.0   1.0             NaN             -0.0
19      10.0   2.0             0.0              NaN
20       6.0   3.0             4.0              4.0