Question

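I have an array that is missing values in various places.

import numpy as np
import pandas as pd
x = np.arange(1,10).astype(float)
x[[0,1,6]] = np.nan
df = pd.Series(x)
print(df)

0    NaN
1    NaN
2    3.0
3    4.0
4    5.0
5    6.0
6    NaN
7    8.0
8    9.0
dtype: float64

For each NaN, I want to take the value that follows it and divide it by 2, then propagate that result to the next consecutive NaN, so I would end up with:

0    0.75
1    1.5
2    3.0
3    4.0
4    5.0
5    6.0
6    4.0
7    8.0
8    9.0
dtype: float64

I have tried interpolating, but that does not seem to work with consecutive NaNs.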

Answer 1

Another solution, using ffill() (the same as fillna with method ffill) together with bfill():

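# reverse the Series and flag the NaNs
b = df[::-1].isnull()
# number each NaN by its position within its run of consecutive NaNs,
# multiply by 2 and replace 0 with 1 to get the divisor
# (gives 2 and 4, which is correct for runs of up to two NaNs)
a = (b.cumsum() - b.cumsum().where(~b).ffill()).mul(2).replace({0:1})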

print(a)
8    1
7    1
6    2
5    1
4    1
3    1
2    1
1    2
0    4
dtype: int32

print(df.bfill().div(a))
0    0.75
1    1.50
2    3.00
3    4.00
4    5.00
5    6.00
6    4.00
7    8.00
8    9.00
dtype: float64
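
The multiplier above agrees with repeated halving only while no run contains more than two NaNs: the third NaN of a run gets the divisor 6, whereas halving three times needs 8. The following is a minimal sketch of a generalisation, assuming the goal is to divide the next valid value by 2**k, where k is the NaN's position within its run counted from the right. The function name `backfill_halving` is hypothetical and not part of the original answer.

```python
import numpy as np
import pandas as pd

def backfill_halving(s: pd.Series) -> pd.Series:
    """Fill each NaN with the next valid value divided by 2**k,
    where k counts the NaN's position in its run, starting at the right."""
    rev_null = s[::-1].isnull()          # NaN flags in reversed order
    running = rev_null.cumsum()          # running NaN count from the right
    # Subtract the count seen at the last non-NaN entry to get the
    # position inside the current run (0 for non-NaN entries).
    k = running - running.where(~rev_null).ffill().fillna(0)
    k = k.iloc[::-1]                     # restore the original order
    # Trailing NaNs have no following value and remain NaN.
    return s.bfill().div(2.0 ** k)

sample = pd.Series([np.nan, np.nan, np.nan, 8.0, np.nan, 6.0])
print(backfill_halving(sample).tolist())
```

Unlike the loop version (`mat` in the timing code), this sketch leaves NaNs at the very end of the Series as NaN instead of filling them with 0.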

Timings (len(df)=9k):

In [315]: %timeit (mat(df))
100 loops, best of 3: 11.3 ms per loop

In [316]: %timeit (jez(df1))
100 loops, best of 3: 2.52 ms per loop

Code for the timings:

import numpy as np
import pandas as pd
x = np.arange(1,10).astype(float)
x[[0,1,6]] = np.nan
df = pd.Series(x)
print(df)
df = pd.concat([df]*1000).reset_index(drop=True)
df1 = df.copy()

def jez(df):
    b = df[::-1].isnull()
    a = (b.cumsum() - b.cumsum().where(~b).ffill()).mul(2).replace({0:1})
    return (df.bfill().div(a))

def mat(df):
    prev = 0
    new_list = []
    for i in df.values[::-1]:
        if np.isnan(i):
            new_list.append(prev/2.)    
            prev = prev / 2.
        else:
            new_list.append(i)
            prev = i
    return pd.Series(new_list[::-1])

print (mat(df))
print (jez(df1))

Answer 2

You could do something like this:

import numpy as np
import pandas as pd
x = np.arange(1,10).astype(float)
x[[0,1,6]] = np.nan
df = pd.Series(x)

prev = 0
new_list = []
for i in df.values[::-1]:
    if np.isnan(i):
        new_list.append(prev/2.)    
        prev = prev / 2.
    else:
        new_list.append(i)
        prev = i
df = pd.Series(new_list[::-1])

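It loops backwards over the values of df and keeps track of the previous value. If the current value is not NaN, it appends the actual value; otherwise it appends half of the previous value.

This is probably not the most sophisticated Pandas solution, but you can change the behaviour quite easily.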
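
As an illustration of how the behaviour could be changed, here is a minimal sketch that wraps the same backward loop in a function. The name `fill_backwards` and the parameters `factor` and `start` are hypothetical additions for this example; `factor=0.5` and `start=0.0` reproduce the loop above.

```python
import numpy as np
import pandas as pd

def fill_backwards(s, factor=0.5, start=0.0):
    """Walk the values from last to first; replace each NaN with the
    value that follows it (already filled, if needed) times `factor`."""
    prev = start                  # used if the Series ends with NaN
    filled = []
    for v in s.to_numpy()[::-1]:
        if np.isnan(v):
            prev = prev * factor  # scale the value to the right
        else:
            prev = v              # keep real values unchanged
        filled.append(prev)
    return pd.Series(filled[::-1], index=s.index)

x = np.arange(1, 10).astype(float)
x[[0, 1, 6]] = np.nan
print(fill_backwards(pd.Series(x)))              # halving, as in the loop above
print(fill_backwards(pd.Series(x), factor=0.1))  # a different scaling rule
```

Passing the scaling rule as an argument keeps the loop itself untouched, which is the kind of flexibility the loop approach offers over the vectorised one.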

Interpolate backwards on multiple consecutive NaNs in Pandas/Python?

2 answers: