Use the output of an apply function in the next apply iteration (Python)

Time: 2017-11-10 05:49:42

Tags: python pandas lambda apply

Below are 1. the starting df (called "close"), and 2. a line of code and the df it produces:

1

Date    
2006-01-27  100.0
2006-01-30  100.0
2006-01-31  100.0
2006-02-01  100.0
2006-02-02  NaN
2006-02-03  NaN

2

close.apply(lambda x: x.shift(1) + x.shift(4))

Date    
2006-01-27  NaN
2006-01-30  NaN
2006-01-31  NaN
2006-02-01  NaN
2006-02-02  100.706786
2006-02-03  NaN

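My expected output is to use the output of #2 (100.706786), together with the existing df "close", to compute the next value in the series, i.e. the one for 2/03. That date needs the last value (shift 1) and the value four rows back (shift 4, i.e. 100).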

How can I do this using only vectorization? I want to avoid a loop because it is super slow. Here is the loop I have:

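import pandas as pd

closedf = pd.DataFrame()
for num, date in enumerate(close.index[4:]):
    # Recompute the shifted sum over the whole frame, then keep only this date's row
    widget = close.apply(lambda x: x.shift(1) + x.shift(4)).iloc[num + 4]
    closedf[date] = close.iloc[num + 4] = widget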

2 Answers:

Answer 0: (score: 4)

Consider the series close

import numpy as np
import pandas as pd

close = pd.Series(
    [100] * 3 + [100.706786] + [np.nan] * 10,
    pd.date_range('2006-01-27', periods=14, name='Date')
)

close

Date
2006-01-27    100.000000
2006-01-28    100.000000
2006-01-29    100.000000
2006-01-30    100.706786
2006-01-31           NaN
2006-02-01           NaN
2006-02-02           NaN
2006-02-03           NaN
2006-02-04           NaN
2006-02-05           NaN
2006-02-06           NaN
2006-02-07           NaN
2006-02-08           NaN
2006-02-09           NaN
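Freq: D, dtype: float64

Solution
This is a variant of the Fibonacci sequence: each new value depends on values that were themselves just computed. As far as I know, we can't "vectorize" it... (whatever "vectorize" means)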
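To make the Fibonacci-like dependency concrete, here is a minimal sketch in plain Python. The helper name `extend_series` and its `lag` parameter are made up for illustration and are not part of pandas. It shows why each step needs the result of the step before it:

```python
def extend_series(seed, total_len, lag=4):
    """Extend `seed` so that values[t] = values[t-1] + values[t-lag].

    `extend_series` is a hypothetical helper used only for illustration.
    """
    values = list(seed)
    if len(values) < lag:
        raise ValueError("need at least `lag` seed values")
    while len(values) < total_len:
        # The value appended here is read again on the very next pass,
        # so the steps cannot be computed independently of each other.
        values.append(values[-1] + values[-lag])
    return values


extended = extend_series([100.0, 100.0, 100.0, 100.706786], total_len=6)
```

The two appended values come out at roughly 200.706786 and 300.706786.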

But we can create a generator that does the job

def shib(x1, x2, x3, x4):
    while True:
        x1, x2, x3, x4 = x2, x3, x4, x1 + x4
        yield x4

Then use it to assign the new values

from itertools import islice

close.iloc[4:] = list(islice(shib(*close[:4]), 0, len(close) - 4))

close

Date
2006-01-27     100.000000
2006-01-28     100.000000
2006-01-29     100.000000
2006-01-30     100.706786
2006-01-31     200.706786
2006-02-01     300.706786
2006-02-02     400.706786
2006-02-03     501.413572
2006-02-04     702.120358
2006-02-05    1002.827144
2006-02-06    1403.533930
2006-02-07    1904.947502
2006-02-08    2607.067860
2006-02-09    3609.895004
Freq: D, dtype: float64
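As a quick sanity check that does not need pandas, the same generator can be run on small integer seeds, which makes the Fibonacci-style growth easier to see. The seeds `1, 1, 1, 1` and the name `first_eight` are arbitrary choices for this sketch:

```python
from itertools import islice


def shib(x1, x2, x3, x4):
    # Same generator as above: slide the four-value window forward by one
    # and yield the new last value (oldest value + newest value).
    while True:
        x1, x2, x3, x4 = x2, x3, x4, x1 + x4
        yield x4


first_eight = list(islice(shib(1, 1, 1, 1), 8))
print(first_eight)  # [2, 3, 4, 5, 7, 10, 14, 19]
```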

Answer 1: (score: 0)

I actually found a very convenient (and very fast) solution using deque:

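from collections import deque

import pandas as pd

# Seed the window with the four known starting values
queue = deque([100] * 4)
values = list(queue)  # separate list, so the original `close` is not overwritten
for _ in range(len(close.index) - 4):
    # Next value = most recent value + value four steps back
    nextval = queue[-1] + queue[0]
    values.append(nextval)
    queue.popleft()
    queue.append(nextval)
close = pd.DataFrame(values, index=close.index)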
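A possible variation on the deque approach, sketched under the assumption that the window length equals the number of seed values: passing `maxlen` to `deque` makes it drop the oldest element automatically on `append`, so the explicit `popleft()` is not needed. The function name `lagged_sum_series` is hypothetical:

```python
from collections import deque


def lagged_sum_series(seed, n_new):
    """Return `n_new` values where each is newest + oldest of a sliding window.

    `lagged_sum_series` is a hypothetical helper used only for illustration.
    """
    window = deque(seed, maxlen=len(seed))
    out = []
    for _ in range(n_new):
        nextval = window[-1] + window[0]
        out.append(nextval)
        # With maxlen set, appending discards window[0] automatically.
        window.append(nextval)
    return out


new_values = lagged_sum_series([100, 100, 100, 100], 4)
print(new_values)  # [200, 300, 400, 500]
```

The returned list holds only the newly computed values, so it would be assigned to the rows after the seed rather than to the whole index.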