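Pandas: constructing a column that references its own past values

Date: 2016-10-14 16:53:17

Tags: python pandas vectorization

I need to generate a column that starts with an initial value and is then generated by a function that includes past values of that column. For example: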

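import pandas as pd

df = pd.DataFrame({'a': [1,1,5,2,7,8,16,16,16]})
df['b'] = 0
df.loc[0, 'b'] = 1  # .ix was removed in pandas 1.0; .loc is the label-based equivalent
df

    a  b
0   1  1
1   1  0
2   5  0
3   2  0
4   7  0
5   8  0
6  16  0
7  16  0
8  16  0

Now I want to generate the rest of column 'b' by taking the minimum of the previous row and adding two. One solution is

for i in range(1, len(df)):
    df.loc[i, 'b'] = df.loc[i-1, :].min() + 2

which produces the desired output.

Is there a "clean" way to do this in pandas? Preferably one that vectorizes the computation?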

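1 answer:

Answer 0 (score: 5)

pandas doesn't have a great way to handle general recursive calculations. There may be some trick to vectorize it, but if you can take on the dependency, this is relatively painless and very fast with numba.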

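import numba
import numpy as np

@numba.njit
def make_b(a):
    # b[0] is the seed; every later b[i] depends on the previous b and the previous a
    b = np.zeros_like(a)
    b[0] = 1
    for i in range(1, len(a)):
        b[i] = min(b[i-1], a[i-1]) + 2

    return b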

df['b'] = make_b(df['a'].values)

df
Out[73]: 
    a   b
0   1   1
1   1   3
2   5   3
3   2   5
4   7   4
5   8   6
6  16   8
7  16  10
8  16  12
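
As for the "trick to vectorize it" mentioned above: for this particular recurrence one seems to exist, though it does not generalize to arbitrary recursive calculations. Subtracting 2*i from b[i] turns the rule b[i] = min(b[i-1], a[i-1]) + 2 into a plain running minimum over the initial value and the terms a[j] - 2*j, which NumPy can compute with np.minimum.accumulate. The following is a minimal sketch under that assumption; the name make_b_cummin and its first and step parameters are hypothetical and not part of the original answer.

```python
import numpy as np


def make_b_cummin(a, first=1, step=2):
    """Hypothetical helper: b[0] = first, b[i] = min(b[i-1], a[i-1]) + step."""
    a = np.asarray(a)
    if len(a) == 0:
        return a.copy()
    idx = np.arange(len(a))
    # Candidates for the running minimum: the initial value, followed by
    # a[j] - step*j for j = 0 .. n-2 (b[i] only ever sees a[0] .. a[i-1]).
    shifted = np.concatenate(([first], a[:-1] - step * idx[:-1]))
    # Running minimum, then add back the step*i that was subtracted.
    return np.minimum.accumulate(shifted) + step * idx


print(make_b_cummin([1, 1, 5, 2, 7, 8, 16, 16, 16]).tolist())
# [1, 3, 3, 5, 4, 6, 8, 10, 12]
```

With the frame from the question, the result could be assigned with df['b'] = make_b_cummin(df['a'].values). The numba loop stays the more general option, since it works for any update rule without needing to find a closed form.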