Cumsum reset at NaN

Time: 2013-08-12 21:14:21

Tags: python numpy pandas cumsum

Suppose I have a pandas.core.series.Series named ts containing only 1 or NaN, like this:

3382   NaN
3381   NaN
...
3369   NaN
3368   NaN
...
15     1
10   NaN
11     1
12     1
13     1
9    NaN
8    NaN
7    NaN
6    NaN
3    NaN
4      1
5      1
2    NaN
1    NaN
0    NaN

I would like to compute the cumsum of this series, but it should be reset (set to zero) at the position of each NaN, as shown below:

3382   0
3381   0
...
3369   0
3368   0
...
15     1
10     0
11     1
12     2
13     3
9      0
8      0
7      0
6      0
3      0
4      1
5      2
2      0
1      0
0      0

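Ideally, I would like a vectorized solution!

I once saw a similar question for Matlab: Matlab cumsum reset at NaN?

But I don't know how to translate this line: d = diff([0 c(n)]);

4 answers:

Answer 0: (score: 10)

A simple Numpy translation of your Matlab code is this: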

import numpy as np

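v = np.array([1., 1., 1., np.nan, 1., 1., 1., 1., np.nan, 1.])
n = np.isnan(v)   # mask of the NaN positions
a = ~n            # mask of the valid entries
c = np.cumsum(a)  # running count of valid entries
d = np.diff(np.concatenate(([0.], c[n])))  # valid entries seen since the previous NaN
v[n] = -d         # at each NaN, subtract the run accumulated so far
np.cumsum(v)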

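Executing this code returns the result array([ 1., 2., 3., 0., 1., 2., 3., 4., 0., 1.]). This solution is only as valid as the original Matlab one, but if it is not sufficient for your purposes, it may help you come up with something better.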
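
As a minimal sketch, the same steps can be wrapped in a function that works on a copy, so the caller's data is left untouched. The name cumsum_reset_at_nan is hypothetical and not part of the original answer. Because the offset subtracted at each NaN is a count of valid entries, this only gives the intended result when every non-NaN value is 1.

```python
import numpy as np

def cumsum_reset_at_nan(values):
    """Cumulative sum of 1/NaN data that restarts from zero at each NaN.

    Hypothetical helper; assumes every non-NaN entry equals 1.
    """
    v = np.array(values, dtype=float)  # np.array copies, so the input is not modified
    n = np.isnan(v)                    # mask of the NaN positions
    c = np.cumsum(~n)                  # running count of valid entries
    # At each NaN, subtract the number of valid entries since the previous NaN.
    v[n] = -np.diff(np.concatenate(([0.], c[n])))
    return np.cumsum(v)

out = cumsum_reset_at_nan([1, 1, 1, np.nan, 1, 1, 1, 1, np.nan, 1])
```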

Answer 1: (score: 9)

Here is a slightly more pandas-idiomatic way to do it:

import numpy as np
from numpy import nan
from pandas import Series

v = Series([1, 1, 1, nan, 1, 1, 1, 1, nan, 1], dtype=float)
n = v.isnull()
a = ~n
c = a.cumsum()
index = c[n].index  # need the index for reconstruction after the np.diff
d = Series(np.diff(np.hstack(([0.], c[n]))), index=index)
v[n] = -d
result = v.cumsum()

Note that both of these require pandas at commit 9da899b or newer. If you are on an older version, you can cast the bool dtype to an int64 or float64 dtype:

v = Series([1, 1, 1, nan, 1, 1, 1, 1, nan, 1], dtype=float)
n = v.isnull()
a = ~n
c = a.astype(float).cumsum()  # cast bool to float before taking the cumsum
index = c[n].index  # need the index for reconstruction after the np.diff
d = Series(np.diff(np.hstack(([0.], c[n]))), index=index)
v[n] = -d
result = v.cumsum()
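
As a small sketch of how this applies to a Series like the one in the question, whose index is not sorted, the snippet below runs the same steps on a six-row excerpt (index labels 15, 10, 11, 12, 13, 9). It works on a copy of ts; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Excerpt of the question's data: 1/NaN values with an unsorted integer index.
ts = pd.Series([1, np.nan, 1, 1, 1, np.nan],
               index=[15, 10, 11, 12, 13, 9], dtype=float)

n = ts.isnull()
c = (~n).astype(float).cumsum()  # running count of valid entries
# Offsets to subtract at the NaN rows, labelled with those rows' index.
d = pd.Series(np.diff(np.hstack(([0.], c[n]))), index=c[n].index)

v = ts.copy()   # keep ts itself unchanged
v[n] = -d
result = v.cumsum()
```

The result keeps the original index order, so it can be compared row by row with the expected output in the question.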

Answer 2: (score: 9)

A more pandas-idiomatic way:

In contrast to the Matlab code, this also works for values other than 1.
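
One pandas-only approach with that property, given here as a minimal sketch and not necessarily the code this answer used, is to label each run by the running count of NaNs and take the cumulative sum within each run. The sample values are assumptions chosen to include a value other than 1.

```python
import numpy as np
import pandas as pd

v = pd.Series([1., 3., 1., np.nan, 1., 1., 1., 1., np.nan, 1.])

# Every NaN starts a new group: the running count of NaNs is the group label.
groups = v.isnull().cumsum()

# Treat each NaN as 0, then accumulate separately inside each group.
result = v.fillna(0).groupby(groups).cumsum()
```

Since each NaN row opens its own group and contributes 0, the sum restarts at zero there regardless of the values in between.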

Answer 3: (score: 4)

If you can accept working with an equivalent boolean Series b, try

(b.cumsum() - b.cumsum().where(~b).fillna(method='pad').fillna(0)).astype(int)

Starting from the Series ts, use either b = (ts == 1) or b = ~ts.isnull().
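
To make the one-liner easier to follow, here is a step-by-step sketch on assumed sample data. It uses .ffill(), which is equivalent to fillna(method='pad'); the intermediate names running and last_reset are illustrative only.

```python
import numpy as np
import pandas as pd

ts = pd.Series([1., 1., 1., np.nan, 1., 1., 1., 1., np.nan, 1.])
b = ~ts.isnull()  # True where ts holds a value

running = b.cumsum()  # running count of valid entries
# Value of the running count at the most recent NaN (0 before the first NaN).
last_reset = running.where(~b).ffill().fillna(0)
result = (running - last_reset).astype(int)
```

Subtracting the count frozen at the last NaN leaves only the entries seen since then, which is the reset cumsum.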