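Conditionally update previous DataFrame rows based on the current row

Time: 2016-06-18 21:32:14

Tags: python pandas dataframe

I have a DataFrame (ev) that I want to read through. Whenever the 'trig' column is 64, I need to update the value of the 'critical' column 4 rows above and change it to 999. I tried the code below, but it does not change anything, although it looks like it should work.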

for i in range(0,len(ev)):  
    if ev['trig'][i] == 64:
        ev['critical'][i-4] == 999

2 answers:

Answer 0: (score: 0)

I think you can use mask with a boolean mask created by shift, filled with fillna(False) because shift leaves NaN in the rows it vacates:

import pandas as pd

ev = pd.DataFrame({'trig':[1,2,3,2,4,6,8,9,64,6,7,8,6,64],
                   'critical':[4,5,6,3,5,7,8,9,0,7,6,4,3,5]})

print (ev)
    critical  trig
0          4     1
1          5     2
2          6     3
3          3     2
4          5     4
5          7     6
6          8     8
7          9     9
8          0    64
9          7     6
10         6     7
11         4     8
12         3     6
13         5    64

mask = (ev.trig == 64).shift(-4).fillna(False)
print (mask)
0     False
1     False
2     False
3     False
4      True
5     False
6     False
7     False
8     False
9      True
10    False
11    False
12    False
13    False
Name: trig, dtype: bool
ev['critical'] = ev.critical.mask(mask, 999)
print (ev)
    critical  trig
0          4     1
1          5     2
2          6     3
3          3     2
4        999     4
5          7     6
6          8     8
7          9     9
8          0    64
9        999     6
10         6     7
11         4     8
12         3     6
13         5    64
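
As a variation on the approach above, here is a minimal sketch that builds the same mask with shift(-4, fill_value=False), which should keep the mask boolean without a separate fillna step, and then assigns through .loc. The helper name flag_rows_above and its parameters are hypothetical, introduced only for illustration; the column names 'trig' and 'critical' come from the question.

```python
import pandas as pd


def flag_rows_above(df, trigger=64, offset=4, value=999):
    """Return a copy of df with 'critical' set to `value` `offset` rows above each trigger row.

    Hypothetical helper; assumes columns named 'trig' and 'critical'.
    """
    out = df.copy()
    # Shifting by -offset moves each True up by `offset` rows.
    # Rows vacated at the bottom are filled with False instead of NaN.
    mask = out['trig'].eq(trigger).shift(-offset, fill_value=False)
    out.loc[mask, 'critical'] = value
    return out


ev = pd.DataFrame({'trig': [1, 2, 3, 2, 4, 6, 8, 9, 64, 6, 7, 8, 6, 64],
                   'critical': [4, 5, 6, 3, 5, 7, 8, 9, 0, 7, 6, 4, 3, 5]})

result = flag_rows_above(ev)
print(result)
```

Because the helper works on a copy, the original ev is left unchanged; a trigger in the first four rows simply has no row above it to update.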

Edit:

Timings

I think it is better to avoid iterating, because it is slow on large DataFrames:

len(df)=1400

In [66]: %timeit (jez(ev))
1000 loops, best of 3: 1.29 ms per loop

In [67]: %timeit (mer(ev1))
10 loops, best of 3: 49.9 ms per loop

len(df)=14k

In [59]: %timeit (jez(ev))
100 loops, best of 3: 2.49 ms per loop

In [60]: %timeit (mer(ev1))
1 loop, best of 3: 501 ms per loop

len(df)=140k

In [63]: %timeit (jez(ev))
100 loops, best of 3: 15.8 ms per loop

In [64]: %timeit (mer(ev1))
1 loop, best of 3: 6.32 s per loop

Code for timings:

import pandas as pd

ev = pd.DataFrame({'trig':[1,2,3,2,4,6,8,9,64,6,7,8,6,64],
                   'critical':[4,5,6,3,5,7,8,9,0,7,6,4,3,5]})

print (ev)
# repeat the sample to get 1400, 14k or 140k rows
ev = pd.concat([ev]*100).reset_index(drop=True)
#ev = pd.concat([ev]*1000).reset_index(drop=True)
#ev = pd.concat([ev]*10000).reset_index(drop=True)

# separate copy so each function gets its own unmodified data
ev1 = ev.copy()

def jez(df):
    # vectorized: move the trig == 64 mask up 4 rows, then replace where True
    df['critical'] = df.critical.mask((df.trig == 64).shift(-4).fillna(False), 999)
    return (df)

def mer(df):
    # row-by-row loop from the other answer (chained assignment kept as posted)
    for i in range(0,len(df)):
        if df['trig'][i] == 64:
            df['critical'][i-4] = 999
    return (df)

print (jez(ev))
print (mer(ev1))
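
%timeit is only available in IPython, so here is a rough sketch of a similar comparison with the standard timeit module in a plain script. The function names vectorized and looped are hypothetical, the loop uses .loc rather than chained assignment, and the numbers it prints will depend on the machine and pandas version, so none are shown here.

```python
import timeit

import pandas as pd

base = pd.DataFrame({'trig': [1, 2, 3, 2, 4, 6, 8, 9, 64, 6, 7, 8, 6, 64],
                     'critical': [4, 5, 6, 3, 5, 7, 8, 9, 0, 7, 6, 4, 3, 5]})
big = pd.concat([base] * 100).reset_index(drop=True)  # 1400 rows


def vectorized(df):
    # Work on a copy so repeated timing runs see the same input.
    out = df.copy()
    mask = out['trig'].eq(64).shift(-4, fill_value=False)
    out.loc[mask, 'critical'] = 999
    return out


def looped(df):
    out = df.copy()
    for i in range(len(out)):
        # Skip triggers in the first four rows: there is no row 4 above them.
        if out.loc[i, 'trig'] == 64 and i >= 4:
            out.loc[i - 4, 'critical'] = 999
    return out


runs = 10
t_vec = timeit.timeit(lambda: vectorized(big), number=runs)
t_loop = timeit.timeit(lambda: looped(big), number=runs)
print(f"vectorized: {t_vec / runs * 1e3:.2f} ms per call")
print(f"looped:     {t_loop / runs * 1e3:.2f} ms per call")
```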

Answer 1: (score: 0)

Try this, you were close. Note the difference between a single "=" (assignment) and a double "==" (comparison):

for i in range(0,len(ev)):
    if ev['trig'][i] == 64:
        ev['critical'][i-4] = 999
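
The loop above writes through chained indexing (ev['critical'][i-4] = ...), which the pandas documentation describes as unreliable because the write may land on a temporary copy. A minimal sketch of the same loop with a single .loc assignment follows, assuming the default 0..n-1 integer index. The i >= offset guard is an added assumption so that a trigger in the first four rows does not write to a label that does not exist.

```python
import pandas as pd

ev = pd.DataFrame({'trig': [1, 2, 3, 2, 4, 6, 8, 9, 64, 6, 7, 8, 6, 64],
                   'critical': [4, 5, 6, 3, 5, 7, 8, 9, 0, 7, 6, 4, 3, 5]})

offset = 4  # number of rows above the trigger row to update
for i in range(len(ev)):
    # `==` compares and belongs in the condition;
    # `=` assigns and belongs in the body.
    if ev.loc[i, 'trig'] == 64 and i >= offset:
        ev.loc[i - offset, 'critical'] = 999

print(ev)
```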