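I have a DataFrame (ev). I want to go through it, and whenever the 'trig' column is 64, I need to update the value of the 'critical' column 4 rows above and change it to 999. I tried the code below, but it does not change anything, even though it looks like it should work.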
for i in range(0,len(ev)):
    if ev['trig'][i] == 64:
        ev['critical'][i-4] == 999
Answer 0 (score: 0)
I think you can use mask with a boolean mask moved up by shift, followed by fillna(False), because shift leaves NaN in the rows it vacates:
import pandas as pd
ev = pd.DataFrame({'trig':[1,2,3,2,4,6,8,9,64,6,7,8,6,64],
'critical':[4,5,6,3,5,7,8,9,0,7,6,4,3,5]})
print (ev)
critical trig
0 4 1
1 5 2
2 6 3
3 3 2
4 5 4
5 7 6
6 8 8
7 9 9
8 0 64
9 7 6
10 6 7
11 4 8
12 3 6
13 5 64
mask = (ev.trig == 64).shift(-4).fillna(False)
print (mask)
0 False
1 False
2 False
3 False
4 True
5 False
6 False
7 False
8 False
9 True
10 False
11 False
12 False
13 False
Name: trig, dtype: bool
ev['critical'] = ev.critical.mask(mask, 999)
print (ev)
critical trig
0 4 1
1 5 2
2 6 3
3 3 2
4 999 4
5 7 6
6 8 8
7 9 9
8 0 64
9 999 6
10 6 7
11 4 8
12 3 6
13 5 64
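A variation on the same idea, as a minimal sketch rather than the answer's exact method: `Series.shift` accepts a `fill_value` argument (pandas 0.24 and later), so the vacated rows can be filled with `False` directly and the mask stays boolean without a separate `fillna`. The helper name `mark_rows_above` and its parameters are hypothetical, introduced only for illustration.

```python
import pandas as pd


def mark_rows_above(df, trigger=64, offset=4, value=999):
    """Return a copy of df with 'critical' set to `value` `offset` rows above each trigger row.

    Hypothetical helper, not part of pandas.
    """
    # shift(-offset) moves each trigger flag `offset` rows up; the rows
    # vacated at the bottom are filled with False, so the mask stays boolean.
    mask = df["trig"].eq(trigger).shift(-offset, fill_value=False)
    out = df.copy()
    out.loc[mask, "critical"] = value
    return out


ev = pd.DataFrame({"trig": [1, 2, 3, 2, 4, 6, 8, 9, 64, 6, 7, 8, 6, 64],
                   "critical": [4, 5, 6, 3, 5, 7, 8, 9, 0, 7, 6, 4, 3, 5]})
result = mark_rows_above(ev)
print(result["critical"].tolist())
```

Working on a copy keeps the original `ev` untouched, which makes it easier to compare the result against the input.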
Edit:
Timings:
I think it is better to avoid iteration in pandas, because it is slow on large DataFrames:
len(df)=1400:
In [66]: %timeit (jez(ev))
1000 loops, best of 3: 1.29 ms per loop
In [67]: %timeit (mer(ev1))
10 loops, best of 3: 49.9 ms per loop
len(df)=14k:
In [59]: %timeit (jez(ev))
100 loops, best of 3: 2.49 ms per loop
In [60]: %timeit (mer(ev1))
1 loop, best of 3: 501 ms per loop
len(df)=140k:
In [63]: %timeit (jez(ev))
100 loops, best of 3: 15.8 ms per loop
In [64]: %timeit (mer(ev1))
1 loop, best of 3: 6.32 s per loop
Code for timings:
import pandas as pd
ev = pd.DataFrame({'trig':[1,2,3,2,4,6,8,9,64,6,7,8,6,64],
                   'critical':[4,5,6,3,5,7,8,9,0,7,6,4,3,5]})
print (ev)
# repeat the 14-row frame to get 1400 rows (use the commented lines for 14k and 140k)
ev = pd.concat([ev]*100).reset_index(drop=True)
#ev = pd.concat([ev]*1000).reset_index(drop=True)
#ev = pd.concat([ev]*10000).reset_index(drop=True)
ev1 = ev.copy()
# vectorized version
def jez(df):
    df['critical'] = df.critical.mask((df.trig == 64).shift(-4).fillna(False), 999)
    return (df)
# loop version
def mer(df):
    for i in range(0,len(df)):
        if df['trig'][i] == 64:
            df['critical'][i-4] = 999
    return (df)
print (jez(ev))
print (mer(ev1))
Answer 1 (score: 0)
Try this, you were close: note the difference between a single "=" (assignment) and a double "==" (comparison).
for i in range(0,len(ev)):
    if ev['trig'][i] == 64:
        ev['critical'][i-4] = 999
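One caveat with the loop above: `ev['critical'][i-4] = 999` is chained indexing, which can raise a SettingWithCopyWarning in some pandas versions and does not update `ev` when Copy-on-Write is enabled. A hedged sketch of the same loop with a single `.loc` assignment follows. It assumes `ev` has the default integer index 0..n-1, and the `i >= 4` guard is an addition that is not in the original code.

```python
import pandas as pd

ev = pd.DataFrame({"trig": [1, 2, 3, 2, 4, 6, 8, 9, 64, 6, 7, 8, 6, 64],
                   "critical": [4, 5, 6, 3, 5, 7, 8, 9, 0, 7, 6, 4, 3, 5]})

# Assumes the default integer index 0..n-1, so row labels match positions.
for i in range(len(ev)):
    # Skip the first four rows: there is no row four positions above them.
    if i >= 4 and ev.loc[i, "trig"] == 64:
        # One .loc call selects row and column together, so the write
        # goes to ev itself rather than to a temporary object.
        ev.loc[i - 4, "critical"] = 999

print(ev["critical"].tolist())
```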