My data frame has two columns, X1 and X2.
First thing: X2 contains the values 0 and 1. Whenever X2 drops from 1 to 0, the next 20 rows should be 1 instead of 0.
For example:
X2=(0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)
desired X2=(0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
Second thing: if X1=0 and X2=1, select rows from the data frame for as long as X2 stays at 1.
I tried this code, but it only selects a single row:
df1=df[(df['X1'] == 0) & (df['X2'] ==1)]
Answer 0 (score: 1)
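Edited to cover both parts:
import numpy as np
# First thing: turn 0 into NaN, forward-fill at most 20 rows after each 1,
# then restore the remaining 0s (astype(int) undoes the float cast caused by NaN)
df['X2'] = df['X2'].replace({0: np.nan}).ffill(limit=20).fillna(0).astype(int)
# Second thing: mark the rows where a selection starts (1) or stops (0),
# then carry the mark forward through each run
df.loc[(df['X1'] == 0) & (df['X2'] == 1), 'new X2'] = 1
df.loc[(df['X2'] == 0), 'new X2'] = 0
df['new X2'] = df['new X2'].ffill()
df.loc[df['new X2'] == 1]  # selected rows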
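To see how the `ffill(limit=20)` step behaves, here is a small self-contained sketch run on the sample X2 from the question. The helper name `fill_after_ones` and the second, longer test series are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

def fill_after_ones(series, n=20):
    """Extend every 1 forward by up to n rows (hypothetical helper name)."""
    # 0 -> NaN so that ffill treats the zeros as gaps to fill
    gaps = series.replace({0: np.nan})
    # Fill at most n consecutive gaps after each 1, then turn leftover gaps back into 0
    return gaps.ffill(limit=n).fillna(0).astype(int)

# Sample from the question: four 0s, two 1s, then twenty 0s
df = pd.DataFrame({'X2': [0, 0, 0, 0, 1, 1] + [0] * 20})
result = fill_after_ones(df['X2']).tolist()

# Assumed extra case: a run of 0s longer than the limit stays 0 past row 20
long_result = fill_after_ones(pd.Series([1] + [0] * 25)).tolist()

print(result)
print(long_result)
```

Because `limit` counts consecutive NaN values after the last valid entry, zeros beyond the 20-row window are left untouched.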
Answer 1 (score: 1)
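Your data frame is not large, so you can simply solve this with loops:
# First program
index = 0
while index < df.shape[0]:
    # Look for the row where X2 drops from 1 to 0
    if index + 1 < df.shape[0] and df['X2'][index] == 1 and df['X2'][index + 1] == 0:
        df.loc[index + 1:index + 20, 'X2'] = 1  # set the next 20 rows to 1 (.loc slices include the end label)
        break  # only the first 1 -> 0 transition is handled
    index = index + 1
print(df)
# Second program, assuming the columns X1 and X2 exist and the index is the default 0..n-1
df['select'] = False
for index, row in df.iterrows():
    # Keep selecting while the previous row was selected and X2 is still 1
    if index > 0 and df.loc[index - 1, 'select'] and row.X2 == 1:
        df.loc[index, 'select'] = True
    # Start a new selection where X1 == 0 and X2 == 1
    if row.X1 == 0 and row.X2 == 1:
        df.loc[index, 'select'] = True
df = df[df['select']].drop('select', axis=1)
print(df)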
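For larger frames, the row-selection logic can also be sketched without an explicit loop. The version below is an alternative, not the answerer's method: it labels each consecutive run of X2 values and carries the "X1 == 0 and X2 == 1" trigger forward inside its run. The sample data and the names `run_id`, `trigger`, and `selected` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data (not from the question)
df = pd.DataFrame({
    'X1': [5, 0, 3, 4, 7, 0, 2, 0],
    'X2': [1, 1, 1, 0, 0, 1, 1, 0],
})

# Give every consecutive run of equal X2 values its own id
run_id = (df['X2'] != df['X2'].shift()).cumsum()

# Rows that start a selection: X1 == 0 while X2 == 1
trigger = ((df['X1'] == 0) & (df['X2'] == 1)).astype(int)

# Within each run, once a trigger has been seen, every later row stays selected
selected = trigger.groupby(run_id).cummax().astype(bool)

picked = df[selected]
picked_index = picked.index.tolist()
print(picked)
```

Runs where X2 is 0 never contain a trigger, so they are never selected; the selection ends automatically when X2 leaves 1.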
Answer 2 (score: 0)
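Here is a way to solve the "first thing" using numpy.
import numpy as np
# Positions where X2 drops from 1 to 0, i.e. the first 0 after a 1
locs = np.where(df['X2'].diff() == -1)[0]
for loc in locs:
    # .loc slices include the end label, so loc..loc+19 covers exactly 20 rows
    # (this assumes the default 0..n-1 index, where positions equal labels)
    df.loc[loc:loc + 19, 'X2'] = 1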
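If the data frame does not have the default integer index, the same idea can be applied to the underlying array by position. This is a minimal sketch under that assumption; the function name `extend_ones`, the letter index, and the small window `n=2` are chosen only to keep the example short.

```python
import numpy as np
import pandas as pd

def extend_ones(values, n=20):
    """Return a copy of a 0/1 integer array where each 1 -> 0 drop is followed by n ones."""
    arr = np.asarray(values).copy()
    # np.diff(arr)[i] == arr[i + 1] - arr[i], so +1 gives the position of the first 0 after a 1
    drops = np.flatnonzero(np.diff(arr) == -1) + 1
    for pos in drops:
        # Positional slices exclude the end and are clipped at the array length
        arr[pos:pos + n] = 1
    return arr

# Hypothetical frame with a non-numeric index
df = pd.DataFrame({'X2': [0, 1, 0, 0, 0, 0, 1, 0, 0]}, index=list('abcdefghi'))
df['X2'] = extend_ones(df['X2'].to_numpy(), n=2)
result = df['X2'].tolist()
print(result)
```

The drop positions are computed once from the original values, so rows filled by an earlier iteration do not create new transitions.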