How to conditionally replace column values based on previous values, and how to select rows from a dataframe

Time: 2019-03-22 07:32:26

Tags: python-3.x pandas group-by

My dataframe has two columns, X1 and X2.

First thing: X2 contains the values 0 and 1. Whenever X2 changes from 1 to 0, the next 20 rows should be 1 instead of 0.

For example:

X2=(0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)

desired X2=(0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)

Second thing: if X1 = 0 and X2 = 1, select rows from the dataframe for as long as X2 stays 1. I tried the code below, but it only selects one row.

df1=df[(df['X1'] == 0) & (df['X2'] ==1)]

3 answers:

Answer 0 (score: 1)

Edited to cover both parts:

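import numpy as np

# First thing: turn each 0 into NaN, forward-fill the 1s over at most 20 rows,
# then set whatever is still NaN back to 0
df['X2'] = df['X2'].replace({0: np.nan}).ffill(limit=20).fillna(0)

# Second thing:
# rows with X1 == 0 and X2 == 1 start a selection
df.loc[(df['X1'] == 0) & (df['X2'] == 1), 'new X2'] = 1
# rows with X2 == 0 end it
df.loc[(df['X2'] == 0), 'new X2'] = 0
# the remaining rows (X2 == 1, X1 != 0) inherit the marker of the row above
df['new X2'] = df['new X2'].ffill()
df.loc[df['new X2'] == 1]  # selected rows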
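
As a quick check of the first step against the example from the question, here is a minimal sketch. The helper name `extend_ones` and its parameter `n` are made up for illustration, and it uses `Series.where` instead of `replace` to blank out the zeros, which should give the same result for a column that only holds 0 and 1.

```python
import pandas as pd


def extend_ones(x2: pd.Series, n: int = 20) -> pd.Series:
    """Carry each 1 forward over at most n zeros that directly follow it."""
    # where() turns every value that is not 1 into NaN so ffill can act on it
    filled = x2.where(x2 == 1).ffill(limit=n).fillna(0)
    # the NaN round trip leaves floats behind, so cast back to int
    return filled.astype(int)


# Example from the question: four 0s, two 1s, then twenty 0s
df = pd.DataFrame({"X2": [0] * 4 + [1] * 2 + [0] * 20})
df["X2"] = extend_ones(df["X2"])
print(df["X2"].tolist())
```

Because the limit counts rows after the last 1, a run of more than 20 zeros keeps its tail at 0.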

Answer 1 (score: 1)

Your dataframe is not large, so you can easily solve this with a loop:

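# First program (assumes the default 0..n-1 integer index)
index = 0
while index < df.shape[0]:
    # a 1 directly followed by a 0 marks the transition
    if index + 1 < df.shape[0] and df['X2'][index] == 1 and df['X2'][index + 1] == 0:
        df.loc[index + 1: index + 20, 'X2'] = 1  # set the next 20 rows to 1 (.loc slices include both ends)
        break  # only the first 1 -> 0 transition is handled
    index = index + 1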

print(df)

# Second program, assuming the columns X1 and X2 exist
df['select'] = False
for index, row in df.iterrows():
    # continue a selection while X2 stays 1
    if index > 0 and df['select'][index - 1] == True and row.X2 == 1:
        df.loc[index, 'select'] = True
    # start a selection where X1 == 0 and X2 == 1
    if row.X1 == 0 and row.X2 == 1:
        df.loc[index, 'select'] = True

# keep the selected rows and drop the helper column
df = df[df['select'] == True].drop('select', axis=1)

print(df)
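
For larger frames, the same selection rule can probably be expressed without a Python loop. The following is only a sketch of one possible approach using `groupby`; the function name `select_after_trigger` and the sample values are assumptions for illustration, not part of the answer above.

```python
import pandas as pd


def select_after_trigger(df: pd.DataFrame) -> pd.DataFrame:
    """Keep rows from the first (X1 == 0, X2 == 1) row of each run of 1s to the end of that run."""
    # a new run id starts whenever X2 changes value
    run_id = (df["X2"] != df["X2"].shift()).cumsum()
    # trigger rows: X1 is 0 while X2 is 1 (stored as 0/1 integers)
    trigger = ((df["X1"] == 0) & (df["X2"] == 1)).astype(int)
    # within each run, once a trigger has been seen every later row is kept
    keep = trigger.groupby(run_id).cummax().astype(bool)
    return df[keep]


# Hypothetical sample data
sample = pd.DataFrame({
    "X1": [5, 0, 3, 4, 0, 7, 0, 2],
    "X2": [1, 1, 1, 0, 0, 1, 1, 0],
})
selected = select_after_trigger(sample)
print(selected)
```

Runs where X2 is 0 never contain a trigger row, so they are dropped without a separate filter.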

Answer 2 (score: 0)

Here is a way to solve the "first thing" using numpy.

import numpy as np

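# positions where X2 drops from 1 to 0, i.e. the first 0 after a 1
# (with the default integer index these positions are also the labels)
locs = np.where(df['X2'].diff() == -1)[0]
for loc in locs:
    # .loc slices include both ends, so loc .. loc + 19 covers exactly 20 rows
    df.loc[loc:loc + 19, 'X2'] = 1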
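
Note that `np.where` returns positions while `.loc` expects labels, so the loop above relies on the default integer index. A position-based variant might look like the sketch below; the function name `fill_after_drop` and the parameter `n` are hypothetical.

```python
import numpy as np
import pandas as pd


def fill_after_drop(df: pd.DataFrame, n: int = 20) -> pd.DataFrame:
    """Set the n rows starting at every 1 -> 0 drop in X2 to 1, by position."""
    out = df.copy()
    col = out.columns.get_loc("X2")
    # positions of the first 0 after a 1 (diff is -1 exactly there)
    drops = np.flatnonzero(out["X2"].diff().to_numpy() == -1)
    for pos in drops:
        # iloc slices exclude the end, so this covers exactly n rows
        out.iloc[pos:pos + n, col] = 1
    return out


# Hypothetical frame whose index does not start at 0
demo = pd.DataFrame({"X2": [0, 1, 0, 0, 0, 0]}, index=range(100, 106))
result = fill_after_drop(demo, n=3)
print(result["X2"].tolist())
```

All drop positions are computed before any row is changed, so every 1 -> 0 transition in the original column is handled and the filled rows do not trigger further fills.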