Pandas: delete repeated ranges of data

Date: 2017-08-01 16:05:11

Tags: python python-3.x pandas dataframe

Hi everyone, I have the following DataFrame:

df1
      WL       WM      WH        WP      
1     low    medium   high   premium
2     26       26      15        14
3     32       32      18        29 
4     41       41      19        42
5     apple    dog     fur      napkins          
6     orange   cat     tesla    earphone
7     NaN      rat     tobias   controller
8     NaN      NaN     phone
9     low      medium  high            
10     1        3       5
11     2        4       6
12    low      medium  high
13     4        8       10
14     5        9       11

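Is there a way to delete each repeated 'low' row together with the 2 rows that follow it, so that the output looks like this: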

df1
      WL       WM      WH        WP      
1     low    medium   high   premium
2     26       26      15        14
3     32       32      18        29 
4     41       41      19        42
5     apple    dog     fur      napkins          
6     orange   cat     tesla    earphone
7     NaN      rat     tobias   controller
8     NaN      NaN     phone

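Unfortunately, the code has to be dynamic, because I have multiple DataFrames and the 'low' rows are in different places in each one. My initial attempt: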

df1 = df1[~df1.iloc[:,0].isin(['LOW'])+2].reset_index(drop=True)
However, this does not do what I want. Any help is appreciated.

1 answer:

Answer 0 (score: 1)

You can use:

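import numpy as np

# index labels of the rows whose first column equals 'low'
a = df.index[df.iloc[:, 0] == 'low']

# number of rows to drop after each repeated 'low' row
size = 2
# for every 'low' except the first (a[1:]), collect its label and the next `size` labels;
# min() caps each range at the last label so no non-existent rows are selected
# (this relies on a consecutive integer index starting at 1, as in the question)
arr = [np.arange(i, min(i + size + 1, len(df) + 1)) for i in a[1:]]
idx = np.unique(np.concatenate(arr))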
print (idx)
[ 9 10 11 12 13 14]

#remove rows
df = df.drop(idx)
print (df)
       WL      WM      WH          WP
1     low  medium    high     premium
2      26      26      15          14
3      32      32      18          29
4      41      41      19          42
5   apple     dog     fur     napkins
6  orange     cat   tesla    earphone
7     NaN     rat  tobias  controller
8     NaN     NaN   phone         NaN
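
Since the question mentions several DataFrames, the same idea can be wrapped in a reusable function. The following is only a minimal sketch, not part of the original answer: the function name `drop_repeated_blocks` and its `marker` and `size` parameters are hypothetical, and the sample frame is rebuilt by hand from the question under the assumption that all values are strings. It works on row positions rather than index labels, so it does not depend on the index starting at 1, and it returns the frame unchanged when the marker occurs fewer than two times.

```python
import numpy as np
import pandas as pd


def drop_repeated_blocks(df, marker="low", size=2):
    """Drop every occurrence of `marker` in the first column after the first one,
    together with the `size` rows that follow each such occurrence.
    (Hypothetical helper, written for illustration.)"""
    # Row positions (not index labels) where the first column equals the marker.
    hits = np.flatnonzero((df.iloc[:, 0] == marker).to_numpy())
    if len(hits) < 2:
        # Nothing is repeated, so there is nothing to drop.
        return df.copy()
    # For each repeated marker, build the positions hit, hit+1, ..., hit+size.
    pos = (hits[1:, None] + np.arange(size + 1)).ravel()
    # Ignore positions that would fall past the last row.
    pos = np.unique(pos[pos < len(df)])
    keep = np.ones(len(df), dtype=bool)
    keep[pos] = False
    return df[keep]


# Sample data rebuilt from the question (values assumed to be strings).
nan = np.nan
df1 = pd.DataFrame(
    {
        "WL": ["low", "26", "32", "41", "apple", "orange", nan, nan,
               "low", "1", "2", "low", "4", "5"],
        "WM": ["medium", "26", "32", "41", "dog", "cat", "rat", nan,
               "medium", "3", "4", "medium", "8", "9"],
        "WH": ["high", "15", "18", "19", "fur", "tesla", "tobias", "phone",
               "high", "5", "6", "high", "10", "11"],
        "WP": ["premium", "14", "29", "42", "napkins", "earphone", "controller",
               nan, nan, nan, nan, nan, nan, nan],
    },
    index=range(1, 15),
)

result = drop_repeated_blocks(df1)
print(result)
```

Calling `drop_repeated_blocks(other_df)` on each DataFrame in turn should handle 'low' rows at different positions; pass `.reset_index(drop=True)` on the result if a fresh index is wanted, as in the original attempt.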