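I am trying to delete certain rows from a pandas df: specifically, the rows directly below an X in Col A where Col A is empty. So if the row below X in Col A is empty, I want to delete all of those rows until there is a string below the X.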
import pandas as pd
d = ({
'A' : ['X','','','X','Foo','','X','Fou','','X','Bar'],
'B' : ['Val',1,3,'Val',1,3,'Val',1,3,'Val',1],
'C' : ['Val',2,4,'Val',2,4,'Val',2,4,'Val',2],
})
df = pd.DataFrame(data=d)
Output:
      A    B    C
0     X  Val  Val
1          1    2
2          3    4
3     X  Val  Val
4   Foo    1    2
5          3    4
6     X  Val  Val
7   Fou    1    2
8          3    4
9     X  Val  Val
10  Bar    1    2
I tried:
df = df[~(df['A'] == 'X').shift().fillna(False)]
But this deletes everything that follows an X. I only want to delete the rows below an X when the next row is empty.
Expected:
     A    B    C
0    X  Val  Val
1  Foo    1    2
2         3    4
3    X  Val  Val
4  Fou    1    2
5         3    4
6    X  Val  Val
7  Bar    1    2
Answer 0 (score: 1)
Use:
m1 = df['A'] == 'X'
g = m1.cumsum()
m = (df['A'] == '') | m1
df = df[~m.groupby(g).transform('all')]
print (df)
      A    B    C
3     X  Val  Val
4   Foo    1    2
5          3    4
6     X  Val  Val
7   Fou    1    2
8          3    4
9     X  Val  Val
10  Bar    1    2
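The result above keeps the original index labels (3 to 10), while the expected output in the question is numbered from 0. A minimal sketch, assuming a renumbered index is wanted, wraps the same masks in a helper and finishes with reset_index(drop=True). The function name drop_blank_groups is hypothetical and not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['X', '', '', 'X', 'Foo', '', 'X', 'Fou', '', 'X', 'Bar'],
    'B': ['Val', 1, 3, 'Val', 1, 3, 'Val', 1, 3, 'Val', 1],
    'C': ['Val', 2, 4, 'Val', 2, 4, 'Val', 2, 4, 'Val', 2],
})

def drop_blank_groups(frame, col='A', marker='X'):
    """Hypothetical helper: drop every marker-led group whose other rows are all blank."""
    is_marker = frame[col] == marker
    group_id = is_marker.cumsum()                  # one label per group starting at a marker
    blank_or_marker = (frame[col] == '') | is_marker
    # True for every row of a group that holds nothing but the marker and blanks
    all_blank = blank_or_marker.groupby(group_id).transform('all')
    return frame[~all_blank].reset_index(drop=True)

result = drop_blank_groups(df)
print(result)
```

Like the original snippet, this also removes the X row that heads an all-blank group, which is what the expected output shows.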
Details:
m1 = df['A'] == 'X'
g = m1.cumsum()
m = (df['A'] == '') | m1
print (pd.concat([df,
df['A'] == 'X',
m1.cumsum(),
(df['A'] == ''),
m,
m.groupby(g).transform('all'),
~m.groupby(g).transform('all')], axis=1,
keys=['orig','==X','g','==space','m', 'all', 'inverted all']))
   orig              ==X  g ==space      m    all inverted all
      A    B    C      A  A       A      A      A            A
0     X  Val  Val   True  1   False   True   True        False
1          1    2  False  1    True   True   True        False
2          3    4  False  1    True   True   True        False
3     X  Val  Val   True  2   False   True  False         True
4   Foo    1    2  False  2   False  False  False         True
5          3    4  False  2    True   True  False         True
6     X  Val  Val   True  3   False   True  False         True
7   Fou    1    2  False  3   False  False  False         True
8          3    4  False  3    True   True  False         True
9     X  Val  Val   True  4   False   True  False         True
10  Bar    1    2  False  4   False  False  False         True
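Explanation:
1. Compare column A with X and take the cumulative sum of that mask to get g, which labels each group starting at an X.
2. Chain two boolean masks, the comparison with X and the comparison with the empty string, into m.
3. Use groupby with transform and DataFrameGroupBy.all, which returns True for groups containing only True values.
4. Finally, invert the mask and filter by boolean indexing.

Answer 1 (score: 0)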
Here is your solution. The intermediate results are:
(df['A'] == 'X').shift()
0 NaN
1 True
2 False
3 False
4 True
5 False
6 False
7 True
8 False
9 False
10 True
Name: A, dtype: object
In [15]:
(df['A'] == '')
Out[15]:
0 False
1 True
2 True
3 False
4 False
5 True
6 False
7 False
8 True
9 False
10 False
Name: A, dtype: bool
In [14]:
((df['A'] == '') & (df['A'] == 'X').shift())
Out[14]:
0 False
1 True
2 False
3 False
4 False
5 False
6 False
7 False
8 False
9 False
10 False
Name: A, dtype: bool
Edit: if needed, this can be done in a while loop.
old_size_df = df.size
new_size_df = 0
df[~((df['A'] == '') & (df['A'] == 'X').shift())]
Out[16]:
      A    B    C
0     X  Val  Val
2          3    4
3     X  Val  Val
4   Foo    1    2
5          3    4
6     X  Val  Val
7   Fou    1    2
8          3    4
9     X  Val  Val
10  Bar    1    2
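A single pass of the filter only removes the first blank row below each X (row 2 is still present above). The while loop mentioned in the edit is not shown in full, so the following is only a sketch of one way it might look: it repeats the filter until no row is dropped. The helper name drop_blank_after_x is hypothetical, and shift(fill_value=False) is used here instead of a plain shift() so the mask stays boolean.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['X', '', '', 'X', 'Foo', '', 'X', 'Fou', '', 'X', 'Bar'],
    'B': ['Val', 1, 3, 'Val', 1, 3, 'Val', 1, 3, 'Val', 1],
    'C': ['Val', 2, 4, 'Val', 2, 4, 'Val', 2, 4, 'Val', 2],
})

def drop_blank_after_x(frame):
    """Hypothetical helper: repeatedly drop blank rows that sit directly below an X."""
    while True:
        # True where the previous remaining row has A == 'X'
        prev_is_x = (frame['A'] == 'X').shift(fill_value=False)
        drop = (frame['A'] == '') & prev_is_x
        if not drop.any():
            # nothing left to remove, so renumber and stop
            return frame.reset_index(drop=True)
        frame = frame[~drop]

result = drop_blank_after_x(df)
print(result)
```

Unlike the groupby approach in Answer 0, this keeps the leading X row itself, so the output starts with two consecutive X rows rather than matching the expected output exactly.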
Answer 2 (score: 0)
Here is a solution using a custom apply function:
import pandas as pd

d = ({
    'A' : ['X','','','X','Foo','','X','Fou','','X','Bar'],
    'B' : ['Val',1,3,'Val',1,3,'Val',1,3,'Val',1],
    'C' : ['Val',2,4,'Val',2,4,'Val',2,4,'Val',2],
})
df = pd.DataFrame(data=d)

# Tracks whether we are still in the run of blank rows directly below an X
is_x = False

def fill_empty_a(row):
    global is_x
    if row['A'] == '' and is_x:
        # Mark the blank row so that dropna() removes it
        row['A'] = None
    else:
        is_x = row['A'] == 'X'
    return row

(df.apply(fill_empty_a, axis=1)
   .dropna()
   .reset_index(drop=True))
#      A    B    C
# 0    X  Val  Val
# 1    X  Val  Val
# 2  Foo    1    2
# 3         3    4
# 4    X  Val  Val
# 5  Fou    1    2
# 6         3    4
# 7    X  Val  Val
# 8  Bar    1    2
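The apply version above relies on a module-level global, so is_x has to be reset to False before the code is run a second time. As a sketch of the same logic without shared state, the loop below builds a keep mask from column A alone. The function name blank_run_keep_mask is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['X', '', '', 'X', 'Foo', '', 'X', 'Fou', '', 'X', 'Bar'],
    'B': ['Val', 1, 3, 'Val', 1, 3, 'Val', 1, 3, 'Val', 1],
    'C': ['Val', 2, 4, 'Val', 2, 4, 'Val', 2, 4, 'Val', 2],
})

def blank_run_keep_mask(values):
    """Hypothetical helper: False for blank entries in the run directly below an 'X'."""
    keep = []
    after_x = False  # local state replaces the global flag
    for value in values:
        if value == '' and after_x:
            keep.append(False)      # still inside the blank run below an X
        else:
            keep.append(True)
            after_x = (value == 'X')
    return keep

result = df[blank_run_keep_mask(df['A'])].reset_index(drop=True)
print(result)
```

This keeps the same rows as the apply version, including the leading X.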