My DataFrame looks like this:
Name Age
0 Tom 20
1 nick 21
2
3 krish 19
4 jack 18
5
6 jill 26
7 nick
The desired output is:
Name Age
0 Tom 20
1 nick 21
3 krish 19
4 jack 18
6 jill 26
7 nick
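The index should not change, and if possible I would prefer not to convert the empty strings to NaN. A row should be removed only when all of its columns contain the empty string ''.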
Answer 0 (score: 4)
You can do it like this:
# df.eq('') compares every cell of `df` to the empty string ''
# .all(axis=1) is True for rows in which every cell is ''
# ~ negates the resulting boolean Series
mask = ~df.eq('').all(axis=1)
# equivalently, using `ne` ("not equal"):
# mask = df.ne('').any(axis=1)
# `mask` is a boolean Series with the same index as `df`;
# boolean indexing (similar to NumPy's) keeps only the rows
# where the mask is True, so the original index is preserved
df = df[mask]
Or as a one-liner:
df = df[~df.eq('').all(axis=1)]
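For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frame construction is an assumption about the asker's data (the post only shows the printed frame): empty cells are taken to be the empty string '', with Age stored as integers mixed with '' in an object column.

```python
import pandas as pd

# Assumed reconstruction of the question's frame: blanks are the
# empty string '', so `Age` is an object column mixing ints and ''.
df = pd.DataFrame({
    "Name": ["Tom", "nick", "", "krish", "jack", "", "jill", "nick"],
    "Age": [20, 21, "", 19, 18, "", 26, ""],
})

# True for rows in which at least one cell is not ''
mask = ~df.eq("").all(axis=1)
result = df[mask]

print(result)
```

Rows 2 and 5 are dropped, the remaining index labels are untouched, and row 7 keeps its empty string in Age rather than getting a NaN.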
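Answer 1 (score: 2)
If the empty cells were NaN, we could simply use dropna. Alternatively, we can replace the empty strings with NaN first and then drop: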
df.mask(df.eq('')).dropna(thresh=1)
Out[151]:
Name Age
0 Tom 20
1 nick 21
3 krish 19
4 jack 18
6 jill 26
7 nick NaN
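Note that the output above has NaN in row 7, which the question wanted to avoid. One possible workaround, sketched below, is to use the NaN version only to decide which rows survive and then select those labels from the untouched frame. The data is an assumed reconstruction of the question's frame, and the `.loc` step assumes the index labels are unique.

```python
import pandas as pd

# Assumed reconstruction of the question's frame (blanks are '').
df = pd.DataFrame({
    "Name": ["Tom", "nick", "", "krish", "jack", "", "jill", "nick"],
    "Age": [20, 21, "", 19, 18, "", 26, ""],
})

# Replace '' with NaN, then drop rows that have no non-NaN value left.
# dropna(how="all") expresses the same rule as dropna(thresh=1).
cleaned = df.mask(df.eq("")).dropna(how="all")

# `cleaned` holds NaN where '' used to be. To keep the original
# strings, pick the surviving labels from the original frame
# (assumes unique index labels).
result = df.loc[cleaned.index]

print(result)
```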
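Answer 2 (score: 1)
An empty string is falsy (it evaluates to False in a boolean context), so dropping the rows that contain only empty strings is as simple as keeping the rows in which at least one field is truthy, i.e. non-empty: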
df[df.any(axis=1)]
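Or, more briefly (the positional form works only on older pandas releases; recent ones accept axis for any only as a keyword, as in the line above):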
df[df.any(1)]
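One caveat with the truthiness approach: it cannot tell '' apart from other falsy values such as 0 or False. The sketch below uses a small hypothetical frame (not the asker's data) to show the difference from an explicit comparison against ''. It relies on recent pandas versions returning a boolean Series from any on object columns.

```python
import pandas as pd

# Hypothetical frame: row 1 is entirely empty strings, while row 2
# holds a legitimate 0 next to an empty string.
df = pd.DataFrame({
    "Name": ["Tom", "", ""],
    "Age": [20, "", 0],
})

# Truthiness: 0 is falsy just like '', so row 2 is dropped as well.
by_truthiness = df[df.any(axis=1)]

# Explicit comparison: only '' counts as empty, so row 2 is kept.
by_comparison = df[df.ne("").any(axis=1)]

print(by_truthiness)
print(by_comparison)
```

If the columns can contain zeros or booleans, the explicit comparison is probably the safer choice.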