Question

My DataFrame looks like this:

    Name Age
0    Tom  20
1   nick  21
2           
3  krish  19
4   jack  18
5           
6   jill  26
7   nick

The desired output is:

    Name Age
0    Tom  20
1   nick  21
3  krish  19
4   jack  18
6   jill  26
7   nick

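The index should not change, and if possible I would prefer not to convert the empty strings to NaN. A row should be dropped only when all of its columns contain '' (empty strings).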
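For anyone who wants to try the answers, here is a minimal sketch that rebuilds the frame shown above. It assumes every cell is a string and that the blank cells are empty strings (`''`); the question does not state the actual dtypes.

```python
import pandas as pd

# Assumption: all values are strings, and blank cells are '' (not NaN).
df = pd.DataFrame(
    {
        "Name": ["Tom", "nick", "", "krish", "jack", "", "jill", "nick"],
        "Age": ["20", "21", "", "19", "18", "", "26", ""],
    }
)

print(df)
```

Rows 2 and 5 are empty in every column; row 7 is empty only in `Age` and should therefore be kept.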

Answer 1

You can do this:

# df.eq('') compares every cell of `df` to ''
# .all(axis=1) checks whether every cell in a row is True
# ~ is the negation operator
mask = ~df.eq('').all(axis=1)

# equivalently, with `ne` for "not equal":
# mask = df.ne('').any(axis=1)

# `mask` is a boolean Series with the same length as `df`.
# Indexing with it is called boolean indexing (similar to NumPy's):
# only the rows where the mask is True are kept, with their index labels unchanged.
df = df[mask]

Or in one line:

df = df[~df.eq('').all(axis=1)]
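As a self-contained check of this approach, the sketch below rebuilds the question's frame (assuming all cells are strings) and confirms that the original index labels survive and that the empty `Age` in row 7 is left as `''`.

```python
import pandas as pd

# Assumption: all values are strings, and blank cells are '' (not NaN).
df = pd.DataFrame(
    {
        "Name": ["Tom", "nick", "", "krish", "jack", "", "jill", "nick"],
        "Age": ["20", "21", "", "19", "18", "", "26", ""],
    }
)

# True for rows in which at least one cell is not ''.
mask = df.ne("").any(axis=1)
result = df[mask]

print(result.index.tolist())      # [0, 1, 3, 4, 6, 7]
print(repr(result.loc[7, "Age"]))  # ''
```

No value is converted to NaN at any point, which matches what the question asked for.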

Answer 2

If the cells were NaN we could simply use dropna, so we can replace the empty strings with NaN first:

df.mask(df.eq('')).dropna(thresh=1)
Out[151]: 
    Name  Age
0    Tom   20
1   nick   21
3  krish   19
4   jack   18
6   jill   26
7   nick  NaN
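Note that the output above contains NaN in row 7, which the question wanted to avoid. One possible variation, sketched here under the assumption that all cells are strings, uses the NaN conversion only to decide which rows survive and then selects those rows from the untouched original frame. The variable names `kept` and `result` are illustrative.

```python
import pandas as pd

# Assumption: all values are strings, and blank cells are '' (not NaN).
df = pd.DataFrame(
    {
        "Name": ["Tom", "nick", "", "krish", "jack", "", "jill", "nick"],
        "Age": ["20", "21", "", "19", "18", "", "26", ""],
    }
)

# Replace '' with NaN in a temporary frame and drop rows that are entirely NaN;
# how="all" keeps the same rows as thresh=1 (at least one non-NaN value).
kept = df.mask(df.eq("")).dropna(how="all").index

# Select the surviving labels from the original frame, so '' stays ''.
result = df.loc[kept]

print(result.index.tolist())      # [0, 1, 3, 4, 6, 7]
print(repr(result.loc[7, "Age"]))  # ''
```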

Answer 3

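Empty strings are actually interpreted as False, so dropping the rows that contain only empty strings is as simple as keeping the rows in which at least one field is non-empty (i.e. interpreted as True):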

df[df.any(axis=1)]

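Or, more briefly (only in older pandas versions; recent releases require `axis` to be passed by keyword here):

df[df.any(1)]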
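One caveat with the truthiness approach: any falsy value counts as "empty", not just `''`. The sketch below uses a small hypothetical frame (not the one from the question) in which one row holds a genuine `0`, and compares the result with an explicit comparison against `''`. `dtype=object` is passed so that the raw Python values are kept as-is.

```python
import pandas as pd

# Hypothetical data: row 1 has an empty Name but a real Score of 0.
df = pd.DataFrame(
    {"Name": ["Tom", "", ""], "Score": [5, 0, ""]},
    dtype=object,
)

# Truthiness: '' and 0 are both falsy, so row 1 is dropped as well.
by_truthiness = df[df.any(axis=1)]

# Explicit comparison: only cells equal to '' count as empty.
by_comparison = df[df.ne("").any(axis=1)]

print(by_truthiness.index.tolist())  # [0]
print(by_comparison.index.tolist())  # [0, 1]
```

If the frame can contain zeros or `False`, the explicit comparison is probably the safer choice.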

Drop rows if all columns are empty in a pandas DataFrame
