Question

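In my dataset, several rows contain letters. I only need the rows in which every value is an integer. What is the best way to do this? Given the dataset below, for example, I want to drop rows 2 and 3 because they contain the values 051A, 04A, and 08B.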

1   2017    0   321     3   20  42  18
2   051A    0   321     3   5   69  04A
3   460     0   1633    16  38  17  08B
4   1811    0   822     8   13  65  18

Answer 1

Not sure whether apply can be avoided here.

df.apply(lambda x: pd.to_numeric(x, errors = 'coerce')).dropna()

    0   1   2   3   4   5   6   7
0   1   2017.0  0   321 3   20  42  18.0
3   4   1811.0  0   822 8   13  65  18.0
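Columns 1 and 7 come back as floats in the output above because the coerced NaN values force those columns to a float dtype. Here is a minimal, self-contained sketch of the same approach with a cast back to int at the end. It assumes the data was read as strings, and the names numeric and result are only illustrative.

```python
import pandas as pd

# Sample data from the question, kept as strings so that values such as
# "051A" are preserved as-is (an assumption about how the data was loaded).
df = pd.DataFrame(
    [
        ["1", "2017", "0", "321", "3", "20", "42", "18"],
        ["2", "051A", "0", "321", "3", "5", "69", "04A"],
        ["3", "460", "0", "1633", "16", "38", "17", "08B"],
        ["4", "1811", "0", "822", "8", "13", "65", "18"],
    ]
)

# Values that cannot be parsed become NaN; rows containing NaN are dropped.
numeric = df.apply(lambda col: pd.to_numeric(col, errors="coerce")).dropna()

# Columns that held NaN were promoted to float, so cast everything back to int.
result = numeric.astype(int)
print(result)
```

The cast is safe here only because every remaining value is a whole number.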

Answer 2

This is very similar to @jpp's solution, but it uses a different technique to check whether a value is a number.

df[df.applymap(lambda x: str(x).isdecimal()).all(1)].astype(int)

   0     1  2    3  4   5   6   7
0  1  2017  0  321  3  20  42  18
3  4  1811  0  822  8  13  65  18

Thanks to @jpp for suggesting isdecimal instead of isdigit.
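To illustrate why isdecimal is the safer check here, this small sketch compares the two string methods on a few hand-picked samples (the sample strings are my own, not from the question). isdigit accepts some characters, such as superscript digits, that int() cannot parse, while isdecimal rejects them.

```python
# Hand-picked samples: a plain number, a mixed value from the question,
# a superscript two, a negative number, and an empty string.
samples = ["18", "04A", "²", "-5", ""]

# Map each sample to (isdigit result, isdecimal result).
checks = {s: (s.isdigit(), s.isdecimal()) for s in samples}

for s, (is_digit, is_decimal) in checks.items():
    print(repr(s), "isdigit:", is_digit, "isdecimal:", is_decimal)

# The superscript two passes isdigit but cannot be converted by int().
try:
    int("²")
    superscript_parses = True
except ValueError:
    superscript_parses = False
```

Note that both methods reject "-5", so this kind of check would also drop rows with negative integers.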

Answer 3

As an alternative to the other good answers, this solution uses the stack + unstack paradigm to avoid a loop-based solution.
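The code for this answer did not survive in the text, so what follows is only a sketch of how a stack + unstack approach could look, not necessarily what was originally posted. It assumes the data is held as strings, and the names stacked, masked and result are illustrative.

```python
import pandas as pd

# Sample data from the question, kept as strings (an assumption).
df = pd.DataFrame(
    [
        ["1", "2017", "0", "321", "3", "20", "42", "18"],
        ["2", "051A", "0", "321", "3", "5", "69", "04A"],
        ["3", "460", "0", "1633", "16", "38", "17", "08B"],
        ["4", "1811", "0", "822", "8", "13", "65", "18"],
    ]
)

# Flatten the frame into one long Series indexed by (row, column).
stacked = df.stack().astype(str)

# Keep values made up only of decimal digits; everything else becomes missing.
masked = stacked.where(stacked.str.isdecimal())

# Restore the original shape, drop rows with a missing value, convert to int.
result = masked.unstack().dropna().astype(int)
print(result)
```

The check runs once over the whole stacked Series through the vectorized str accessor, instead of calling a Python function on every cell.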

Answer 4

For this task, as stated above, try / except is a solution that should handle all cases.

pd.DataFrame.applymap applies a function to every element of a DataFrame.

def CheckInt(s):
    try: 
        int(s)
        return True
    except ValueError:
        return False

res = df[df.applymap(CheckInt).all(axis=1)].astype(int)

#    0     1  2    3  4   5   6   7
# 0  1  2017  0  321  3  20  42  18
# 3  4  1811  0  822  8  13  65  18
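One case where try / except behaves differently from a string-method check is signed values, since int() accepts "-5" and "+2". The sketch below uses a small made-up frame to show this. The name check_int and the sample data are hypothetical. Recent pandas releases deprecate applymap in favor of DataFrame.map, so the sketch uses map where it exists and falls back to applymap otherwise.

```python
import pandas as pd


def check_int(value):
    """Return True if int() can parse the value."""
    try:
        int(value)
        return True
    except (ValueError, TypeError):
        return False


# Hypothetical sample with signed values and one non-numeric string.
df = pd.DataFrame([["-5", "10"], ["7", "04A"], ["3", "+2"]])

# DataFrame.map is the newer name for applymap; use it where available.
elementwise = getattr(df, "map", None) or df.applymap

# Keep only the rows in which every element passes the check.
mask = elementwise(check_int).all(axis=1)
result = df[mask].astype(int)
print(result)
```

Keep in mind that int() also accepts floats by truncating them, so this check only works as intended when the cells are strings or integers.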

Answer 5

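As a one-liner, I think you can use the convert_objects function from pandas. With it, we convert the objects to numbers, which yields NA for any value that cannot be converted. Finally, we drop the rows that contain NA.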

df = df.convert_objects(convert_numeric=True).dropna()

You can find more information in the pandas documentation.
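As far as I know, convert_objects was deprecated and is no longer available in current pandas releases, so the line above may fail with an AttributeError. A rough equivalent, assuming string data like the sample in the question, passes pd.to_numeric to apply:

```python
import pandas as pd

# Sample data from the question, kept as strings (an assumption).
df = pd.DataFrame(
    [
        ["1", "2017", "0", "321", "3", "20", "42", "18"],
        ["2", "051A", "0", "321", "3", "5", "69", "04A"],
        ["3", "460", "0", "1633", "16", "38", "17", "08B"],
        ["4", "1811", "0", "822", "8", "13", "65", "18"],
    ]
)

# errors="coerce" plays the role of convert_numeric=True:
# values that cannot be parsed become NaN, and dropna removes those rows.
df = df.apply(pd.to_numeric, errors="coerce").dropna()
print(df)
```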

Answer 6

Let's assume the last column of your DataFrame is named Col.

If Col is not of string type:

df['Col'] = df['Col'].apply(str)

Then a single line keeps only the numeric rows:

df = df.loc[df['Col'].str.isnumeric()]

Python - Pandas Drop Row with strings

6 answers: