Question

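I have 5 columns: string, number, string, string, number. How can I skip the string columns and use only the numeric ones? Assume I don't know the indices of the string columns. For example, I have a csv file xyz, 80, + 40, 34, -133, abc, 151

How do I get only 80, 34, 151? I tried: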

df.select_dtypes(include=['integer'])

Then:

df.select_dtypes(exclude=['str'])

But the result still contains + 40, -133

Answer 1

Try using the pd.to_numeric(..., errors='coerce') method:

>>> df
   0  1  blah  2
0  1  3     5  7
1  2  4     6  8


>>> cols = df.columns[pd.to_numeric(df.columns, errors='coerce').to_series().notnull()]

>>> df[cols]
   0  1  2
0  1  3  7
1  2  4  8
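
Note that the snippet above filters on the column labels, not on the values. If the goal is to keep the columns whose values are numeric, the same errors='coerce' idea can be applied per column. This is only a minimal sketch; the sample data and column names are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data; the column names are made up for illustration.
df = pd.DataFrame({
    "name": ["xyz", "abc"],
    "count": [80, 34],
    "label": ["foo", "bar"],
    "score": [1.5, 2.5],
})

# Coerce each column's values; anything that does not parse becomes NaN.
converted = df.apply(pd.to_numeric, errors="coerce")

# Keep only the columns in which every value parsed as a number.
numeric_df = df.loc[:, converted.notna().all()]

print(list(numeric_df.columns))
```

Using .all() keeps a column only if every value parses; switching to .any() would also keep columns that are only partly numeric.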

Answer 2

I tried:

import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [23, 44], -12: [-1.0, -2.0]})
# Coerce every column to numeric, then drop columns that are entirely NaN
filter_data = df.apply(lambda x: pd.to_numeric(x, errors='coerce')).dropna(how='all', axis=1)
print(filter_data)

Result:

  -12  a   b
0 -1.0  1  23
1 -2.0  2  44

Then:

import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [23, 44], -12: [-1.0, -2.0]})
# Keep only the integer-typed columns first (this drops the float column -12)
data = df.select_dtypes(include=['integer'])
filter_data = data.apply(lambda x: pd.to_numeric(x, errors='coerce')).dropna(how='all', axis=1)
print(filter_data)

and it works fine:

   a   b
0  1  23
1  2  44
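
The second result no longer has the -12 column because its values are floats, and include=['integer'] only matches integer dtypes. As a rough sketch of the difference (reusing the frame from this answer, and assuming a reasonably recent pandas on Python 3), include=['number'] keeps both integer and float columns:

```python
import pandas as pd

# Same frame as in the answer above.
df = pd.DataFrame({"a": [1, 2], "b": [23, 44], -12: [-1.0, -2.0]})

# Only integer-typed columns: the float column -12 is dropped.
ints_only = df.select_dtypes(include=["integer"])

# All numeric columns, integer and float alike.
all_numeric = df.select_dtypes(include=["number"])

print(list(ints_only.columns))
print(list(all_numeric.columns))
```

Keep in mind that select_dtypes looks at each column's dtype, not at individual values, so a column read in as text stays excluded even if some of its entries look like numbers.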

Filtering a DataFrame by data type in pandas
