Question

I have a DataFrame like this:

>>> d
Out[28]: 
                         A                     B      C      D       E
2017-06-08 20:39:00 1260.00  1903-08-12 00:00:00 230.00 245.00 19954.55
2017-06-08 20:40:00 1260.00                 1330 230.00 245.00 19966.51
2017-06-08 20:48:00 1260.00                 1320 230.00 240.00 19961.00
2017-06-08 21:02:00 1240.00                 1330 230.00 245.00 19951.38
2017-06-08 21:06:00 1240.00                 1340   5.00 240.00 19966.84
2017-06-08 21:07:00 1240.00                 1350 220.00 230.00 20000.24
2017-06-08 21:08:00 1250.00                 1370 220.00 230.00 20004.66
2017-06-11 20:31:00 1220.00                 1280 235.00 245.00 19913.86

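I want to remove every value of type datetime.datetime from all columns except A (here, that is the first value in column B). I tried the following, but it did not work (the idea was to convert the datetime values to NaN and drop the NaN values afterwards):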

d[type(d)==pd.datetime]=np.nan

I also tried it column by column, like this:

df=d['B'].copy()
df[type(df)==pd.datetime]=np.nan

Answer 1

Simple boolean indexing is not enough here. You need to check each item individually to see whether it is a datetime.
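To illustrate why the attempt in the question has no effect, here is a minimal sketch. The Series `s` is made up for illustration and only mimics column B. Comparing `type(...)` of the whole container yields a single Python bool, whereas an element-wise check yields one bool per item.

```python
import datetime

import pandas as pd

# Hypothetical stand-in for column B: one datetime among plain integers.
s = pd.Series([datetime.datetime(1903, 8, 12), 1330, 1320], dtype=object)

# type(s) is pandas.Series, so this is one scalar False, not a mask.
whole_object_check = type(s) == datetime.datetime

# An element-wise check produces a boolean mask with one entry per item.
per_item_check = s.map(lambda x: isinstance(x, datetime.datetime))

print(whole_object_check)
print(per_item_check.tolist())
```

Only the second form can be used to select or blank out individual cells.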

Input:

In [239]: df
Out[239]: 
                  Col1                 Col2
0  1903-08-12 00:00:00                    1
1                    1                  abc
2                    2                    2
3                 1234                 1234
4                  abc  1903-08-12 00:00:00

Option 1

Use df.apply with pd.to_datetime, then df.isnull and boolean indexing. Finally, use df.dropna to drop the rows that contain NaN.

In [290]: df[df.apply(pd.to_datetime, errors='coerce').isnull()].dropna()
Out[290]: 
   Col1  Col2
1     1   abc
2     2     2
3  1234  1234

Option 2

Apply pd.to_datetime directly (without df.apply):

In [57]: df[pd.to_datetime(df.stack(), errors='coerce').unstack().isnull()].dropna()
Out[57]: 
   Col1  Col2
1     1   abc
2     2     2
3  1234  1234

Option 3

Use df.mask (thanks, piRSquared!):

In [62]: df.mask(pd.to_datetime(df.stack(), errors='coerce').notnull().unstack()).dropna()
Out[62]: 
   Col1  Col2
1     1   abc
2     2     2
3  1234  1234
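Options 1 to 3 all rest on the same idea: coerce every cell with pd.to_datetime and treat whatever fails to parse as "not a date". The sketch below shows that idea step by step. It assumes string-only cells, and the frame is made up for illustration (it is not the input shown above). Keep in mind that pd.to_datetime can also interpret plain integers or numeric strings as dates, so on mixed data the outcome may depend on the pandas version.

```python
import pandas as pd

# Hypothetical string-only frame with one date-like cell per outer row.
df = pd.DataFrame({
    "Col1": ["1903-08-12 00:00:00", "abc", "foo"],
    "Col2": ["xyz", "bar", "1903-08-12 00:00:00"],
})

# Flatten to a single Series, coerce unparseable cells to NaT,
# then restore the original two-dimensional shape.
parsed = pd.to_datetime(df.stack(), errors="coerce").unstack()

# True wherever a cell could be parsed as a date.
is_date = parsed.notnull()

# Keep only the rows in which no cell parsed as a date.
result = df[~is_date.any(axis=1)]
print(result)
```

Filtering on `is_date.any(axis=1)` selects rows directly, so the surviving values never pass through NaN and keep their original types.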

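Option 4

You can use df.applymap with an isinstance check. Note that pd.datetime was only an alias for datetime.datetime and was removed in pandas 2.0, so on newer versions use datetime.datetime from the standard library instead: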

In [240]: df[~df.applymap(lambda x: isinstance(x, pd.datetime))].dropna()
Out[240]: 
   Col1  Col2
1     1   abc
2     2     2
3  1234  1234
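A version-tolerant sketch of the same type check follows. It assumes the cells hold real datetime.datetime objects, and it uses Series.map per column rather than applymap, which newer pandas releases deprecate. The frame and the helper name `is_datetime` are illustrative, not part of the original answer.

```python
import datetime

import pandas as pd

# Hypothetical frame mirroring the answer's input, with real datetime objects.
df = pd.DataFrame({
    "Col1": [datetime.datetime(1903, 8, 12), 1, 2, 1234, "abc"],
    "Col2": [1, "abc", 2, 1234, datetime.datetime(1903, 8, 12)],
})


def is_datetime(value):
    """Return True for datetime.datetime instances (pd.Timestamp included)."""
    return isinstance(value, datetime.datetime)


# Element-wise mask built column by column.
mask = df.apply(lambda col: col.map(is_datetime))

# Drop every row that contains at least one datetime.
result = df[~mask.any(axis=1)]
print(result)
```

Unlike the pd.to_datetime approaches, this checks the Python type of each cell, so strings that merely look like dates are left alone.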

Answer 2

I found another solution, though I am not sure whether it is the best one.

df = d.iloc[:, 1:].convert_objects(convert_dates=False, convert_numeric=True)
df.dropna()
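convert_objects has since been removed from pandas, so the snippet above will not run on current releases. A rough equivalent, sketched here under the assumption that every column after the first is meant to be numeric, is to apply pd.to_numeric with errors="coerce" column by column. The small frame below is a made-up excerpt of the question's data.

```python
import datetime

import pandas as pd

# Hypothetical excerpt of the question's frame (datetime index omitted).
d = pd.DataFrame({
    "A": [1260.0, 1260.0, 1260.0],
    "B": [datetime.datetime(1903, 8, 12), 1330, 1320],
    "C": [230.0, 230.0, 230.0],
})

# Coerce anything that is not a number (the stray datetime included) to NaN.
numeric = d.iloc[:, 1:].apply(pd.to_numeric, errors="coerce")

# Drop the rows that now contain NaN.
cleaned = numeric.dropna()
print(cleaned)
```

As in the original snippet, the first column is left out of the result because of the `iloc[:, 1:]` slice.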

Dropping DataFrame rows that contain a specific type
