Pandas: find the minimum of a date (NOT DATETIME) column

Time: 2018-05-18 12:18:24

Tags: python pandas datetime

I have a pandas DataFrame df in which the column Datetime holds datetime objects. If I do this:

df['Datetime'].min()

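pandas returns the correct answer, the earliest available date. However, if I want to work with date objects rather than datetimes, and therefore create a Date column of datetime.date objects like this: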

df['Date'] = df['Datetime'].dt.date
df['Date'].min()

I get back

TypeError: '<=' not supported between instances of 'float' and 'datetime.date'

Is this a pandas bug? How can I fix it? I am using Python 3.6 and pandas 0.20.3.

1 answer:

Answer 0 (score: 1)

In pandas 0.23.0, your code works fine as long as there are no NaT values:

import numpy as np
import pandas as pd

rng = pd.date_range('2017-04-03 14:10:01', periods=10, freq='15H')
df = pd.DataFrame({'Datetime': rng, 'a': range(10)})  

df['Date'] = df['Datetime'].dt.date
print (df['Date'].min())

2017-04-03

With a NaT value in the column, the same call fails:

rng = pd.date_range('2017-04-03 14:10:01', periods=10, freq='15H')
df = pd.DataFrame({'Datetime': rng, 'a': range(10)})  
# setting with enlargement: adds row 10 filled with missing values, so Datetime gets a NaT
df.loc[len(df), 'Date'] = np.nan
df['Date'] = df['Datetime'].dt.date

print (df)
              Datetime    a        Date
0  2017-04-03 14:10:01  0.0  2017-04-03
1  2017-04-04 05:10:01  1.0  2017-04-04
2  2017-04-04 20:10:01  2.0  2017-04-04
3  2017-04-05 11:10:01  3.0  2017-04-05
4  2017-04-06 02:10:01  4.0  2017-04-06
5  2017-04-06 17:10:01  5.0  2017-04-06
6  2017-04-07 08:10:01  6.0  2017-04-07
7  2017-04-07 23:10:01  7.0  2017-04-07
8  2017-04-08 14:10:01  8.0  2017-04-08
9  2017-04-09 05:10:01  9.0  2017-04-09
10                 NaT  NaN         NaT

print (df['Date'].min())
  

TypeError: unorderable types: datetime.date() <= float()
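
To see what the failing column contains, a small check along these lines may help. It is only a sketch on hypothetical three-row data (not the frame above): it shows that `.dt.date` produces an object-dtype column rather than a datetime64 one, and counts the missing entries in it.

```python
import pandas as pd

# Hypothetical data: two timestamps and one missing value (parsed as NaT).
df = pd.DataFrame({'Datetime': pd.to_datetime(
    ['2017-04-03 14:10:01', '2017-04-04 05:10:01', None])})
df['Date'] = df['Datetime'].dt.date

date_dtype = df['Date'].dtype             # object dtype, not datetime64
n_missing = int(df['Date'].isna().sum())  # how many entries are missing
print(date_dtype, n_missing)
```

If the count is non-zero, the column mixes datetime.date objects with missing values, which is the situation the error above arises in.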

A solution that works with NaT:

#alternative
print (min(df['Date'].tolist()))
#print (min(df['Date'].values))

2017-04-03
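
The built-in `min` over a list depends on how NaT compares with `datetime.date`, and that may differ between pandas versions. A sketch that avoids the comparison altogether drops the missing values before taking the minimum. The three-row frame here is hypothetical, and `dropna` is a swapped-in approach, not the one used in the answer above.

```python
import datetime
import pandas as pd

# Hypothetical data: two timestamps and one missing value (parsed as NaT).
df = pd.DataFrame({'Datetime': pd.to_datetime(
    ['2017-04-03 14:10:01', '2017-04-04 05:10:01', None])})
df['Date'] = df['Datetime'].dt.date

# Remove missing entries first, so only datetime.date objects are compared.
earliest = df['Date'].dropna().min()
print(earliest)
```

The result is a plain datetime.date, the same type as the other values in the column.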

Another solution:

Use floor by day, which keeps datetimes instead of converting to dates:

df['Date'] = df['Datetime'].dt.floor('d')

Sample:

rng = pd.date_range('2017-04-03 14:10:01', periods=10, freq='15H')
df = pd.DataFrame({'Datetime': rng, 'a': range(10)})  

df['Date'] = df['Datetime'].dt.floor('d')
print (df)
             Datetime  a       Date
0 2017-04-03 14:10:01  0 2017-04-03
1 2017-04-04 05:10:01  1 2017-04-04
2 2017-04-04 20:10:01  2 2017-04-04
3 2017-04-05 11:10:01  3 2017-04-05
4 2017-04-06 02:10:01  4 2017-04-06
5 2017-04-06 17:10:01  5 2017-04-06
6 2017-04-07 08:10:01  6 2017-04-07
7 2017-04-07 23:10:01  7 2017-04-07
8 2017-04-08 14:10:01  8 2017-04-08
9 2017-04-09 05:10:01  9 2017-04-09

print (df['Datetime'].min())
2017-04-03 14:10:01

print (df['Datetime'].min().date())
2017-04-03
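
The sample above has no missing values. As a rough sketch of how the floor approach behaves when a NaT is present, on hypothetical three-row data: the floored column stays datetime64, so `min` skips the NaT, and `.date()` converts the result at the end. The `'D'` alias here is the uppercase spelling of the daily frequency.

```python
import datetime
import pandas as pd

# Hypothetical data: two timestamps and one missing value (parsed as NaT).
df = pd.DataFrame({'Datetime': pd.to_datetime(
    ['2017-04-03 14:10:01', '2017-04-04 05:10:01', None])})

# Flooring to the day keeps the datetime64 dtype (no Python date objects).
df['Date'] = df['Datetime'].dt.floor('D')

earliest = df['Date'].min()      # a Timestamp; the NaT is skipped
earliest_date = earliest.date()  # convert to datetime.date only at the end
print(earliest, earliest_date)
```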