Pandas date column subtraction

Time: 2016-02-21 12:22:10

Tags: python datetime pandas

I have a pandas DataFrame like this:

       created_time  reached_time
2016-01-02 12:57:44      14:20:22
2016-01-02 12:57:44      13:01:38
2016-01-03 10:38:51      12:24:07
2016-01-03 10:38:51      12:32:11
2016-01-03 10:38:52      12:23:20
2016-01-03 10:38:52      12:51:34
2016-01-03 10:38:52      12:53:33
2016-01-03 10:38:52      13:04:08
2016-01-03 10:38:52      13:13:40

I want to subtract these two columns and get a time difference as the result.

I am doing the following in Python:

speed['created_time'].dt.time - speed['reached_time']

But it gives me the following error:

TypeError: ufunc subtract cannot use operands with types dtype('O') and dtype('<m8[ns]')

The dtype of created_time is object, and the dtype of reached_time is timedelta64[ns].
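A minimal sketch of where the mismatch comes from, assuming created_time has already been parsed as datetimes (the one-row frame below is illustrative data, not the asker's full table). The .dt.time accessor returns plain datetime.time objects, so the result has object dtype, and those objects do not support subtraction with timedeltas:

```python
import pandas as pd

# Illustrative one-row frame; the column names follow the question.
speed = pd.DataFrame({
    'created_time': pd.to_datetime(['2016-01-02 12:57:44']),
    'reached_time': pd.to_timedelta(['14:20:22']),
})

# .dt.time yields Python datetime.time objects, stored with object dtype.
time_part = speed['created_time'].dt.time

print(time_part.dtype)              # dtype of the left operand
print(speed['reached_time'].dtype)  # dtype of the right operand
```

The two printed dtypes correspond to the two operand types named in the error message.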

2 answers:

Answer 0 (score: 3)

You can drop down to NumPy arrays and do the datetime/timedelta arithmetic there. First, create an array of dates with dtype datetime64[D]:

dates = speed['created_time'].values.astype('datetime64[D]')

Then you have two options. You can convert reached_time to datetimes and subtract datetimes from datetimes:

speed['reached_date'] = dates + speed['reached_time'].values
speed['diff'] = speed['created_time'] - speed['reached_date']

Or you can convert created_time to timedeltas and subtract timedeltas from timedeltas:

speed['created_delta'] = speed['created_time'].values - dates
speed['diff'] = speed['created_delta'] - speed['reached_time']

import pandas as pd

speed = pd.DataFrame(
    {'created_time': 
     ['2016-01-02 12:57:44', '2016-01-02 12:57:44', '2016-01-03 10:38:51',
      '2016-01-03 10:38:51', '2016-01-03 10:38:52', '2016-01-03 10:38:52',
      '2016-01-03 10:38:52', '2016-01-03 10:38:52', '2016-01-03 10:38:52'],
     'reached_time': 
     ['14:20:22', '13:01:38', '12:24:07', '12:32:11', '12:23:20', 
      '12:51:34', '12:53:33', '13:04:08', '13:13:40']})
speed['reached_time'] = pd.to_timedelta(speed['reached_time'])
speed['created_time'] = pd.to_datetime(speed['created_time'])

dates = speed['created_time'].values.astype('datetime64[D]')

speed['reached_date'] = dates + speed['reached_time'].values
speed['diff'] = speed['created_time'] - speed['reached_date']

# alternatively
# speed['created_delta'] = speed['created_time'].values - dates
# speed['diff'] = speed['created_delta'] - speed['reached_time']

print(speed)

yields

         created_time  reached_time        reached_date              diff
0 2016-01-02 12:57:44      14:20:22 2016-01-02 14:20:22 -1 days +22:37:22
1 2016-01-02 12:57:44      13:01:38 2016-01-02 13:01:38 -1 days +23:56:06
2 2016-01-03 10:38:51      12:24:07 2016-01-03 12:24:07 -1 days +22:14:44
3 2016-01-03 10:38:51      12:32:11 2016-01-03 12:32:11 -1 days +22:06:40
4 2016-01-03 10:38:52      12:23:20 2016-01-03 12:23:20 -1 days +22:15:32
5 2016-01-03 10:38:52      12:51:34 2016-01-03 12:51:34 -1 days +21:47:18
6 2016-01-03 10:38:52      12:53:33 2016-01-03 12:53:33 -1 days +21:45:19
7 2016-01-03 10:38:52      13:04:08 2016-01-03 13:04:08 -1 days +21:34:44
8 2016-01-03 10:38:52      13:13:40 2016-01-03 13:13:40 -1 days +21:25:12

Using HRYR's improvement, you can do the computation without dropping down to NumPy arrays (i.e. without accessing .values):

dates = speed['created_time'].dt.normalize()
speed['reached_date'] = dates + speed['reached_time']
speed['diff'] = speed['created_time'] - speed['reached_date']
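Values such as -1 days +22:37:22 are how pandas displays negative timedeltas. As a hedged follow-up sketch (the diff_seconds and abs_diff column names are made up for illustration, and only the first two kinds of rows from the question are used), the result can be turned into signed seconds or an absolute duration:

```python
import pandas as pd

# Two sample rows taken from the question's data.
speed = pd.DataFrame({
    'created_time': pd.to_datetime(['2016-01-02 12:57:44', '2016-01-03 10:38:51']),
    'reached_time': pd.to_timedelta(['14:20:22', '12:24:07']),
})

# Same steps as above: midnight of each date, plus the time of day reached.
dates = speed['created_time'].dt.normalize()
speed['reached_date'] = dates + speed['reached_time']
speed['diff'] = speed['created_time'] - speed['reached_date']

# Signed difference in seconds (negative when created_time is earlier).
speed['diff_seconds'] = speed['diff'].dt.total_seconds()

# Magnitude of the difference, ignoring direction.
speed['abs_diff'] = speed['diff'].abs()

print(speed[['diff', 'diff_seconds', 'abs_diff']])
```

Swapping the operands (reached_date minus created_time) would give positive durations directly, if that is the direction you care about.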

Answer 1 (score: 2)

First convert the created_time column to datetime:

df["created_time"] = pd.to_datetime(df["created_time"])

Then use df["created_time"] - df["created_time"].dt.normalize() to get the time part as a timedelta type.
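Put together, this approach might look like the sketch below. It assumes both columns start out as strings, as in the question's table; the created_delta and diff column names are borrowed from the other answer for illustration, and only two sample rows are used:

```python
import pandas as pd

# Two sample rows from the question, starting as plain strings.
df = pd.DataFrame({
    'created_time': ['2016-01-02 12:57:44', '2016-01-03 10:38:51'],
    'reached_time': ['14:20:22', '12:24:07'],
})

# Parse the strings into datetime and timedelta columns.
df['created_time'] = pd.to_datetime(df['created_time'])
df['reached_time'] = pd.to_timedelta(df['reached_time'])

# Time of day as a timedelta: the datetime minus midnight of the same day.
df['created_delta'] = df['created_time'] - df['created_time'].dt.normalize()

# Both operands are now timedeltas, so the subtraction is well defined.
df['diff'] = df['created_delta'] - df['reached_time']

print(df)
```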