How do I subtract values of a pandas column that correspond to different times of day in Python?

Time: 2016-04-17 23:37:45

Tags: python python-2.7 date datetime pandas

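How can I subtract values taken at different times of day from the same CSV in Python?

For example, I want to subtract the 09:15:00 HIGH from the 09:30:00 HIGH.

I have tried several different approaches but keep struggling.

This is what I tried.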

 exm = pd.read_csv('exm')

a915 = exm.HIGH.at_time("09:15:00")
a930 = exm.HIGH.at_time("09:30:00")

exm.sub13 = a915 - a930

Also,

 sub13 = a915 - a930

Also,

a915 = exm.at_time("09:15:00")
a930 = exm.at_time("09:30:00")

exm.sub13 = a915 - a930

Also,

sub13 = a915 - a930

I can't even get it to pull out a single column on its own.

Thanks for your help!

DATE,TIME,OPEN,HIGH,LOW,CLOSE,VOLUME
02/03/1997,09:04:00,3046.00,3048.50,3046.00,3047.50,505
02/03/1997,09:05:00,3047.00,3048.00,3046.00,3047.00,162
02/03/1997,09:06:00,3047.50,3048.00,3047.00,3047.50,98
02/03/1997,09:07:00,3047.50,3047.50,3047.00,3047.50,228
02/03/1997,09:08:00,3048.00,3048.00,3047.50,3048.00,136
02/03/1997,09:09:00,3048.00,3048.00,3046.50,3046.50,174
02/03/1997,09:10:00,3046.50,3046.50,3045.00,3045.00,134
02/03/1997,09:11:00,3045.50,3046.00,3044.00,3045.00,43
02/03/1997,09:12:00,3045.00,3045.50,3045.00,3045.00,214
02/03/1997,09:13:00,3045.50,3045.50,3045.50,3045.50,8
02/03/1997,09:14:00,3045.50,3046.00,3044.50,3044.50,152
02/03/1997,09:15:00,3044.00,3044.00,3042.50,3042.50,126
02/03/1997,09:16:00,3043.50,3043.50,3043.00,3043.00,128
02/03/1997,09:17:00,3042.50,3043.50,3042.50,3043.50,23
02/03/1997,09:18:00,3043.50,3044.50,3043.00,3044.00,51
02/03/1997,09:19:00,3044.50,3044.50,3043.00,3043.00,18
02/03/1997,09:20:00,3043.00,3045.00,3043.00,3045.00,23
02/03/1997,09:21:00,3045.00,3045.00,3044.50,3045.00,51
02/03/1997,09:22:00,3045.00,3045.00,3045.00,3045.00,47
02/03/1997,09:23:00,3045.50,3046.00,3045.00,3045.00,77
02/03/1997,09:24:00,3045.00,3045.00,3045.00,3045.00,131
02/03/1997,09:25:00,3044.50,3044.50,3043.50,3043.50,138
02/03/1997,09:26:00,3043.50,3043.50,3043.50,3043.50,6
02/03/1997,09:27:00,3043.50,3043.50,3043.00,3043.00,56
02/03/1997,09:28:00,3043.00,3044.00,3043.00,3044.00,32
02/03/1997,09:29:00,3044.50,3044.50,3044.50,3044.50,63
02/03/1997,09:30:00,3045.00,3045.00,3045.00,3045.00,28
02/03/1997,09:31:00,3045.00,3045.50,3045.00,3045.50,75
02/03/1997,09:32:00,3045.50,3045.50,3044.00,3044.00,54
02/03/1997,09:33:00,3043.50,3044.50,3043.50,3044.00,96
02/03/1997,09:34:00,3044.00,3044.50,3044.00,3044.50,27
02/03/1997,09:35:00,3044.50,3044.50,3043.50,3044.50,44
02/03/1997,09:36:00,3044.00,3044.00,3043.00,3043.00,61
02/03/1997,09:37:00,3043.50,3043.50,3043.50,3043.50,18
02/03/1997,09:38:00,3043.50,3045.00,3043.50,3045.00,156

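2 answers:

Answer 0 (score: 1)

You can use strptime from the datetime module to build datetime objects for your times, then subtract them to get the difference. For example: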

>>> import datetime
>>> t1=datetime.datetime.strptime('01/01/2016 20:00:00', "%d/%m/%Y %H:%M:%S")
>>> t2=datetime.datetime.strptime('01/01/2016 21:00:00', "%d/%m/%Y %H:%M:%S")

>>> t2-t1
datetime.timedelta(0, 3600)
>>> (t2-t1).seconds
3600

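Answer 1 (score: 1)

I think you can first convert the DATE and TIME columns to datetime with the parse_dates parameter, and then set the index from the new DATE_TIME column in read_csv: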

import pandas as pd
import io

temp=u"""DATE,TIME,OPEN,HIGH,LOW,CLOSE,VOLUME
02/03/1997,09:04:00,3046.00,3048.50,3046.00,3047.50,505
02/03/1997,09:05:00,3047.00,3048.00,3046.00,3047.00,162
02/03/1997,09:06:00,3047.50,3048.00,3047.00,3047.50,98
02/03/1997,09:07:00,3047.50,3047.50,3047.00,3047.50,228
02/03/1997,09:08:00,3048.00,3048.00,3047.50,3048.00,136
02/03/1997,09:09:00,3048.00,3048.00,3046.50,3046.50,174
02/03/1997,09:10:00,3046.50,3046.50,3045.00,3045.00,134
02/03/1997,09:11:00,3045.50,3046.00,3044.00,3045.00,43
02/03/1997,09:12:00,3045.00,3045.50,3045.00,3045.00,214
02/03/1997,09:13:00,3045.50,3045.50,3045.50,3045.50,8
02/03/1997,09:14:00,3045.50,3046.00,3044.50,3044.50,152
02/03/1997,09:15:00,3044.00,3044.00,3042.50,3042.50,126
02/03/1997,09:16:00,3043.50,3043.50,3043.00,3043.00,128
02/03/1997,09:17:00,3042.50,3043.50,3042.50,3043.50,23
02/03/1997,09:18:00,3043.50,3044.50,3043.00,3044.00,51
02/03/1997,09:19:00,3044.50,3044.50,3043.00,3043.00,18
02/03/1997,09:20:00,3043.00,3045.00,3043.00,3045.00,23
02/03/1997,09:21:00,3045.00,3045.00,3044.50,3045.00,51
02/03/1997,09:22:00,3045.00,3045.00,3045.00,3045.00,47
02/03/1997,09:23:00,3045.50,3046.00,3045.00,3045.00,77
02/03/1997,09:24:00,3045.00,3045.00,3045.00,3045.00,131
02/03/1997,09:25:00,3044.50,3044.50,3043.50,3043.50,138
02/03/1997,09:26:00,3043.50,3043.50,3043.50,3043.50,6
02/03/1997,09:27:00,3043.50,3043.50,3043.00,3043.00,56
02/03/1997,09:28:00,3043.00,3044.00,3043.00,3044.00,32
02/03/1997,09:29:00,3044.50,3044.50,3044.50,3044.50,63
02/03/1997,09:30:00,3045.00,3045.00,3045.00,3045.00,28
02/03/1997,09:31:00,3045.00,3045.50,3045.00,3045.50,75"""
# after testing, replace io.StringIO(temp) with the filename
exm = pd.read_csv(io.StringIO(temp), parse_dates = [['DATE', 'TIME']], index_col=0)
print exm
                       OPEN    HIGH     LOW   CLOSE  VOLUME
DATE_TIME                                                  
1997-02-03 09:04:00  3046.0  3048.5  3046.0  3047.5     505
1997-02-03 09:05:00  3047.0  3048.0  3046.0  3047.0     162
1997-02-03 09:06:00  3047.5  3048.0  3047.0  3047.5      98
1997-02-03 09:07:00  3047.5  3047.5  3047.0  3047.5     228
1997-02-03 09:08:00  3048.0  3048.0  3047.5  3048.0     136
1997-02-03 09:09:00  3048.0  3048.0  3046.5  3046.5     174
1997-02-03 09:10:00  3046.5  3046.5  3045.0  3045.0     134
1997-02-03 09:11:00  3045.5  3046.0  3044.0  3045.0      43
1997-02-03 09:12:00  3045.0  3045.5  3045.0  3045.0     214
1997-02-03 09:13:00  3045.5  3045.5  3045.5  3045.5       8
1997-02-03 09:14:00  3045.5  3046.0  3044.5  3044.5     152
1997-02-03 09:15:00  3044.0  3044.0  3042.5  3042.5     126
1997-02-03 09:16:00  3043.5  3043.5  3043.0  3043.0     128
1997-02-03 09:17:00  3042.5  3043.5  3042.5  3043.5      23
1997-02-03 09:18:00  3043.5  3044.5  3043.0  3044.0      51
1997-02-03 09:19:00  3044.5  3044.5  3043.0  3043.0      18
1997-02-03 09:20:00  3043.0  3045.0  3043.0  3045.0      23
1997-02-03 09:21:00  3045.0  3045.0  3044.5  3045.0      51
1997-02-03 09:22:00  3045.0  3045.0  3045.0  3045.0      47
1997-02-03 09:23:00  3045.5  3046.0  3045.0  3045.0      77
1997-02-03 09:24:00  3045.0  3045.0  3045.0  3045.0     131
1997-02-03 09:25:00  3044.5  3044.5  3043.5  3043.5     138
1997-02-03 09:26:00  3043.5  3043.5  3043.5  3043.5       6
1997-02-03 09:27:00  3043.5  3043.5  3043.0  3043.0      56
1997-02-03 09:28:00  3043.0  3044.0  3043.0  3044.0      32
1997-02-03 09:29:00  3044.5  3044.5  3044.5  3044.5      63
1997-02-03 09:30:00  3045.0  3045.0  3045.0  3045.0      28
1997-02-03 09:31:00  3045.0  3045.5  3045.0  3045.5      75
a915 = exm.HIGH.at_time("09:15:00")
a930 = exm.HIGH.at_time("09:30:00")
print a915
DATE_TIME
1997-02-03 09:15:00    3044.0
Name: HIGH, dtype: float64

print a930
DATE_TIME
1997-02-03 09:30:00    3045.0
Name: HIGH, dtype: float64
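
As far as I know, newer pandas releases deprecate the nested-list form of parse_dates used above, so here is a minimal sketch of an alternative that builds the same DATE_TIME index with pd.to_datetime. It assumes the dates are month/day/year (which matches the output shown above) and uses only three rows of the sample data; the names csv_text and stamps are mine, not from the question.

```python
import io
import pandas as pd

# Three rows of the sample data from the question
csv_text = u"""DATE,TIME,OPEN,HIGH,LOW,CLOSE,VOLUME
02/03/1997,09:14:00,3045.50,3046.00,3044.50,3044.50,152
02/03/1997,09:15:00,3044.00,3044.00,3042.50,3042.50,126
02/03/1997,09:30:00,3045.00,3045.00,3045.00,3045.00,28
"""

exm = pd.read_csv(io.StringIO(csv_text))

# Assumption: the dates are month/day/year
stamps = pd.to_datetime(exm['DATE'] + ' ' + exm['TIME'],
                        format='%m/%d/%Y %H:%M:%S')

# Drop the two text columns and use the combined timestamps as the index
exm = exm.drop(['DATE', 'TIME'], axis=1).set_index(stamps)
exm.index.name = 'DATE_TIME'

# at_time needs a DatetimeIndex, which the frame now has
a915 = exm['HIGH'].at_time('09:15:00')
print(a915)
```

The original attempts in the question most likely failed at this step: a plain pd.read_csv('exm') leaves the default integer index, and at_time only works on a DatetimeIndex.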

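If you need to subtract Series (columns), they need the same indexes. Otherwise pandas aligns the two Series on their index labels, finds no matching labels, and you get NaN: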

print a915 - a930
DATE_TIME
1997-02-03 09:15:00   NaN
1997-02-03 09:30:00   NaN
Name: HIGH, dtype: float64

If you only need to subtract the values in the HIGH column, convert the Series (columns) to numpy arrays with values:

print a915.values - a930.values
[-1.]
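
The two behaviours above can be reproduced in isolation. This is a small sketch with hand-built Series holding the same two HIGH values as the sample data; it only illustrates index alignment versus positional subtraction, and the variable names by_label and by_position are mine.

```python
import pandas as pd

# One-element Series labelled by different timestamps, as at_time returns them
a915 = pd.Series([3044.0],
                 index=pd.to_datetime(['1997-02-03 09:15:00']), name='HIGH')
a930 = pd.Series([3045.0],
                 index=pd.to_datetime(['1997-02-03 09:30:00']), name='HIGH')

# Series arithmetic aligns on index labels. The result covers the union of
# both indexes, and since no label occurs in both, every entry is NaN.
by_label = a915 - a930
print(by_label)

# The underlying numpy arrays carry no labels, so subtraction is positional
by_position = a915.values - a930.values
print(by_position)
```

Positional subtraction only makes sense when both arrays have the same length and the same ordering, which holds here because each Series has exactly one row.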

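But if you need to add a new column sub13, you need to change the index of Series a930 to the index of a915. Then you can subtract the values, and the output lands in the row with the index of a915, 1997-02-03 09:15:00. All other values are missing (NaN):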

print a915
DATE_TIME
1997-02-03 09:15:00    3044.0
Name: HIGH, dtype: float64

print pd.Series(a930.values, index=a915.index)
DATE_TIME
1997-02-03 09:15:00    3045.0
dtype: float64

exm['sub13'] = a915 - pd.Series(a930.values, index=a915.index)
print exm
                       OPEN    HIGH     LOW   CLOSE  VOLUME  sub13
DATE_TIME                                                         
1997-02-03 09:04:00  3046.0  3048.5  3046.0  3047.5     505    NaN
1997-02-03 09:05:00  3047.0  3048.0  3046.0  3047.0     162    NaN
1997-02-03 09:06:00  3047.5  3048.0  3047.0  3047.5      98    NaN
1997-02-03 09:07:00  3047.5  3047.5  3047.0  3047.5     228    NaN
1997-02-03 09:08:00  3048.0  3048.0  3047.5  3048.0     136    NaN
1997-02-03 09:09:00  3048.0  3048.0  3046.5  3046.5     174    NaN
1997-02-03 09:10:00  3046.5  3046.5  3045.0  3045.0     134    NaN
1997-02-03 09:11:00  3045.5  3046.0  3044.0  3045.0      43    NaN
1997-02-03 09:12:00  3045.0  3045.5  3045.0  3045.0     214    NaN
1997-02-03 09:13:00  3045.5  3045.5  3045.5  3045.5       8    NaN
1997-02-03 09:14:00  3045.5  3046.0  3044.5  3044.5     152    NaN
1997-02-03 09:15:00  3044.0  3044.0  3042.5  3042.5     126   -1.0
1997-02-03 09:16:00  3043.5  3043.5  3043.0  3043.0     128    NaN
1997-02-03 09:17:00  3042.5  3043.5  3042.5  3043.5      23    NaN
1997-02-03 09:18:00  3043.5  3044.5  3043.0  3044.0      51    NaN
1997-02-03 09:19:00  3044.5  3044.5  3043.0  3043.0      18    NaN
1997-02-03 09:20:00  3043.0  3045.0  3043.0  3045.0      23    NaN
1997-02-03 09:21:00  3045.0  3045.0  3044.5  3045.0      51    NaN
1997-02-03 09:22:00  3045.0  3045.0  3045.0  3045.0      47    NaN
1997-02-03 09:23:00  3045.5  3046.0  3045.0  3045.0      77    NaN
1997-02-03 09:24:00  3045.0  3045.0  3045.0  3045.0     131    NaN
1997-02-03 09:25:00  3044.5  3044.5  3043.5  3043.5     138    NaN
1997-02-03 09:26:00  3043.5  3043.5  3043.5  3043.5       6    NaN
1997-02-03 09:27:00  3043.5  3043.5  3043.0  3043.0      56    NaN
1997-02-03 09:28:00  3043.0  3044.0  3043.0  3044.0      32    NaN
1997-02-03 09:29:00  3044.5  3044.5  3044.5  3044.5      63    NaN
1997-02-03 09:30:00  3045.0  3045.0  3045.0  3045.0      28    NaN
1997-02-03 09:31:00  3045.0  3045.5  3045.0  3045.5      75    NaN
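
If the file covers more than one trading day, the same re-indexing idea can be applied per day. The sketch below is only an illustration under assumptions: the second day's values are made up (the question shows a single day), and it relabels both Series by calendar date with DatetimeIndex.normalize() so that pandas aligns them day by day.

```python
import pandas as pd

# Day 1 uses the HIGH values from the question; day 2 is hypothetical data
idx = pd.to_datetime(['1997-02-03 09:15:00', '1997-02-03 09:30:00',
                      '1997-02-04 09:15:00', '1997-02-04 09:30:00'])
exm = pd.DataFrame({'HIGH': [3044.0, 3045.0, 3050.0, 3048.5]}, index=idx)

a915 = exm['HIGH'].at_time('09:15:00')
a930 = exm['HIGH'].at_time('09:30:00')

# Relabel each Series by date (time set to midnight) so the labels match
a915_by_day = pd.Series(a915.values, index=a915.index.normalize())
a930_by_day = pd.Series(a930.values, index=a930.index.normalize())

# Aligned subtraction now yields one difference per day
per_day = a915_by_day - a930_by_day
print(per_day)
```

A day that lacks one of the two timestamps simply shows up as NaN in per_day, whereas subtracting the raw .values arrays would fail or silently mispair rows in that case.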