Constructing a column with heterogeneous time zones in a DataFrame and localizing it

Time: 2016-11-09 11:37:01

Tags: datetime pandas dataframe timezone pytz

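Question:
I am trying to create a DateTime column from date, time, and time zone columns. The time zones are heterogeneous. The times in the newly created DateTime column should then be converted to a local time zone. I explain my use case below with sample code.

Use case and sample code:
I have two pandas DataFrames, df1 and df2.
df1 holds the opening/closing times of schools in different countries, in local time, along with their respective time zones.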

>>> df1 = pd.DataFrame({'School': {0: 'ABC', 2: 'GHI', 3: 'JKL', 4: 'MNO'}, 'OpenTime': {0: '08:00:00.000', 2: '10:00:23.563', 3: '09:30:05.908', 4: '07:15:50.100'}, 'CloseTime': {0: '13:00:00.000', 2: '13:30:00.100', 3: '15:00:00.768', 4: '13:00:00.500'}, 'TimeZone':{0:'Europe/Vienna', 2:'Europe/London', 3:'Pacific/Auckland', 4:'Asia/Seoul'}})
>>> df1
      CloseTime      OpenTime School          TimeZone
0  13:00:00.000  08:00:00.000    ABC     Europe/Vienna
2  13:30:00.100  10:00:23.563    GHI     Europe/London
3  15:00:00.768  09:30:05.908    JKL  Pacific/Auckland
4  13:00:00.500  07:15:50.100    MNO        Asia/Seoul

df2 is just an intermediate DataFrame with a bunch of dates.

>>> df2 = pd.DataFrame({'Dates': {0: '2016-11-02', 1: '2015-03-31', 2: '2015-10-30', 3: '2001-09-01'}})
>>> df2
        Dates
0  2016-11-02
1  2015-03-31
2  2015-10-30
3  2001-09-01

df3 is simply the Cartesian product of df1 and df2, as shown below:

>>> df1['key'], df2['key'] =0,0
>>> df3 = df1.merge(df2, how='left', on='key')
>>> df3
       CloseTime      OpenTime School          TimeZone  key       Dates
0   13:00:00.000  08:00:00.000    ABC     Europe/Vienna    0  2016-11-02
1   13:00:00.000  08:00:00.000    ABC     Europe/Vienna    0  2016-05-02
2   13:00:00.000  08:00:00.000    ABC     Europe/Vienna    0  2015-03-31
...
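
For reference, the helper `key` column is one way to get a Cartesian product. In pandas 1.2 and later, the same result can probably be obtained with `how='cross'`. The following is only a sketch on a reduced, made-up subset of the data above, not the asker's exact frames:

```python
import pandas as pd

# Reduced illustrative frames (a subset of the columns in the question)
df1 = pd.DataFrame({'School': ['ABC', 'GHI'],
                    'TimeZone': ['Europe/Vienna', 'Europe/London']})
df2 = pd.DataFrame({'Dates': ['2016-11-02', '2015-03-31', '2015-10-30']})

# how='cross' pairs every row of df1 with every row of df2 (pandas >= 1.2),
# so no constant helper key column is needed
df3 = df1.merge(df2, how='cross')
print(df3.shape)  # (6, 3): 2 schools x 3 dates, 3 columns
```

This avoids leaving an extra `key` column in the merged frame.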

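Now I want to find the best way to perform the following two steps on df3:
1) Convert the OpenTime and CloseTime columns to DateTime using the TimeZone and Dates information.
2) Convert OpenTime and CloseTime to the 'Europe/London' time zone.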

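1 Answer:

Answer 0 (score: 0)

I think you first need to_datetime and then apply with tz_localize to localize each row to its own time zone. Finally, use tz_localize to set the naive times to UTC, then tz_convert to convert them to Europe/London.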

import pandas as pd

df1 = pd.DataFrame({'School': {0: 'ABC', 2: 'GHI', 3: 'JKL', 4: 'MNO'}, 'OpenTime': {0: '08:00:00.000', 2: '10:00:23.563', 3: '09:30:05.908', 4: '07:15:50.100'}, 'CloseTime': {0: '13:00:00.000', 2: '13:30:00.100', 3: '15:00:00.768', 4: '13:00:00.500'}, 'TimeZone':{0:'Europe/Vienna', 2:'Europe/London', 3:'Pacific/Auckland', 4:'Asia/Seoul'}})
df2 = pd.DataFrame({'Dates': {0: '2016-11-02', 1: '2015-03-31', 2: '2015-10-30', 3: '2001-09-01'}})

# Build the Cartesian product of df1 and df2 via a constant helper key
df1['key'], df2['key'] = 0, 0
df3 = df1.merge(df2, how='left', on='key')
print(df3)

# Combine the date and time strings into naive (timezone-unaware) datetimes
df3['Close'] = pd.to_datetime(df3.Dates + ' ' + df3.CloseTime)
df3['Open'] = pd.to_datetime(df3.Dates + ' ' + df3.OpenTime)
# Localize each row in its own time zone (mixed zones give object dtype)
df3['CloseFin'] = df3.apply(lambda x: x.Close.tz_localize(x.TimeZone), axis=1)
df3['OpenFin'] = df3.apply(lambda x: x.Open.tz_localize(x.TimeZone), axis=1)

# Treat the naive times as UTC, then convert to Europe/London
df3['OpenEULondon'] = df3['Open'].dt.tz_localize('UTC').dt.tz_convert('Europe/London')
df3['CloseEULondon'] = df3['Close'].dt.tz_localize('UTC').dt.tz_convert('Europe/London')
print(df3)
...
...
                            CloseFin                           OpenFin  \
0          2016-11-02 13:00:00+01:00         2016-11-02 08:00:00+01:00   
1          2015-03-31 13:00:00+02:00         2015-03-31 08:00:00+02:00   
2          2015-10-30 13:00:00+01:00         2015-10-30 08:00:00+01:00   
3          2001-09-01 13:00:00+02:00         2001-09-01 08:00:00+02:00   
4   2016-11-02 13:30:00.100000+00:00  2016-11-02 10:00:23.563000+00:00   
5   2015-03-31 13:30:00.100000+01:00  2015-03-31 10:00:23.563000+01:00   
6   2015-10-30 13:30:00.100000+00:00  2015-10-30 10:00:23.563000+00:00   
7   2001-09-01 13:30:00.100000+01:00  2001-09-01 10:00:23.563000+01:00   
8   2016-11-02 15:00:00.768000+13:00  2016-11-02 09:30:05.908000+13:00   
9   2015-03-31 15:00:00.768000+13:00  2015-03-31 09:30:05.908000+13:00   
10  2015-10-30 15:00:00.768000+13:00  2015-10-30 09:30:05.908000+13:00   
11  2001-09-01 15:00:00.768000+12:00  2001-09-01 09:30:05.908000+12:00   
12  2016-11-02 13:00:00.500000+09:00  2016-11-02 07:15:50.100000+09:00   
13  2015-03-31 13:00:00.500000+09:00  2015-03-31 07:15:50.100000+09:00   
14  2015-10-30 13:00:00.500000+09:00  2015-10-30 07:15:50.100000+09:00   
15  2001-09-01 13:00:00.500000+09:00  2001-09-01 07:15:50.100000+09:00   

                       OpenEULondon                    CloseEULondon  
0         2016-11-02 08:00:00+00:00        2016-11-02 13:00:00+00:00  
1         2015-03-31 09:00:00+01:00        2015-03-31 14:00:00+01:00  
2         2015-10-30 08:00:00+00:00        2015-10-30 13:00:00+00:00  
3         2001-09-01 09:00:00+01:00        2001-09-01 14:00:00+01:00  
4  2016-11-02 10:00:23.563000+00:00 2016-11-02 13:30:00.100000+00:00  
5  2015-03-31 11:00:23.563000+01:00 2015-03-31 14:30:00.100000+01:00  
6  2015-10-30 10:00:23.563000+00:00 2015-10-30 13:30:00.100000+00:00  
7  2001-09-01 11:00:23.563000+01:00 2001-09-01 14:30:00.100000+01:00  
8  2016-11-02 09:30:05.908000+00:00 2016-11-02 15:00:00.768000+00:00  
9  2015-03-31 10:30:05.908000+01:00 2015-03-31 16:00:00.768000+01:00  
10 2015-10-30 09:30:05.908000+00:00 2015-10-30 15:00:00.768000+00:00  
11 2001-09-01 10:30:05.908000+01:00 2001-09-01 16:00:00.768000+01:00  
12 2016-11-02 07:15:50.100000+00:00 2016-11-02 13:00:00.500000+00:00  
13 2015-03-31 08:15:50.100000+01:00 2015-03-31 14:00:00.500000+01:00  
14 2015-10-30 07:15:50.100000+00:00 2015-10-30 13:00:00.500000+00:00  
15 2001-09-01 08:15:50.100000+01:00 2001-09-01 14:00:00.500000+01:00  
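
Note that OpenEULondon and CloseEULondon above are computed by treating the naive wall-clock times as UTC, so row 8, for example, shows 09:30:05.908 in London for a school that opens at 09:30:05.908 in Auckland. If the intent is instead the same instant expressed in London time (an assumption on my part, not something the answer states), a minimal sketch could look like the following. The column name `OpenLondon` and the two sample rows are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Dates': ['2016-11-02', '2015-03-31'],
    'OpenTime': ['09:30:05.908', '08:00:00.000'],
    'TimeZone': ['Pacific/Auckland', 'Europe/Vienna'],
})

# Naive wall-clock datetimes built from the date and time strings
naive = pd.to_datetime(df['Dates'] + ' ' + df['OpenTime'])

# Localize each timestamp in its own zone (row by row, since zones differ)
localized = [ts.tz_localize(tz) for ts, tz in zip(naive, df['TimeZone'])]

# Normalize the mixed-zone timestamps to UTC, then convert to London time;
# the result has a single tz-aware dtype instead of object
df['OpenLondon'] = pd.to_datetime(localized, utc=True).tz_convert('Europe/London')
print(df['OpenLondon'])
```

With this approach the Auckland opening on 2016-11-02 lands on the evening of 2016-11-01 in London, because Auckland is 13 hours ahead of London on that date.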

The dtypes of the CloseFin and OpenFin columns are object, see:

print (df3.dtypes)
CloseTime                               object
OpenTime                                object
School                                  object
TimeZone                                object
key                                      int64
Dates                                   object
Close                           datetime64[ns]
Open                            datetime64[ns]
CloseFin                                object
OpenFin                                 object
OpenEULondon     datetime64[ns, Europe/London]
CloseEULondon    datetime64[ns, Europe/London]
dtype: object
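
CloseFin and OpenFin are object because each holds timestamps in several different zones, and a pandas datetime column can carry only one time zone. That also means the `.dt` accessor is not available on them. A small sketch of how such a column could be normalized to a single tz-aware dtype with `pd.to_datetime(..., utc=True)`, using made-up sample values rather than the data above:

```python
import pandas as pd

# Two timestamps with the same wall-clock time but different zones
mixed = pd.Series([
    pd.Timestamp('2016-11-02 13:00:00', tz='Europe/Vienna'),
    pd.Timestamp('2016-11-02 13:00:00', tz='Asia/Seoul'),
])
print(mixed.dtype)  # object, because the zones differ

# utc=True converts every value to UTC, giving one tz-aware datetime dtype
as_utc = pd.to_datetime(mixed, utc=True)
print(as_utc.dtype)
print(as_utc.dt.hour.tolist())  # [12, 4]
```

Once the column is in UTC, `.dt.tz_convert(...)` can be applied to the whole column at once.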