Converting timestamps to datetime in Python, Pandas

Date: 2015-12-15 14:17:29

Tags: python datetime pandas timestamp

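I'm trying to build a personal stock screening tool, but when I try to convert a column of timestamps into a readable datetime format I keep getting a "year is out of range" error. I will be iterating this code over thousands of CSVs. In theory I could deal with the date problem later, but the fact that I can't get it to work now is very annoying.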

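The code below is the bulk of the function I'm using. It navigates to the file location, checks whether each file is empty, and then starts processing it.

I'm sure there are more elegant ways to navigate to the directory and pick up the target files, but right now I'm only concerned with the timestamp conversion failing.

I've seen solutions to this problem for the case where the timestamps are in a Series, i.e.: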

dates =['1449866579','1449866580','1449866699'...]

I can't seem to get that solution to work on a DataFrame.

Here is a sample of the CSV file:

1449866579,113.2100,113.2700,113.1600,113.2550,92800
1449866580,113.1312,113.2200,113.0700,113.2200,135800
1449866699,113.1150,113.1500,113.0668,113.1300,106000
1449866700,113.1800,113.2000,113.1200,113.1200,125800
1449866764,113.1200,113.1800,113.0700,113.1490,130900
1449866821,113.0510,113.1223,113.0500,113.1200,110400
1449866884,113.1000,113.1400,113.0100,113.0800,388000
1449866999,113.0900,113.1200,113.0700,113.0900,116700
1449867000,113.2000,113.2100,113.0770,113.1000,191500
1449867119,113.2250,113.2300,113.1400,113.2000,114400
1449867120,113.1300,113.2500,113.1000,113.2300,146700
1449867239,113.1300,113.1800,113.1250,113.1300,108300
1449867299,113.0930,113.1300,113.0700,113.1300,166600
1449867304,113.0850,113.1100,113.0300,113.1000,167000
1449867360,113.0300,113.1100,113.0200,113.0800,204300
1449867479,113.0700,113.0800,113.0200,113.0300,197100
1449867480,113.1600,113.1700,113.0500,113.0700,270200
1449867540,113.1700,113.2900,113.1300,113.1500,3882400
1449867600,113.1800,113.1800,113.1800,113.1800,3500

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime
import time
import os
import sys

def analysis():
    try:
        # training_1d (the data directory) is assumed to be defined elsewhere
        os.chdir(training_1d)
        for i in os.listdir(os.getcwd()):
            if i.endswith('.txt'):
                if os.stat(i).st_size > 0:
                    print(i + " is good for analysis...")
                    try:
                        df = pd.read_csv(i, header=None, names=['date', 'open', 'high', 'low', 'close', 'volume'])
                        print(df.head())
                        print(df.columns)
                        # this is the line that raises "year is out of range"
                        df['date'] = pd.to_datetime(df['date'], unit='s')
                        print(df.head())
                    except Exception as e:
                        print("%s Analysis Failed..." % e)
                else:
                    # the file exists but has zero bytes
                    print(i + " is an empty file")
    except Exception as e:
        # sys.exc_info() gives the traceback of the exception being handled
        print("%s Something went wrong here...check line %d" % (e, sys.exc_info()[2].tb_lineno))

Here is the output with the error:

AAPL.txt is good for analysis...
       date     open     high      low     close  volume
    0  1449865921  113.090  113.180  113.090  113.1601   89300
    1  1449865985  113.080  113.110  113.030  113.0900   73100
    2  1449866041  113.250  113.280  113.050  113.0900  101800
    3  1449866100  113.240  113.305  113.205  113.2400  199900
    4  1449866219  113.255  113.300  113.190  113.2500   96700
    Index([u'date', u'open', u'high', u'low', u'close', u'volume'], dtype='object')

    year is out of range Analysis Failed...

Any help is greatly appreciated. Thanks.

Thanks to EdChum: as noted in the comments, the following replacement fixed the problem.

Replace:

df['date'] = pd.to_datetime(df['date'],unit='s')

With:

df['date'] = pd.to_datetime(df['date'].astype(int), unit='s')
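
To make the replacement concrete, here is a minimal sketch that applies it to the first three rows of the sample CSV above. It assumes the timestamp column arrives as strings (forced here with `dtype={'date': str}` so the situation is reproducible), and the names `SAMPLE` and `COLUMNS` are only illustrative, not part of the original script.

```python
import io

import pandas as pd

# First three rows of the sample CSV from the question.
SAMPLE = """1449866579,113.2100,113.2700,113.1600,113.2550,92800
1449866580,113.1312,113.2200,113.0700,113.2200,135800
1449866699,113.1150,113.1500,113.0668,113.1300,106000
"""

COLUMNS = ['date', 'open', 'high', 'low', 'close', 'volume']

# Assumption: force the date column to strings to mimic the problem case.
df = pd.read_csv(io.StringIO(SAMPLE), header=None, names=COLUMNS,
                 dtype={'date': str})

# Cast the strings to integers first, then interpret them as epoch seconds.
df['date'] = pd.to_datetime(df['date'].astype('int64'), unit='s')

print(df.dtypes)
print(df.head())
```

Using `'int64'` rather than plain `int` is a small precaution so the integer width does not depend on the platform.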

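2 answers:

Answer 0 (score: 3)

It's unclear to me why your date column was parsed as strings, but to create datetimes from epoch times the dtype needs to be int. With that cast in place, your code will work: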

df['date'] = pd.to_datetime(df['date'].astype(int), unit='s')

On your data I get:

In [83]:
pd.to_datetime(df[0], unit='s')

Out[83]:
0    2015-12-11 20:42:59
1    2015-12-11 20:43:00
2    2015-12-11 20:44:59
3    2015-12-11 20:45:00
4    2015-12-11 20:46:04
5    2015-12-11 20:47:01
6    2015-12-11 20:48:04
7    2015-12-11 20:49:59
8    2015-12-11 20:50:00
9    2015-12-11 20:51:59
10   2015-12-11 20:52:00
11   2015-12-11 20:53:59
12   2015-12-11 20:54:59
13   2015-12-11 20:55:04
14   2015-12-11 20:56:00
15   2015-12-11 20:57:59
16   2015-12-11 20:58:00
17   2015-12-11 20:59:00
18   2015-12-11 21:00:00
Name: 0, dtype: datetime64[ns]
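
Since it is unclear why the column was read as strings, one way to investigate is sketched below. It rests on an assumption the question does not confirm: that a stray non-numeric entry somewhere in a file is what forces the column to strings. The helper name `epoch_to_datetime` and the `'n/a'` value are hypothetical and used only for illustration.

```python
import pandas as pd


def epoch_to_datetime(column):
    """Convert epoch seconds (ints or numeric strings) to datetimes.

    Entries that cannot be parsed as numbers end up as NaT instead of
    raising an exception.
    """
    numeric = pd.to_numeric(column, errors='coerce')  # bad entries become NaN
    return pd.to_datetime(numeric, unit='s')          # NaN becomes NaT


# Hypothetical column with one malformed entry.
raw = pd.Series(['1449866579', '1449866580', 'n/a'])
converted = epoch_to_datetime(raw)

# Show which original values could not be converted.
bad_values = raw[converted.isna()]
print(converted)
print(bad_values)
```

Printing `bad_values` for each file would point at the rows worth inspecting by hand.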

Answer 1 (score: -1)

Replace this line:

df['date'] = pd.to_datetime(df['date'],unit='s')

With this:

df['date'] = pd.to_datetime(int(df['date']),unit='s')

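This is meant to convert the epoch timestamps to standard Python timestamps. As written, however, `int(df['date'])` applies the built-in `int()` to a whole column, which raises a TypeError for a Series with more than one row; the element-wise form `df['date'].astype(int)` is what performs the conversion.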
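
The difference between a scalar cast and an element-wise cast can be checked with a short sketch. The three-row DataFrame is illustrative only, built from timestamps in the question's sample, and the flag name `scalar_cast_failed` is an assumption made for the example.

```python
import pandas as pd

df = pd.DataFrame({'date': ['1449866579', '1449866580', '1449866699']})

# The built-in int() expects a single value, not a multi-row Series.
try:
    int(df['date'])
    scalar_cast_failed = False
except TypeError as exc:
    scalar_cast_failed = True
    print("int() on a Series failed:", exc)

# astype() converts every element, so to_datetime receives integers.
df['date'] = pd.to_datetime(df['date'].astype('int64'), unit='s')
print(df)
```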