Splitting a timestamp column into separate date and time columns with Python

Time: 2016-08-02 12:56:54

Tags: python pandas time timestamp

I have a dataset named "df_no_missing".

print(df_no_missing.dtypes)
TIMESTAMP         object
P_ACT_KW         float64
PERIODE_TARIF     object
P_SOUSCR         float64
SITE              object
TARIF             object
depassement      float64
dtype: object

I am trying to extract the date and the time from the TIMESTAMP column into two separate columns, so I did:

dt = datetime.strptime('TIMESTAMP', '%d/%m/%y %H:%M')
df_no_missing['date'] = df_no_missing['TIMESTAMP'].dt.date
df_no_missing['time'] = df_no_missing['TIMESTAMP'].dt.time

But I got an error:

> ValueError                                Traceback (most recent call
> last) <ipython-input-185-6599284ba17f> in <module>()
>       1 print(df_no_missing.dtypes)
>       2 df_no_missing.head()
> ----> 3 dt = datetime.strptime('TIMESTAMP', '%d/%m/%y %H:%M')
>       4 df_no_missing['date'] = df_no_missing['TIMESTAMP'].dt.date
>       5 df_no_missing['time'] = df_no_missing['TIMESTAMP'].dt.time
> 
> C:\Users\Demonstrator\Anaconda3\lib\_strptime.py in
> _strptime_datetime(cls, data_string, format)
>     508     """Return a class cls instance based on the input string and the
>     509     format string."""
> --> 510     tt, fraction = _strptime(data_string, format)
>     511     tzname, gmtoff = tt[-2:]
>     512     args = tt[:6] + (fraction,)
> 
> C:\Users\Demonstrator\Anaconda3\lib\_strptime.py in
> _strptime(data_string, format)
>     341     if not found:
>     342         raise ValueError("time data %r does not match format %r" %
> --> 343                          (data_string, format))
>     344     if len(data_string) != found.end():
>     345         raise ValueError("unconverted data remains: %s" %
> 
> ValueError: time data 'TIMESTAMP' does not match format '%d/%m/%y
> %H:%M'

Here is the CSV file:

TIMESTAMP;P_ACT_KW;PERIODE_TARIF;P_SOUSCR;SITE;TARIF    
31/07/2015 23:00;12;HC;;ST GEREON;TURPE_HTA5
31/07/2015 23:10;466;HC;425;ST GEREON;TURPE_HTA5
31/07/2015 23:20;18;HC;425;ST GEREON;TURPE_HTA5
31/07/2015 23:30;17;HC;425;ST GEREON;TURPE_HTA5
31/07/2015 23:40;13;HC;425;ST GEREON;TURPE_HTA5
31/07/2015 23:50;13;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:00;13;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:10;14;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:20;13;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:30;20;HC;425;ST GEREON;TURPE_HTA5

Any ideas that could help me?

Thanks in advance.

Best regards

2 answers:

Answer 0 (score: 0)

IIUC, you want:

df_no_missing['TIMESTAMP'] = pd.to_datetime(df_no_missing['TIMESTAMP'], format='%d/%m/%Y %H:%M')

After the conversion you can use .dt.date and .dt.time.
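As a minimal sketch of that conversion followed by the split: the small frame below is a hypothetical stand-in for df_no_missing, built from two timestamps taken from the question's CSV, and the format string assumes every row uses a four-digit year as in that sample.

```python
import pandas as pd

# Hypothetical stand-in for df_no_missing (two rows from the question's CSV).
df_no_missing = pd.DataFrame({
    "TIMESTAMP": ["31/07/2015 23:50", "01/08/2015 00:00"],
    "P_ACT_KW": [13.0, 13.0],
})

# Convert the string column to datetime64; %Y matches the four-digit year.
df_no_missing["TIMESTAMP"] = pd.to_datetime(
    df_no_missing["TIMESTAMP"], format="%d/%m/%Y %H:%M"
)

# The .dt accessor only works once the column has a datetime dtype.
df_no_missing["date"] = df_no_missing["TIMESTAMP"].dt.date
df_no_missing["time"] = df_no_missing["TIMESTAMP"].dt.time

print(df_no_missing)
```

The new date and time columns hold plain Python date and time objects, so their dtype is object rather than datetime64.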

You also need to post what your datetime strings look like.

Edit

You can tell read_csv to parse your datetimes when loading:

In [42]:
import pandas as pd
import io
t="""TIMESTAMP;P_ACT_KW;PERIODE_TARIF;P_SOUSCR;SITE;TARIF
31/07/2015 23:00;12;HC;;ST GEREON;TURPE_HTA5
31/07/2015 23:10;466;HC;425;ST GEREON;TURPE_HTA5
31/07/2015 23:20;18;HC;425;ST GEREON;TURPE_HTA5
31/07/2015 23:30;17;HC;425;ST GEREON;TURPE_HTA5
31/07/2015 23:40;13;HC;425;ST GEREON;TURPE_HTA5
31/07/2015 23:50;13;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:00;13;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:10;14;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:20;13;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:30;20;HC;425;ST GEREON;TURPE_HTA5"""
df = pd.read_csv(io.StringIO(t), sep=';', parse_dates=[0], dayfirst=True)
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 6 columns):
TIMESTAMP        10 non-null datetime64[ns]
P_ACT_KW         10 non-null int64
PERIODE_TARIF    10 non-null object
P_SOUSCR         9 non-null float64
SITE             10 non-null object
TARIF            10 non-null object
dtypes: datetime64[ns](1), float64(1), int64(1), object(3)
memory usage: 560.0+ bytes

So in your case:

df = pd.read_csv(your_file, sep=';', parse_dates=[0], dayfirst=True)

should work.
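Putting the load and the split together might look like the sketch below. It reads two rows of the question's CSV from an in-memory string instead of a file; passing dayfirst=True is an assumption added here so that 01/08/2015 is read as 1 August rather than 8 January.

```python
import io
import pandas as pd

# Two rows from the question's CSV, including one where day <= 12.
csv_text = """TIMESTAMP;P_ACT_KW;PERIODE_TARIF;P_SOUSCR;SITE;TARIF
31/07/2015 23:50;13;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:00;13;HC;425;ST GEREON;TURPE_HTA5
"""

# Parse column 0 as datetimes at load time, treating the first number as the day.
df = pd.read_csv(io.StringIO(csv_text), sep=";", parse_dates=[0], dayfirst=True)

# Split the parsed column into separate date and time columns.
df["date"] = df["TIMESTAMP"].dt.date
df["time"] = df["TIMESTAMP"].dt.time

print(df[["TIMESTAMP", "date", "time"]])
```

Checking df["TIMESTAMP"].dt.month on a row such as 01/08/2015 is a quick way to confirm that day and month were not swapped.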

Answer 1 (score: 0)

If you define

dt = datetime.strptime(TIMESTAMP, '%d/%m/%y %H:%M')

then the value of TIMESTAMP must look like

TIMESTAMP = '03/08/16 16:49'

If the format is defined as

dt = datetime.strptime(TIMESTAMP, '%d/%m/%Y %H:%M')

then

TIMESTAMP = '03/08/2016 16:49'

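should be an acceptable argument to strptime. Note that %y expects a two-digit year and %Y a four-digit year; the timestamps in the question's CSV use four digits.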
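The difference between the two directives can be checked with the standard library alone. In the sketch below, try_parse is a hypothetical helper written for this illustration; it returns None whenever strptime raises ValueError.

```python
from datetime import datetime


def try_parse(text, fmt):
    """Return the parsed datetime, or None if text does not match fmt."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None


# %y accepts a two-digit year, %Y a four-digit year.
two_digit = try_parse("03/08/16 16:49", "%d/%m/%y %H:%M")
four_digit = try_parse("03/08/2016 16:49", "%d/%m/%Y %H:%M")

# A four-digit year does not match a %y format.
mismatch = try_parse("03/08/2016 16:49", "%d/%m/%y %H:%M")

# The literal string 'TIMESTAMP' is what the question passed; it matches no date format.
literal = try_parse("TIMESTAMP", "%d/%m/%y %H:%M")

print(two_digit, four_digit, mismatch, literal)
```

The last call reproduces the situation in the question: strptime received the column name as a string rather than a value from the column.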