I have a dataset named "df_no_missing".
df_no_missing.head()
TIMESTAMP         object
P_ACT_KW         float64
PERIODE_TARIF     object
P_SOUSCR         float64
SITE              object
TARIF             object
depassement      float64
dtype: object
I am trying to extract the date and the time from the TIMESTAMP column into two separate columns, so I did:
dt = datetime.strptime('TIMESTAMP', '%d/%m/%y %H:%M')
df_no_missing['date'] = df_no_missing['TIMESTAMP'].dt.date
df_no_missing['time'] = df_no_missing['TIMESTAMP'].dt.time
But I got an error:
> ValueError Traceback (most recent call last)
> <ipython-input-185-6599284ba17f> in <module>()
> 1 print(df_no_missing.dtypes)
> 2 df_no_missing.head()
> ----> 3 dt = datetime.strptime('TIMESTAMP', '%d/%m/%y %H:%M')
> 4 df_no_missing['date'] = df_no_missing['TIMESTAMP'].dt.date
> 5 df_no_missing['time'] = df_no_missing['TIMESTAMP'].dt.time
>
> C:\Users\Demonstrator\Anaconda3\lib\_strptime.py in _strptime_datetime(cls, data_string, format)
> 508 """Return a class cls instance based on the input string and the
> 509 format string."""
> --> 510 tt, fraction = _strptime(data_string, format)
> 511 tzname, gmtoff = tt[-2:]
> 512 args = tt[:6] + (fraction,)
>
> C:\Users\Demonstrator\Anaconda3\lib\_strptime.py in _strptime(data_string, format)
> 341 if not found:
> 342 raise ValueError("time data %r does not match format %r" %
> --> 343 (data_string, format))
> 344 if len(data_string) != found.end():
> 345 raise ValueError("unconverted data remains: %s" %
>
> ValueError: time data 'TIMESTAMP' does not match format '%d/%m/%y %H:%M'
Here is the CSV file:
TIMESTAMP;P_ACT_KW;PERIODE_TARIF;P_SOUSCR;SITE;TARIF
31/07/2015 23:00;12;HC;;ST GEREON;TURPE_HTA5
31/07/2015 23:10;466;HC;425;ST GEREON;TURPE_HTA5
31/07/2015 23:20;18;HC;425;ST GEREON;TURPE_HTA5
31/07/2015 23:30;17;HC;425;ST GEREON;TURPE_HTA5
31/07/2015 23:40;13;HC;425;ST GEREON;TURPE_HTA5
31/07/2015 23:50;13;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:00;13;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:10;14;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:20;13;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:30;20;HC;425;ST GEREON;TURPE_HTA5
Any ideas that could help me?
Thanks in advance.
Best regards
Answer 0 (score: 0)
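IIUC, you want:
df_no_missing['TIMESTAMP'] = pd.to_datetime(df_no_missing['TIMESTAMP'], format='%d/%m/%Y %H:%M')
Once the column is converted, you can call .dt.time and .dt.date on it.
It would also help if you posted what your datetime strings look like.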
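A minimal sketch of the conversion described above, assuming the column holds strings such as `31/07/2015 23:50` (day first, four-digit year, as in the CSV from the question). The two-row frame built here is only a stand-in for the real `df_no_missing`.

```python
import pandas as pd

# Stand-in for df_no_missing: TIMESTAMP is still plain strings (dtype object).
df_no_missing = pd.DataFrame({
    "TIMESTAMP": ["31/07/2015 23:50", "01/08/2015 00:00"],
    "P_ACT_KW": [13.0, 13.0],
})

# Convert the whole column; %Y (not %y) matches the four-digit year.
df_no_missing["TIMESTAMP"] = pd.to_datetime(
    df_no_missing["TIMESTAMP"], format="%d/%m/%Y %H:%M"
)

# The .dt accessor only works once the column has a datetime dtype.
df_no_missing["date"] = df_no_missing["TIMESTAMP"].dt.date
df_no_missing["time"] = df_no_missing["TIMESTAMP"].dt.time

print(df_no_missing)
```

Passing the format explicitly avoids any guessing about whether the day or the month comes first.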
Edit
You can tell read_csv to parse your datetimes at load time:
In [42]:
import pandas as pd
import io
t="""TIMESTAMP;P_ACT_KW;PERIODE_TARIF;P_SOUSCR;SITE;TARIF
31/07/2015 23:00;12;HC;;ST GEREON;TURPE_HTA5
31/07/2015 23:10;466;HC;425;ST GEREON;TURPE_HTA5
31/07/2015 23:20;18;HC;425;ST GEREON;TURPE_HTA5
31/07/2015 23:30;17;HC;425;ST GEREON;TURPE_HTA5
31/07/2015 23:40;13;HC;425;ST GEREON;TURPE_HTA5
31/07/2015 23:50;13;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:00;13;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:10;14;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:20;13;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:30;20;HC;425;ST GEREON;TURPE_HTA5"""
df = pd.read_csv(io.StringIO(t), sep=';', parse_dates=[0], dayfirst=True)
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 6 columns):
TIMESTAMP 10 non-null datetime64[ns]
P_ACT_KW 10 non-null int64
PERIODE_TARIF 10 non-null object
P_SOUSCR 9 non-null float64
SITE 10 non-null object
TARIF 10 non-null object
dtypes: datetime64[ns](1), float64(1), int64(1), object(3)
memory usage: 560.0+ bytes
So in your case:
df = pd.read_csv(your_file, sep=';', parse_dates=[0], dayfirst=True)
should just work.
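As a hedged check of the load-time approach, the sketch below reads two rows from the question and confirms that `01/08/2015` comes back as 1 August rather than 8 January. It passes `dayfirst=True` so the day/month order is explicit; `csv_text` is just an illustrative in-memory stand-in for the real file.

```python
import io
import pandas as pd

# Two rows copied from the question's CSV, straddling the month boundary.
csv_text = """TIMESTAMP;P_ACT_KW;PERIODE_TARIF;P_SOUSCR;SITE;TARIF
31/07/2015 23:50;13;HC;425;ST GEREON;TURPE_HTA5
01/08/2015 00:00;13;HC;425;ST GEREON;TURPE_HTA5"""

df = pd.read_csv(
    io.StringIO(csv_text),
    sep=";",
    parse_dates=["TIMESTAMP"],  # parse this column while loading
    dayfirst=True,              # treat 01/08/2015 as 1 August
)

# The column is already datetime, so the date and time can be split directly.
df["date"] = df["TIMESTAMP"].dt.date
df["time"] = df["TIMESTAMP"].dt.time

print(df[["TIMESTAMP", "date", "time"]])
```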
Answer 1 (score: 0)
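If you define
dt = datetime.strptime(TIMESTAMP, '%d/%m/%y %H:%M')
then the value of TIMESTAMP must look like
TIMESTAMP = '03/08/16 16:49'
because %y matches a two-digit year. If the format is defined as
dt = datetime.strptime(TIMESTAMP, '%d/%m/%Y %H:%M')
then
TIMESTAMP = '03/08/2016 16:49'
should be an acceptable argument to strptime.
Note that TIMESTAMP here is a variable holding the string, not the literal 'TIMESTAMP' that your code passes.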
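The difference between the two directives can be sketched with the standard library alone. The variable names below (`value`, `parsed`, `two_digit_ok`) are illustrative and not part of the original code.

```python
from datetime import datetime

value = "03/08/2016 16:49"

# %Y expects a four-digit year, so this parses.
parsed = datetime.strptime(value, "%d/%m/%Y %H:%M")
print(parsed.date(), parsed.time())

# %y expects a two-digit year, so the four-digit "2016" does not match
# the format and strptime raises ValueError.
try:
    datetime.strptime(value, "%d/%m/%y %H:%M")
    two_digit_ok = True
except ValueError as exc:
    two_digit_ok = False
    print("rejected:", exc)
```

Note that strptime handles one string at a time; for a whole DataFrame column, a vectorised conversion such as pd.to_datetime is usually the more convenient choice.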