How to parse datetimes

Time: 2020-08-07 19:20:45

Tags: python pandas datetime parsing

I am trying to parse datetimes. The dataset looks like this:

Data

   Date                     sell_B              buy_B
0  2016-01-03 22:00:01.446  1.0873              1.0875
1  2016-01-03 22:00:01.799  1.08714             1.08748
2  2016-01-03 22:00:01.981  1.08702             1.08748
3  2016-01-03 22:00:04.548  1.0870600000000001  1.0875
4  2016-01-03 22:00:07.478  1.08705             1.08749
5  2016-01-03 22:00:30.293  1.08704             1.08748
6  2016-01-03 22:00:34.876  1.08704             1.0874700000000002
7  2016-01-03 22:00:41.479  1.08714             1.0874700000000002
8  2016-01-03 22:00:44.739  1.08714             1.08746
9  2016-01-03 22:00:44.789  1.08704             1.08746

The full code is shown below:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import glob
test = pd.read_csv("D:\DAT_ASCII_EURUSD_T_201612.csv", header=None, names=['Date', 'sell_A', 'buy_A', 'unknonwn'])
test.head()
pd.to_datetime(test.Date, format='%yy%dd%mm %HH%mm%SS%fff')

The error I get is shown below.

Error

---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
<ipython-input-38-2251cf4871d4> in <module>
----> 1 pd.to_datetime(test.Date, format='%yy%dd%mm %HH%mm%SS%fff')

F:\anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, format, exact, unit, infer_datetime_format, origin, cache)
    726             result = arg.map(cache_array)
    727         else:
--> 728             values = convert_listlike(arg._values, format)
    729             result = arg._constructor(values, index=arg.index, name=arg.name)
    730     elif isinstance(arg, (ABCDataFrame, abc.MutableMapping)):

F:\anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in _convert_listlike_datetimes(arg, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
    398                 try:
    399                     result, timezones = array_strptime(
--> 400                         arg, format, exact=exact, errors=errors
    401                     )
    402                     if "%Z" in format or "%z" in format:

pandas\_libs\tslibs\strptime.pyx in pandas._libs.tslibs.strptime.array_strptime()

pandas\_libs\tslibs\strptime.pyx in pandas._libs.tslibs.strptime.array_strptime()

pandas\_libs\tslibs\strptime.pyx in pandas._libs.tslibs.strptime.TimeRE.compile()

F:\anaconda3\lib\re.py in compile(pattern, flags)
    232 def compile(pattern, flags=0):
    233     "Compile a regular expression pattern, returning a Pattern object."
--> 234     return _compile(pattern, flags)
    235 
    236 def purge():

F:\anaconda3\lib\re.py in _compile(pattern, flags)
    284     if not sre_compile.isstring(pattern):
    285         raise TypeError("first argument must be string or compiled pattern")
--> 286     p = sre_compile.compile(pattern, flags)
    287     if not (flags & DEBUG):
    288         if len(_cache) >= _MAXCACHE:

F:\anaconda3\lib\sre_compile.py in compile(p, flags)
    762     if isstring(p):
    763         pattern = p
--> 764         p = sre_parse.parse(p, flags)
    765     else:
    766         pattern = None

F:\anaconda3\lib\sre_parse.py in parse(str, flags, pattern)
    922 
    923     try:
--> 924         p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
    925     except Verbose:
    926         # the VERBOSE flag was switched on inside the pattern.  to be

F:\anaconda3\lib\sre_parse.py in _parse_sub(source, state, verbose, nested)
    418     while True:
    419         itemsappend(_parse(source, state, verbose, nested + 1,
--> 420                            not nested and not items))
    421         if not sourcematch("|"):
    422             break

F:\anaconda3\lib\sre_parse.py in _parse(source, state, verbose, nested, first)
    805                     group = state.opengroup(name)
    806                 except error as err:
--> 807                     raise source.error(err.msg, len(name) + 1) from None
    808             sub_verbose = ((verbose or (add_flags & SRE_FLAG_VERBOSE)) and
    809                            not (del_flags & SRE_FLAG_VERBOSE))

error: redefinition of group name 'm' as group 5; was group 3 at position 113

Question

How can I sort this out?

2 answers:

Answer 0: (score: 1)

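Use the datetime format %Y-%m-%d %H:%M:%S.%f (the sample values are year-month-day). Each directive is a single letter after the percent sign, and %m is the month while %M is the minute. Alternatively, you can pass the parse_dates parameter to read_csv:

In [6]: import pandas as pd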

In [7]: df = pd.read_csv("a.csv", parse_dates=["Date"])

In [8]: df.dtypes
Out[8]:
Date      datetime64[ns]
sell_B           float64
buy_B            float64
dtype: object
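
For context on the traceback in the question: the parser compiles each directive into a named regular-expression group keyed by its letter, so a format that contains %m twice produces the "redefinition of group name 'm'" error. The following is a minimal sketch of the explicit-format route, assuming the Date strings are year-month-day as in the sample. The in-memory csv_text is a hypothetical stand-in for the real file, not the asker's data.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV in the question (three sample rows).
csv_text = """Date,sell_B,buy_B
2016-01-03 22:00:01.446,1.0873,1.0875
2016-01-03 22:00:01.799,1.08714,1.08748
2016-01-03 22:00:44.789,1.08704,1.08746
"""
df = pd.read_csv(io.StringIO(csv_text))

# One letter per directive, each used once:
# %Y year, %m month, %d day, %H hour, %M minute, %S second, %f fraction.
df["Date"] = pd.to_datetime(df["Date"], format="%Y-%m-%d %H:%M:%S.%f")

print(df["Date"].dtype)
print(df["Date"].iloc[0])
```

Passing an explicit format also makes the intended field order visible in the code, instead of relying on pandas to guess it.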

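Answer 1: (score: 0)

Either of these approaches should work:

from datetime import datetime

# The sample values are year-month-day, so %m comes before %d.
df["Date"] = df["Date"].apply(lambda x: datetime.strptime(x, '%Y-%m-%d %H:%M:%S.%f'))
# Vectorized alternative:
#df["Date"] = pd.to_datetime(df["Date"], format='%Y-%m-%d %H:%M:%S.%f')

print(df["Date"].dtype)
#datetime64[ns]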
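
As a rough check, assuming the sample values are year-month-day, the sketch below parses two of the sample strings with both approaches and compares the results. It also parses the first string with the day and month directives swapped to show how the field order changes the outcome. The names raw, fmt, row_wise, vectorized and swapped are illustrative only.

```python
from datetime import datetime

import pandas as pd

raw = pd.Series(["2016-01-03 22:00:01.446", "2016-01-03 22:00:04.548"], name="Date")
fmt = "%Y-%m-%d %H:%M:%S.%f"  # assumed field order: year-month-day

# Row-by-row parsing with the standard library.
row_wise = raw.apply(lambda x: datetime.strptime(x, fmt))

# Vectorized parsing with pandas.
vectorized = pd.to_datetime(raw, format=fmt)

# Both routes should yield the same timestamps.
same = bool((row_wise == vectorized).all())

# Swapping %d and %m reads "01" as the day and "03" as the month.
swapped = datetime.strptime(raw.iloc[0], "%Y-%d-%m %H:%M:%S.%f")

print(same, swapped.date())
```

With the swapped directives the first value is read as 1 March 2016 rather than 3 January 2016, and any row whose second field exceeds 12 would fail to parse, so it is worth confirming the field order against the data before choosing a format.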