我正在尝试解析日期时间,数据集如下所示;
Date sell_B buy_B
0 2016-01-03 22:00:01.446 1.0873 1.0875
1 2016-01-03 22:00:01.799 1.08714 1.08748
2 2016-01-03 22:00:01.981 1.08702 1.08748
3 2016-01-03 22:00:04.548 1.0870600000000001 1.0875
4 2016-01-03 22:00:07.478 1.08705 1.08749
5 2016-01-03 22:00:30.293 1.08704 1.08748
6 2016-01-03 22:00:34.876 1.08704 1.0874700000000002
7 2016-01-03 22:00:41.479 1.08714 1.0874700000000002
8 2016-01-03 22:00:44.739 1.08714 1.08746
9 2016-01-03 22:00:44.789 1.08704 1.08746
所有代码如下所示;
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import glob
test = pd.read_csv("D:\DAT_ASCII_EURUSD_T_201612.csv", header=None, names=['Date', 'sell_A', 'buy_A', 'unknonwn'])
test.head()
pd.to_datetime(test.Date, format='%yy%dd%mm %HH%mm%SS%fff')
下面也显示了我得到的错误
---------------------------------------------------------------------------
error Traceback (most recent call last)
<ipython-input-38-2251cf4871d4> in <module>
----> 1 pd.to_datetime(test.Date, format='%yy%dd%mm %HH%mm%SS%fff')
F:\anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, format, exact, unit, infer_datetime_format, origin, cache)
726 result = arg.map(cache_array)
727 else:
--> 728 values = convert_listlike(arg._values, format)
729 result = arg._constructor(values, index=arg.index, name=arg.name)
730 elif isinstance(arg, (ABCDataFrame, abc.MutableMapping)):
F:\anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in _convert_listlike_datetimes(arg, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
398 try:
399 result, timezones = array_strptime(
--> 400 arg, format, exact=exact, errors=errors
401 )
402 if "%Z" in format or "%z" in format:
pandas\_libs\tslibs\strptime.pyx in pandas._libs.tslibs.strptime.array_strptime()
pandas\_libs\tslibs\strptime.pyx in pandas._libs.tslibs.strptime.array_strptime()
pandas\_libs\tslibs\strptime.pyx in pandas._libs.tslibs.strptime.TimeRE.compile()
F:\anaconda3\lib\re.py in compile(pattern, flags)
232 def compile(pattern, flags=0):
233 "Compile a regular expression pattern, returning a Pattern object."
--> 234 return _compile(pattern, flags)
235
236 def purge():
F:\anaconda3\lib\re.py in _compile(pattern, flags)
284 if not sre_compile.isstring(pattern):
285 raise TypeError("first argument must be string or compiled pattern")
--> 286 p = sre_compile.compile(pattern, flags)
287 if not (flags & DEBUG):
288 if len(_cache) >= _MAXCACHE:
F:\anaconda3\lib\sre_compile.py in compile(p, flags)
762 if isstring(p):
763 pattern = p
--> 764 p = sre_parse.parse(p, flags)
765 else:
766 pattern = None
F:\anaconda3\lib\sre_parse.py in parse(str, flags, pattern)
922
923 try:
--> 924 p = _parse_sub(source, pattern, flags & SRE_FLAG_VERBOSE, 0)
925 except Verbose:
926 # the VERBOSE flag was switched on inside the pattern. to be
F:\anaconda3\lib\sre_parse.py in _parse_sub(source, state, verbose, nested)
418 while True:
419 itemsappend(_parse(source, state, verbose, nested + 1,
--> 420 not nested and not items))
421 if not sourcematch("|"):
422 break
F:\anaconda3\lib\sre_parse.py in _parse(source, state, verbose, nested, first)
805 group = state.opengroup(name)
806 except error as err:
--> 807 raise source.error(err.msg, len(name) + 1) from None
808 sub_verbose = ((verbose or (add_flags & SRE_FLAG_VERBOSE)) and
809 not (del_flags & SRE_FLAG_VERBOSE))
error: redefinition of group name 'm' as group 5; was group 3 at position 113
我该如何排序?
答案 0 :(得分:1)
使用日期时间格式-%Y-%d-%m %H:%M:%S.%f
。另外,您可以在parse_dates
read_csv
参数
In [6]: import pandas as pd
In [7]: df = pd.read_csv("a.csv", parse_dates=["Date"])
In [8]: df.dtypes
Out[8]:
Date datetime64[ns]
sell_B float64
buy_B float64
dtype: object
答案 1 :(得分:0)
这两种方法都应该起作用:
df["Date"] = df["Date"].apply(lambda x: datetime.strptime(x, '%Y-%d-%m %H:%M:%S.%f'))
#df["Date"] = pd.to_datetime(df["Date"], format='%Y-%d-%m %H:%M:%S.%f')
print(df["Date"].dtype)
#datetime64[ns]