Converting "Date" data to the datetime64[ns] type

Time: 2021-07-24 13:08:34

Tags: python pandas datetime

Context:
I want to convert "Date" to float(), which is a requirement for training with this dataset.

Question:
Can Python convert the "Date" data to a datetime type?

Goal:
Convert "Jul 24, 2021" ---> "07/24/2021"?


Dataset: BTC historical data

            Date        Close        High         Low        Open                 Volume (24H)   Market Cap
490 Dec 14, 2019    $7,091.76   $7,340.28   $7,040.29   $7,279.04   $17,075,801,948 69,010 BTC  $129,002,951,070
491 Dec 13, 2019    $7,279.04   $7,354.13   $7,192.74   $7,213.44   $16,667,772,107 71,176 BTC  $131,468,549,582
492 Dec 12, 2019    $7,214.58   $7,352.19   $7,127.09   $7,230.50   $18,895,200,531 102,171 BTC $131,200,636,979
493 Dec 11, 2019    $7,230.50   $7,312.27   $7,169.96   $7,242.22   $16,323,246,786 80,414 BTC  $130,567,148,332
494 Dec 10, 2019    $7,242.22   $7,409.36   $7,172.39   $7,362.61   $18,215,577,663 106,404 BTC $131,626,188,206
495 Dec 09, 2019    $7,362.61   $7,656.77   $7,309.09   $7,534.30   $17,847,629,948 122,066 BTC $133,889,762,913
496 Dec 08, 2019    $7,534.30   $7,702.15   $7,394.45   $7,510.99   $15,315,140,388 72,921 BTC  $136,960,305,336
497 Dec 07, 2019    $7,510.99   $7,699.64   $7,489.03   $7,549.93   $15,502,310,183 81,337 BTC  $136,521,384,515
498 Dec 06, 2019    $7,549.93   $7,615.61   $7,330.45   $7,400.13   $17,845,739,598 124,357 BTC $136,292,864,233
499 Dec 05, 2019    $7,400.13   $7,492.44   $7,175.62   $7,206.09   $18,880,551,089 154,696 BTC $134,769,681,329

Here is the code (more context):

My goal is to clean the data so that it meets the requirements of float().

Requirements of float():

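  • Values must not contain spaces
  • Values must not contain commas
  • Values must not contain non-numeric text unless it is a special string (e.g. "inf" is a special string, but "fd" is not)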
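The rules listed above can be checked directly against Python's built-in float(). The following is a minimal sketch; the helper name is_float_parsable and the sample strings are made up for illustration. It suggests that whitespace around a number is tolerated, while a space inside the number, a comma, or a "$" is rejected.

```python
def is_float_parsable(text):
    """Return True if float(text) succeeds, False if it raises ValueError."""
    try:
        float(text)
        return True
    except ValueError:
        return False


# Hypothetical sample strings, loosely modelled on the price columns.
samples = ["7091.76", " 7091.76 ", "7 091.76", "7,091.76", "$7091.76", "inf", "nan", "fd"]

results = {text: is_float_parsable(text) for text in samples}
for text, ok in results.items():
    print(repr(text), ok)
```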

Therefore, I removed the "$" and "," symbols from the dataset.

df = df.replace({r'\$': ''}, regex=True)  # raw string: match a literal "$"
df = df.replace({',': ''}, regex=True)    # a comma needs no escaping
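One thing to note about the two replace calls above: they run on every column, so the comma in the "Date" strings is removed as well ("Jul 24, 2021" becomes "Jul 24 2021", which is the value shown in the traceback further down). A possible alternative, sketched here on a made-up two-row sample with only some of the columns, is to clean just the price columns and leave "Date" untouched. The list price_cols is an assumption for illustration.

```python
import pandas as pd

# Small made-up sample in the same shape as the dataset above.
df = pd.DataFrame({
    "Date": ["Dec 14, 2019", "Dec 13, 2019"],
    "Close": ["$7,091.76", "$7,279.04"],
    "Open": ["$7,279.04", "$7,213.44"],
})

# Assumed list of columns that hold prices; adjust to the real dataset.
price_cols = ["Close", "Open"]

for col in price_cols:
    # Strip "$" and "," in one pass, then convert the cleaned text to float.
    df[col] = df[col].str.replace(r"[$,]", "", regex=True).astype(float)

print(df.dtypes)
print(df)
```

With this approach the "Date" column keeps its original "Dec 14, 2019" form, so it can still be parsed with a format that includes the comma.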

I then tried to convert the "Open" column to float and the "Date" column to datetime:

df = df.astype({"Open": float})
df["Date"] = pd.to_datetime(df.Date, format="%m/%d/%Y")
df.dtypes

Error! :(

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pandas/core/tools/datetimes.py in _convert_listlike_datetimes(arg, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
    449             try:
--> 450                 values, tz = conversion.datetime_to_datetime64(arg)
    451                 dta = DatetimeArray(values, dtype=tz_to_dtype(tz))

pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.datetime_to_datetime64()

TypeError: Unrecognized value type: <class 'str'>

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
3 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/tools/datetimes.py in _convert_listlike_datetimes(arg, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
    416                 try:
    417                     result, timezones = array_strptime(
--> 418                         arg, format, exact=exact, errors=errors
    419                     )
    420                     if "%Z" in format or "%z" in format:

pandas/_libs/tslibs/strptime.pyx in pandas._libs.tslibs.strptime.array_strptime()

ValueError: time data 'Jul 24 2021' does not match format '%m/%d/%Y' (match)

1 Answer:

Answer 0 (score: 1)

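You specified the wrong format in pd.to_datetime. '%m/%d/%Y' describes strings such as 07/24/2021, but the column holds abbreviated month names such as Dec 14, 2019, so the month placeholder should be %b: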

df['Date'] = pd.to_datetime(df['Date'], format='%b %d, %Y')
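A short sketch of how the format string has to match the text exactly, using two made-up Series. If the commas were already stripped from every column, as the cleaning code in the question does, the strings look like "Jul 24 2021" and the comma has to be dropped from the format as well. This assumes an English locale, since %b matches locale-dependent month abbreviations.

```python
import pandas as pd

# Dates as they appear in the raw dataset (comma still present).
original = pd.Series(["Dec 14, 2019", "Jul 24, 2021"])

# Dates as they look after commas were removed from the whole DataFrame.
stripped = original.str.replace(",", "", regex=False)

# The format must mirror the text, including or excluding the comma.
parsed_original = pd.to_datetime(original, format="%b %d, %Y")
parsed_stripped = pd.to_datetime(stripped, format="%b %d %Y")

print(parsed_original)
print((parsed_original == parsed_stripped).all())
```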

https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior

After that, use dt.strftime to get whatever format you want. The placeholders in the link above apply there as well.
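As a rough illustration of that last step, the sketch below parses two made-up dates and formats them as MM/DD/YYYY. Note that dt.strftime returns strings, not datetimes. Since the question mentions needing a float for training, the last lines show one possible numeric encoding (the proleptic Gregorian ordinal of each day); that choice is an assumption for illustration, not something the answer prescribes.

```python
import pandas as pd

# Made-up sample in the dataset's original date format.
raw = pd.Series(["Jul 24, 2021", "Dec 14, 2019"])
dates = pd.to_datetime(raw, format="%b %d, %Y")

# Format for display; the result is a Series of strings.
formatted = dates.dt.strftime("%m/%d/%Y")
print(formatted.tolist())  # ['07/24/2021', '12/14/2019']

# One possible float encoding for training (an assumption, not the only option):
# the day number counted from 0001-01-01.
ordinals = dates.map(lambda ts: ts.toordinal()).astype(float)
print(ordinals.tolist())
```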