Question

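I have a dataset with createdOnTimeZone and startDate columns. The time zone looks like -600, while startDate looks like 2019-01-28T19:50:27.345-06:00. I want to apply the time zone to startDate in every row. I know I need to split startDate on '.' (I don't need the milliseconds; seconds are enough), convert it to a datetime with strptime, and then convert the datetime to a timestamp with mktime. But I don't know how to apply this to every row of the startDate column.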

createdOnTimeZone startDate
-600              2019-01-28T19:50:27.345-06:00
-600              2019-01-28T19:50:35.493-06:00
-600              2019-01-28T19:50:38.947-06:00
-600              2019-01-28T19:50:49.048-06:00
-600              2019-01-28T19:50:59.600-06:00
-600              2019-01-28T19:51:08.267-06:00
-600              2019-01-28T19:51:15.899-06:00
-600              2019-01-28T19:51:27.326-06:00
-600              2019-01-28T19:51:38.762-06:00

Answer 1

Try this:

df['startDate'] = df['startDate'].apply(lambda x : x.split('.')[0])

This splits each string on '.' and keeps the part before it.

For example:

2019-01-28T19:50:27.345-06:00 becomes 2019-01-28T19:50:27
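
The same split can also be written with the pandas string accessor instead of apply with a lambda. The sketch below assumes a small stand-in DataFrame built from two rows of the question's sample data; the variable name trimmed is only illustrative.

```python
import pandas as pd

# Stand-in for the question's data (two of the sample rows).
df = pd.DataFrame({
    "startDate": [
        "2019-01-28T19:50:27.345-06:00",
        "2019-01-28T19:50:35.493-06:00",
    ]
})

# .str.split('.') splits every string in the column;
# .str[0] keeps the part before the first '.'.
trimmed = df["startDate"].str.split(".").str[0]
print(trimmed.tolist())
# ['2019-01-28T19:50:27', '2019-01-28T19:50:35']
```

Note that cutting at '.' drops the -06:00 offset along with the milliseconds, so the result is a local wall-clock time with no time zone attached.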

To get the timestamp:

import time
import datetime

df['startDate'] = df['startDate'].apply(lambda x: time.mktime(datetime.datetime.strptime(x.split('.')[0], "%Y-%m-%dT%H:%M:%S").timetuple()))
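
One caveat with the line above: time.mktime interprets the parsed time in the local time zone of the machine running the code, and the -06:00 offset has already been cut off by the split. As a possible alternative (not the answerer's method), pd.to_datetime can parse the whole column including the offset. This is a minimal sketch on a stand-in DataFrame; the startTimestamp column name is an assumption made for illustration.

```python
import pandas as pd

# Stand-in for the question's data (two of the sample rows).
df = pd.DataFrame({
    "createdOnTimeZone": [-600, -600],
    "startDate": [
        "2019-01-28T19:50:27.345-06:00",
        "2019-01-28T19:50:35.493-06:00",
    ],
})

# Parse the ISO 8601 strings; utc=True converts the -06:00 offset to UTC.
parsed = pd.to_datetime(df["startDate"], utc=True)

# Whole seconds since the Unix epoch; floor division drops the milliseconds.
epoch = pd.Timestamp("1970-01-01", tz="UTC")
df["startTimestamp"] = (parsed - epoch) // pd.Timedelta(seconds=1)

print(df["startTimestamp"].tolist())
# [1548726627, 1548726635]
```

Because the offset is read from each string, the result does not depend on where the code runs.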

Answer 2

Try this:

from datetime import datetime

a = "2019-01-28T19:50:27.345-06:00"

# Splitting the string and grabbing the zeroth index, which is the date.
date = a.split("T")[0]

# Similarly splitting and grabbing the time, removing the milliseconds.
time = a.split("T")[1].split(".")[0]

# Converted it to my desired datetime format and extracted timestamp out of it.
date_time = date + " " + time
time_stamp = datetime.strptime(date_time, "%Y-%m-%d %H:%M:%S").timestamp()

print(type(date_time), date_time)
print(type(time_stamp), time_stamp)

Output:

<class 'str'> 2019-01-28 19:50:27
<class 'float'> 1548726627.0
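
The float shown above comes from calling timestamp() on a naive datetime, which Python interprets in the machine's local time zone, so the value will differ on machines that are not at UTC-6. A minimal sketch of a variant that keeps the offset from the string, assuming Python 3.7 or later for datetime.fromisoformat; the helper name to_epoch_seconds is hypothetical.

```python
from datetime import datetime

def to_epoch_seconds(iso_string):
    """Parse an ISO 8601 string with a UTC offset; return whole epoch seconds."""
    # fromisoformat keeps the -06:00 offset, so the result is time zone aware.
    aware = datetime.fromisoformat(iso_string)
    # int() truncates the fractional (millisecond) part.
    return int(aware.timestamp())

print(to_epoch_seconds("2019-01-28T19:50:27.345-06:00"))
# 1548726627
```

To run it over the whole column, something like df['startDate'].apply(to_epoch_seconds) should work, assuming every value in the column has the same format as the sample rows.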

How do I split all strings in a column of a pandas DataFrame?
