I currently have a DataFrame that looks like this:
Day_Part   Start_Time  End_Time
Quarter 1  2014, 1, 1  2015, 3, 1
Quarter 2  2014, 3, 3  2014, 7, 3
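The columns "Start_Time" and "End_Time" are pandas Series. I want to convert the data type of both columns to datetime.
I need both columns to have a datetime dtype because a later block of code checks whether a value falls between two given dates and, if so, labels it as Quarter 1.
Any help is greatly appreciated.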
Answer 0 (score: 0)
You can use to_datetime() with a format string to extract the date:
date = pd.to_datetime(df.Start_Time, format='%Y, %m, %d').dt.date
You can also modify the columns in place:
df[['Start_Time', 'End_Time']] = df[['Start_Time', 'End_Time']].apply(
    lambda x: pd.to_datetime(x, format='%Y, %m, %d').dt.date)
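Note that .dt.date yields Python datetime.date objects, so the converted columns end up with object dtype rather than a datetime64 dtype. If a true datetime dtype is the goal, as the question asks, one option is simply to drop .dt.date. A minimal sketch, assuming the same column names and string format as above (the sample frame is rebuilt here from the values in the question):

```python
import pandas as pd

# Sample frame mirroring the question's data (values copied from the post).
df = pd.DataFrame({
    "Day_Part": ["Quarter 1", "Quarter 2"],
    "Start_Time": ["2014, 1, 1", "2014, 3, 3"],
    "End_Time": ["2015, 3, 1", "2014, 7, 3"],
})

cols = ["Start_Time", "End_Time"]
# Without .dt.date, each column stays a datetime64 Series.
df[cols] = df[cols].apply(lambda s: pd.to_datetime(s, format="%Y, %m, %d"))

# Inspect the resulting dtypes.
print(df.dtypes)
```

Keeping the datetime64 dtype means the .dt accessor and vectorized comparisons remain available on the column itself.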
Alternatively, you can convert to dates when reading the CSV:
to_date = lambda x: pd.to_datetime(x, format='%Y, %m, %d').date()
converters = dict(Start_Time=to_date, End_Time=to_date)
df = pd.read_csv(StringIO(data), converters=converters)
A testable example:
import pandas as pd
from io import StringIO
data = u"""
Day_Part,Start_Time,End_Time
"Quarter 1","2014, 1, 1","2015, 3, 1"
"Quarter 2","2014, 3, 3","2014, 7, 3"
"""
df = pd.read_csv(StringIO(data))
# You can use `to_datetime()` with a format string to extract the date:
date = pd.to_datetime(df.Start_Time, format='%Y, %m, %d').dt.date
# The start month in the second row is 3
assert date[1].month == 3
# You can also modify in place
df[['Start_Time', 'End_Time']] = df[['Start_Time', 'End_Time']].apply(
    lambda x: pd.to_datetime(x, format='%Y, %m, %d').dt.date)
# The end month in the second row is 7
assert df.End_Time[1].month == 7
# You can convert to date when reading the csv
to_date = lambda x: pd.to_datetime(x, format='%Y, %m, %d').date()
converters = dict(Start_Time=to_date, End_Time=to_date)
df = pd.read_csv(StringIO(data), converters=converters)
# The end month in the first row is 3
assert df.End_Time[0].month == 3
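For the follow-up step described in the question (labeling a value by the range it falls in), here is a rough sketch rather than a definitive implementation. It rests on several assumptions: the events frame and its When column are hypothetical and not part of the original post, the ranges are treated as inclusive on both ends, and because the two ranges in the question overlap, the first matching range wins.

```python
import pandas as pd

# Range boundaries, copied from the question's table.
quarters = pd.DataFrame({
    "Day_Part": ["Quarter 1", "Quarter 2"],
    "Start_Time": ["2014, 1, 1", "2014, 3, 3"],
    "End_Time": ["2015, 3, 1", "2014, 7, 3"],
})
for col in ["Start_Time", "End_Time"]:
    quarters[col] = pd.to_datetime(quarters[col], format="%Y, %m, %d")

# Hypothetical data to label; not part of the original post.
events = pd.DataFrame({"When": pd.to_datetime(["2014-02-10", "2016-01-01"])})
events["Day_Part"] = None

# Walk the ranges in order; rows that already have a label are left alone.
for row in quarters.itertuples(index=False):
    in_range = events["When"].between(row.Start_Time, row.End_Time)
    unlabeled = events["Day_Part"].isna()
    events.loc[in_range & unlabeled, "Day_Part"] = row.Day_Part

print(events)
```

Series.between compares the whole column at once, which only works cleanly when the column holds real datetime values rather than the original strings.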