Suppose I have this DataFrame.
import pandas as pd
data = {"Date": ["2018-08-05", "2018-08-05", "2018-08-05", "2018-08-05", "2018-08-06"],
"Time_End":["2018-08-05 13:50:00", "2018-08-05 14:26:00", "2018-08-05 17:30:00", "2018-08-05 17:10:00", "2018-08-06 11:23:00"],
"Reason":["blah1", "blah2", "blah3", "blah4", "blah5"]
}
df = pd.DataFrame.from_dict(data)
df
         Date             Time_End  Reason
0  2018-08-05  2018-08-05 13:50:00   blah1
1  2018-08-05  2018-08-05 14:26:00   blah2
2  2018-08-05  2018-08-05 17:30:00   blah3
3  2018-08-05  2018-08-05 17:10:00   blah4
4  2018-08-06  2018-08-06 11:23:00   blah5
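I want to extract just the date from "Time_End" into a new column named "Birth_date". However, I also want to check whether the time is past 17:00. If it is, the extracted date should be incremented by one so that it becomes the next day. The desired output is shown below.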
         Date  Birth_date             Time_End  Reason
0  2018-08-05  2018-08-05  2018-08-05 13:50:00   blah1
1  2018-08-05  2018-08-05  2018-08-05 14:26:00   blah2
2  2018-08-05  2018-08-06  2018-08-05 17:30:00   blah3
3  2018-08-05  2018-08-06  2018-08-05 17:10:00   blah4
4  2018-08-06  2018-08-06  2018-08-06 11:23:00   blah5
I came up with this, but it does not do what I expected.
df["after_17"] = df["Time_End"].dt.hour > 17
df["birth_date"] = df["after_17"].map(lambda x: df["Time_End"].dt.date if x else df["Time_End"].dt.date + pd.DateOffset(1))
It concatenates the output and puts everything into a single row. How can I make this work? I am also open to other solutions.
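Answer 0 (score: 4)
Use timedelta from the datetime library to add 7 hours to Time_End, then use dt.date to extract only the date part. Any time at or after 17:00 rolls past midnight once 7 hours are added, so its date becomes the next day.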
import pandas as pd
from datetime import timedelta
data = {"Date": ["2018-08-05", "2018-08-05", "2018-08-05", "2018-08-05", "2018-08-06"],
"Time_End":["2018-08-05 13:50:00", "2018-08-05 14:26:00", "2018-08-05 17:30:00", "2018-08-05 17:10:00", "2018-08-06 11:23:00"],
"Reason":["blah1", "blah2", "blah3", "blah4", "blah5"]
}
# pandas 2.x rejects the unitless 'datetime64' dtype, so spell out the unit
df = pd.DataFrame.from_dict(data).astype({'Time_End': 'datetime64[ns]'})
td = timedelta(hours=7)
df['Birth_Date'] = (df.Time_End + td).dt.date
Output
         Date             Time_End  Reason  Birth_Date
0  2018-08-05  2018-08-05 13:50:00   blah1  2018-08-05
1  2018-08-05  2018-08-05 14:26:00   blah2  2018-08-05
2  2018-08-05  2018-08-05 17:30:00   blah3  2018-08-06
3  2018-08-05  2018-08-05 17:10:00   blah4  2018-08-06
4  2018-08-06  2018-08-06 11:23:00   blah5  2018-08-06
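The 7 in this answer is just 24 - 17: shifting by that amount moves the 17:00 cutoff onto midnight. Here is a minimal sketch of the same idea with the cutoff as a parameter. The function name `birth_date` and the `cutoff_hour` argument are my own illustration, not part of the answer, and note that exactly 17:00:00 counts as "past" the cutoff with this approach.

```python
import pandas as pd

def birth_date(time_end: pd.Series, cutoff_hour: int = 17) -> pd.Series:
    """Shift so that cutoff_hour lands on midnight, then keep only the date."""
    # Hypothetical helper: cutoff_hour is an assumed parameter (17 in the question)
    shift = pd.Timedelta(hours=24 - cutoff_hour)
    return (pd.to_datetime(time_end) + shift).dt.date

# Values just before, exactly at, and after the cutoff
times = pd.Series(["2018-08-05 16:59:59", "2018-08-05 17:00:00", "2018-08-05 17:30:00"])
result = birth_date(times)
```

If 17:00:00 sharp should stay on the same day, the shift trick alone is not enough and an explicit comparison is needed instead.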
Answer 1 (score: 2)
First create a one-day DateOffset (Time_End must already be a datetime column for this to work):
date_offset = pd.tseries.offsets.DateOffset(n=1)
df['Birth_date'] = df.Time_End.apply(lambda x: x + date_offset if x.hour >= 17 else x).dt.date
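The same condition can also be written without a row-by-row apply. This is only a sketch of one possible vectorized variant, not the answer's own code; it uses a shortened copy of the sample data and the intermediate name `late` is mine.

```python
import pandas as pd

df = pd.DataFrame({"Time_End": pd.to_datetime([
    "2018-08-05 13:50:00", "2018-08-05 17:30:00", "2018-08-06 11:23:00"])})

# Boolean mask: True where the hour is 17 or later
late = df["Time_End"].dt.hour >= 17

# Drop the time part, then add one day only where the mask is True
df["Birth_date"] = (df["Time_End"].dt.normalize()
                    + pd.to_timedelta(late.astype(int), unit="D")).dt.date
```

On a large frame this avoids calling a Python lambda once per row.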
Answer 2 (score: 1)
You need:
import numpy as np
import datetime as dt
import pandas as pd
data = {"Date": ["2018-08-05", "2018-08-05", "2018-08-05", "2018-08-05", "2018-08-06"],
"Time_End":["2018-08-05 13:50:00", "2018-08-05 14:26:00", "2018-08-05 17:30:00", "2018-08-05 17:10:00", "2018-08-06 11:23:00"],
"Reason":["blah1", "blah2", "blah3", "blah4", "blah5"]
}
df = pd.DataFrame(data)
# Convert column into pandas datetime format
df['Time_End'] = pd.to_datetime(df["Time_End"])
# Create a threshold value to compare
t = pd.to_datetime('17:00:00').time()
# Use datetime.timedelta to add a day for condition
df['Birth_date'] = np.where(df['Time_End'].dt.time < t, df['Time_End'], df["Time_End"] + dt.timedelta(days=1) )
Output:
         Date             Time_End  Reason           Birth_date
0  2018-08-05  2018-08-05 13:50:00   blah1  2018-08-05 13:50:00
1  2018-08-05  2018-08-05 14:26:00   blah2  2018-08-05 14:26:00
2  2018-08-05  2018-08-05 17:30:00   blah3  2018-08-06 17:30:00
3  2018-08-05  2018-08-05 17:10:00   blah4  2018-08-06 17:10:00
4  2018-08-06  2018-08-06 11:23:00   blah5  2018-08-06 11:23:00
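Note that this output still carries the time of day, while the question asks for the date only. A possible follow-up, sketched here on a shortened copy of the data: wrap the np.where result in a Series and take .dt.date. The names `cutoff` and `shifted` are mine, and I build the threshold with datetime.time rather than parsing a string.

```python
from datetime import time

import numpy as np
import pandas as pd

df = pd.DataFrame({"Time_End": pd.to_datetime([
    "2018-08-05 14:26:00", "2018-08-05 17:10:00"])})

# Threshold to compare against (17:00)
cutoff = time(17, 0)

# Keep the timestamp before the cutoff, otherwise push it one day forward
shifted = np.where(df["Time_End"].dt.time < cutoff,
                   df["Time_End"],
                   df["Time_End"] + pd.Timedelta(days=1))

# np.where returns a plain array, so wrap it in a Series before using .dt
df["Birth_date"] = pd.Series(shifted, index=df.index).dt.date
```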
Answer 3 (score: 0)
You can split the column first, and then do the comparison to add to the date:
df[['Birth-date', 'Time']] = df['Time_End'].str.split(' ', n=1, expand=True)
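The answer stops at the split, so here is one way the comparison step might look. This is a sketch under a few assumptions: Time_End is still a string column, the times are zero-padded HH:MM:SS (so plain string comparison orders them correctly), and I use the Birth_date spelling from the question rather than the hyphenated name above.

```python
import pandas as pd

df = pd.DataFrame({"Time_End": ["2018-08-05 13:50:00", "2018-08-05 17:30:00"]})

# Split the string column into a date part and a time part
df[["Birth_date", "Time"]] = df["Time_End"].str.split(" ", n=1, expand=True)

# Zero-padded HH:MM:SS strings compare correctly as plain text
late = df["Time"] >= "17:00:00"

# Parse the date part and add one day where the time is 17:00 or later
df["Birth_date"] = (pd.to_datetime(df["Birth_date"])
                    + pd.to_timedelta(late.astype(int), unit="D")).dt.date

# The helper column is no longer needed
df = df.drop(columns="Time")
```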