Getting the date difference from a DataFrame

Time: 2017-03-14 01:08:13

Tags: python python-2.7 python-3.x pandas datetime

I am trying to add a column to a DataFrame based on other existing columns.

The DataFrame has the following format:

col1         col2 
2017-02-1    2017-03-03
2017-02-22   2017-03-06


from datetime import datetime
date_format = "%Y-%m-%d"
df['TimeConsumed']=df['col2'].apply(lambda x: (datetime.strptime(x,date_format)-datetime.strptime(df['col1'],date_format)).days)

Running the code above, I keep getting:

TypeError: must be string, not Series 

Could someone please help?

1 Answer:

Answer 0: (score: 1)

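The error occurs because you are calling strptime on a Series, but strptime only accepts a string. Inside your lambda, df['col1'] is the whole column, not the value for the current row: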

datetime.strptime(df['col1'], date_format)
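To make the cause concrete, here is a minimal sketch that reproduces the problem in isolation. The Series `col1` and its values are made up for illustration, and the exact wording of the error message differs between Python versions:

```python
from datetime import datetime

import pandas as pd

date_format = "%Y-%m-%d"
# Hypothetical sample column, standing in for df['col1']
col1 = pd.Series(["2017-02-01", "2017-02-22"])

# A single string element parses without problems
parsed = datetime.strptime(col1[0], date_format)

# Passing the whole Series is rejected with a TypeError
try:
    datetime.strptime(col1, date_format)
    raised = False
except TypeError:
    raised = True

print(parsed, raised)
```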

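I assume you want to do the subtraction row by row. In that case you need to apply the function across the rows (axis=1) rather than on a single column, like this: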

import pandas as pd
from datetime import datetime


def subtract(row):
    # row is one row of the DataFrame, so row['col1'] and row['col2'] are plain strings
    date_format = "%Y-%m-%d"
    return (datetime.strptime(row['col2'], date_format) - datetime.strptime(row['col1'], date_format)).days

if __name__ == '__main__':

    df = pd.DataFrame([{'col1':'2017-02-01','col2':'2017-03-03'},{'col1':'2017-02-22','col2':'2017-03-06'}])
    print(df)

    # axis=1 calls subtract() once per row instead of once per column
    df["TimeConsumed"] = df.apply(subtract, axis=1)
    print(df)

Output:

        col1        col2  TimeConsumed
0  2017-02-01  2017-03-03            30
1  2017-02-22  2017-03-06            12
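As an alternative to applying a function row by row, the same result can probably be obtained with vectorized pandas operations. This is only a sketch, not part of the original answer: it assumes both columns hold zero-padded date strings in the `%Y-%m-%d` format, and it uses `pd.to_datetime` plus the `.dt.days` accessor on the resulting timedelta Series:

```python
import pandas as pd

# Sample data matching the example in the answer above
df = pd.DataFrame({
    "col1": ["2017-02-01", "2017-02-22"],
    "col2": ["2017-03-03", "2017-03-06"],
})

date_format = "%Y-%m-%d"

# Convert each column of strings to datetime values in one call
start = pd.to_datetime(df["col1"], format=date_format)
end = pd.to_datetime(df["col2"], format=date_format)

# Subtracting two datetime Series gives a timedelta Series;
# .dt.days extracts the whole-day component of each difference
df["TimeConsumed"] = (end - start).dt.days

print(df)
```

On larger frames this tends to be faster than `df.apply(..., axis=1)`, because the parsing and subtraction run on whole columns instead of one Python call per row.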