I am trying to add a column to a DataFrame based on other existing columns.
The DataFrame has the following format:
col1 col2
2017-02-1 2017-03-03
2017-02-22 2017-03-06
from datetime import datetime
date_format = "%Y-%m-%d"
df['TimeConsumed']=df['col2'].apply(lambda x: (datetime.strptime(x,date_format)-datetime.strptime(df['col1'],date_format)).days)
When I run the above, I keep getting:
TypeError: must be string, not Series
Could someone please help?
Answer 0 (score: 1)
The error occurs because you are calling strptime on a whole Series, but it only accepts a single string:
datetime.strptime(df['col1'], date_format)
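A minimal sketch that reproduces the problem in isolation, assuming pandas is installed. The `col1` Series here is hypothetical sample data, not the asker's actual DataFrame:

```python
from datetime import datetime

import pandas as pd

date_format = "%Y-%m-%d"
col1 = pd.Series(["2017-02-01", "2017-02-22"])  # hypothetical sample data

# A single string element parses without problems.
parsed = datetime.strptime(col1[0], date_format)

# Passing the whole Series is rejected, because strptime expects one string.
try:
    datetime.strptime(col1, date_format)
    raised = False
except TypeError as exc:
    raised = True
    print(exc)
```

The exact wording of the message differs between Python versions, but in both cases it is a TypeError complaining that a Series was given where a string was expected.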
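I think you want to do the subtraction within each row, so you need to apply the function across rows rather than over a single column, like this: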
import pandas as pd
from datetime import datetime

def substract(row):
    # Parse both date strings of this row and return the difference in days.
    date_format = "%Y-%m-%d"
    return (datetime.strptime(row['col2'], date_format) - datetime.strptime(row['col1'], date_format)).days

if __name__ == '__main__':
    df = pd.DataFrame([{'col1': '2017-02-01', 'col2': '2017-03-03'}, {'col1': '2017-02-22', 'col2': '2017-03-06'}])
    print(df)
    # Original attempt, which passes the whole df['col1'] Series to strptime:
    #date_format = "%Y-%m-%d"
    #df['TimeConsumed']=df['col2'].apply(lambda x: (datetime.strptime(x,date_format)-datetime.strptime(df['col1'],date_format)).days)
    # axis=1 calls substract once per row instead of once per column.
    df["TimeConsumed"] = df.apply(substract, axis=1)
    print(df)
Output:
         col1        col2  TimeConsumed
0  2017-02-01  2017-03-03            30
1  2017-02-22  2017-03-06            12
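As an alternative to the row-wise apply, the same result can probably be obtained with a vectorized approach. This is only a sketch, not part of the original answer: it assumes both columns hold strings in the `%Y-%m-%d` format and uses `pd.to_datetime` together with the `.dt.days` accessor. The names `start` and `end` are introduced here for illustration:

```python
import pandas as pd

# Sample data mirroring the DataFrame used in the answer.
df = pd.DataFrame({
    "col1": ["2017-02-01", "2017-02-22"],
    "col2": ["2017-03-03", "2017-03-06"],
})

# Convert each column of strings to datetime values in one call.
start = pd.to_datetime(df["col1"], format="%Y-%m-%d")
end = pd.to_datetime(df["col2"], format="%Y-%m-%d")

# Subtracting two datetime Series gives a timedelta Series;
# .dt.days extracts the whole-day component of each difference.
df["TimeConsumed"] = (end - start).dt.days
print(df)
```

This avoids calling a Python function once per row, which tends to matter on larger DataFrames.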