Suppose I have a DataFrame:
import pandas as pd

df = pd.DataFrame({'DATE_1': ['2010-11-06', '2010-10-07', '2010-09-07', '2010-05-07'],
                   'DATE_2': ['2010-12-07', '2010-11-06', '2010-10-07', '2010-08-06']})
df['DATE_1'] = pd.to_datetime(df['DATE_1'])
df['DATE_2'] = pd.to_datetime(df['DATE_2'])
So it looks like:
      DATE_1     DATE_2
0 2010-11-06 2010-12-07
1 2010-10-07 2010-11-06
2 2010-09-07 2010-10-07
3 2010-05-07 2010-08-06
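I want to create another column, DIFF, holding the difference between DATE_2 and DATE_1 in days, months, or years.
I would like an interface like the one below, because I have to create many such DIFF columns from many DATE_X columns: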
def date_diffrence(x, y, parameter):
    if not np.isnan(x):
        return (x-y)

df['DIFF'] = df.apply(date_diffrence(df['DATE_2'], df['DATE_1']))
According to this post, Difference between map, applymap and apply methods in Pandas, it seems to me that I cannot create such a generic interface. Am I right?
Answer 0 (score: 1)
It seems you need a function that takes Series (columns of df) as arguments and uses dt.days:
def date_diffrence_days(x, y):
    return (x-y).dt.days

df['DIFF'] = date_diffrence_days(df['DATE_2'], df['DATE_1'])
print(df)
      DATE_1     DATE_2  DIFF
0 2010-11-06 2010-12-07    31
1 2010-10-07 2010-11-06    30
2 2010-09-07 2010-10-07    30
3 2010-05-07 2010-08-06    91
Without the wrapper function, the same result is:
df['DIFF'] = (df['DATE_2'] - df['DATE_1']).dt.days
print(df)
      DATE_1     DATE_2  DIFF
0 2010-11-06 2010-12-07    31
1 2010-10-07 2010-11-06    30
2 2010-09-07 2010-10-07    30
3 2010-05-07 2010-08-06    91
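If a unit other than whole days is wanted, one option is to divide the timedelta Series by a pd.Timedelta, which gives floats in that unit. This is a minimal sketch and not part of the original answer; the names delta, diff_days and diff_weeks are illustrative.

```python
import pandas as pd

# Same sample data as in the question
df = pd.DataFrame({
    'DATE_1': pd.to_datetime(['2010-11-06', '2010-10-07', '2010-09-07', '2010-05-07']),
    'DATE_2': pd.to_datetime(['2010-12-07', '2010-11-06', '2010-10-07', '2010-08-06']),
})

# Subtracting two datetime columns yields a timedelta Series
delta = df['DATE_2'] - df['DATE_1']

# Dividing by a fixed-length Timedelta converts to a float count of that unit
diff_days = delta / pd.Timedelta(days=1)
diff_weeks = delta / pd.Timedelta(weeks=1)
```

Unlike dt.days, which returns integers, this form keeps any fractional part, so it may suit timestamps that carry a time of day.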
Similarly, with a parameter that selects the unit:
def date_diffrence_days(x, y, parameter):
    # 'd' returns whole days, 's' returns total seconds
    if parameter == 'd':
        return (x-y).dt.days
    elif parameter == 's':
        return (x-y).dt.total_seconds()
    else:
        # fail loudly instead of silently returning None
        raise ValueError("parameter must be 'd' or 's'")

df['DIFF'] = date_diffrence_days(df['DATE_2'], df['DATE_1'], 's')
print(df)
      DATE_1     DATE_2       DIFF
0 2010-11-06 2010-12-07  2678400.0
1 2010-10-07 2010-11-06  2592000.0
2 2010-09-07 2010-10-07  2592000.0
3 2010-05-07 2010-08-06  7862400.0
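The question also asks for months and years, and for many DIFF columns at once. The sketch below is one possible way to do that, not the answerer's code. The helper name date_difference, the unit strings, the column names DIFF_DAYS, DIFF_MONTHS and DIFF_YEARS, and the specs dictionary are all hypothetical. Months and years are counted here as calendar boundaries crossed, ignoring the day of the month, which is an assumption about what "difference in months" should mean.

```python
import pandas as pd

def date_difference(later, earlier, unit='days'):
    """Hypothetical helper: difference of two datetime Series in the given unit."""
    if unit == 'days':
        return (later - earlier).dt.days
    if unit == 'seconds':
        return (later - earlier).dt.total_seconds()
    if unit == 'months':
        # Assumption: count calendar months crossed, ignoring the day of month
        return ((later.dt.year - earlier.dt.year) * 12
                + (later.dt.month - earlier.dt.month))
    if unit == 'years':
        # Assumption: count calendar years crossed
        return later.dt.year - earlier.dt.year
    raise ValueError(f"unsupported unit: {unit!r}")

df = pd.DataFrame({
    'DATE_1': pd.to_datetime(['2010-11-06', '2010-10-07', '2010-09-07', '2010-05-07']),
    'DATE_2': pd.to_datetime(['2010-12-07', '2010-11-06', '2010-10-07', '2010-08-06']),
})

# Map each new column to (later column, earlier column, unit)
specs = {
    'DIFF_DAYS': ('DATE_2', 'DATE_1', 'days'),
    'DIFF_MONTHS': ('DATE_2', 'DATE_1', 'months'),
    'DIFF_YEARS': ('DATE_2', 'DATE_1', 'years'),
}
for new_col, (later_col, earlier_col, unit) in specs.items():
    df[new_col] = date_difference(df[later_col], df[earlier_col], unit)
```

Because the helper works on whole columns, no apply call is involved; adding another DIFF column only requires another entry in specs.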