Can an applymap-style interface operate on multiple (two) columns?

Date: 2017-04-04 13:41:49

Tags: python pandas dataframe

Suppose I have a DataFrame:

import pandas as pd

df = pd.DataFrame({'DATE_1':['2010-11-06', '2010-10-07', '2010-09-07', '2010-05-07'],
                   'DATE_2':['2010-12-07', '2010-11-06', '2010-10-07', '2010-08-06']})
# convert the string columns to datetime64 so they can be subtracted
df['DATE_1'] = pd.to_datetime(df['DATE_1'])
df['DATE_2'] = pd.to_datetime(df['DATE_2'])

So it looks like:

      DATE_1      DATE_2
0   2010-11-06  2010-12-07
1   2010-10-07  2010-11-06
2   2010-09-07  2010-10-07
3   2010-05-07  2010-08-06

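I want to create another column, DIFF, that holds the difference between DATE_2 and DATE_1 in days, months, or years.
I would like an interface along the lines of the one below, because I have to create many such DIFF columns from many DATE_X columns: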

def date_diffrence(x, y, parameter):
    if !np.isnan(x):
         return (x-y)
df['DIFF'] = df.apply(date_diffrence(df['DATE_2'], df['DATE_1']))

According to this post, "Difference between map, applymap and apply methods in Pandas", it seems to me that I cannot create such a generic interface. Am I right?

1 answer:

Answer 0: (score: 1)

It seems you need a function that takes Series (the columns of df) as arguments and uses dt.days:

def date_diffrence_days(x, y):
    return (x-y).dt.days

df['DIFF'] = date_diffrence_days(df['DATE_2'], df['DATE_1'])
print (df)
      DATE_1     DATE_2  DIFF
0 2010-11-06 2010-12-07    31
1 2010-10-07 2010-11-06    30
2 2010-09-07 2010-10-07    30
3 2010-05-07 2010-08-06    91

This is the same as subtracting the columns directly:

df['DIFF'] = (df['DATE_2'] - df['DATE_1']).dt.days
print (df)
      DATE_1     DATE_2  DIFF
0 2010-11-06 2010-12-07    31
1 2010-10-07 2010-11-06    30
2 2010-09-07 2010-10-07    30
3 2010-05-07 2010-08-06    91
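
For comparison, the call attempted in the question can be made to work with apply if it is given a function that receives one row at a time (axis=1). The following is only a sketch: row_diff_days and the DIFF_ROWWISE / DIFF_VECTORIZED column names are hypothetical and not part of the original question. The vectorized subtraction is usually the better choice, since it avoids a Python-level call per row.

```python
import pandas as pd

df = pd.DataFrame({'DATE_1': ['2010-11-06', '2010-10-07', '2010-09-07', '2010-05-07'],
                   'DATE_2': ['2010-12-07', '2010-11-06', '2010-10-07', '2010-08-06']})
df['DATE_1'] = pd.to_datetime(df['DATE_1'])
df['DATE_2'] = pd.to_datetime(df['DATE_2'])

def row_diff_days(row):
    # Hypothetical helper: receives one row (a Series) when called via axis=1.
    # Missing dates (NaT) yield NaN instead of raising.
    if pd.isnull(row['DATE_1']) or pd.isnull(row['DATE_2']):
        return float('nan')
    # Subtracting two Timestamps gives a Timedelta, which has a .days attribute.
    return (row['DATE_2'] - row['DATE_1']).days

# Row-wise: the function itself is passed to apply, it is not called here.
df['DIFF_ROWWISE'] = df.apply(row_diff_days, axis=1)

# Vectorized: whole columns are subtracted at once.
df['DIFF_VECTORIZED'] = (df['DATE_2'] - df['DATE_1']).dt.days
```

Both columns hold the same values for this data; the row-wise form is mainly useful when the per-row logic cannot be expressed with column operations.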

The function can also take a parameter that selects the unit:

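def date_diffrence_days(x, y, parameter):
    # x and y are datetime Series; parameter selects the unit of the result
    if parameter == 'd':
        # whole days
        return (x-y).dt.days
    elif parameter == 's':
        # total seconds, as float
        return (x-y).dt.total_seconds()
    raise ValueError("parameter must be 'd' or 's'")

df['DIFF'] = date_diffrence_days(df['DATE_2'], df['DATE_1'], 's')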
print (df)
      DATE_1     DATE_2       DIFF
0 2010-11-06 2010-12-07  2678400.0
1 2010-10-07 2010-11-06  2592000.0
2 2010-09-07 2010-10-07  2592000.0
3 2010-05-07 2010-08-06  7862400.0
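
The question also asks for months and years, and for many DIFF columns built from many DATE_X columns. A timedelta has no fixed month or year length, so the sketch below makes an assumption: it counts calendar month and year boundaries from the dt.year and dt.month fields and ignores the day of the month. The date_difference function, the 'M' / 'Y' unit codes, and the DIFF_* column names are hypothetical choices made for illustration.

```python
import pandas as pd

def date_difference(later, earlier, unit='d'):
    # Hypothetical generic helper: both arguments are datetime Series.
    if unit == 'd':
        return (later - earlier).dt.days
    if unit == 's':
        return (later - earlier).dt.total_seconds()
    if unit == 'M':
        # Assumption: calendar months crossed, ignoring the day of the month.
        return ((later.dt.year - earlier.dt.year) * 12
                + (later.dt.month - earlier.dt.month))
    if unit == 'Y':
        # Assumption: calendar years crossed, ignoring month and day.
        return later.dt.year - earlier.dt.year
    raise ValueError("unit must be one of 'd', 's', 'M', 'Y'")

df = pd.DataFrame({'DATE_1': ['2010-11-06', '2010-10-07', '2010-09-07', '2010-05-07'],
                   'DATE_2': ['2010-12-07', '2010-11-06', '2010-10-07', '2010-08-06']})
df['DATE_1'] = pd.to_datetime(df['DATE_1'])
df['DATE_2'] = pd.to_datetime(df['DATE_2'])

# One entry per new column: (later column, earlier column, unit).
specs = {
    'DIFF_DAYS': ('DATE_2', 'DATE_1', 'd'),
    'DIFF_MONTHS': ('DATE_2', 'DATE_1', 'M'),
    'DIFF_YEARS': ('DATE_2', 'DATE_1', 'Y'),
}
for new_col, (later_col, earlier_col, unit) in specs.items():
    df[new_col] = date_difference(df[later_col], df[earlier_col], unit)
```

Driving the helper from a dictionary keeps the interface generic: adding another DIFF column for another pair of DATE_X columns only needs one more entry in specs.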
