Add 3 columns to a DataFrame with map

Time: 2016-06-19 21:23:19

Tags: python pandas dataframe

Is it possible to add 3 new columns to this small DataFrame with a single map call?

import pandas as pd

df = pd.DataFrame({'myDate': ['2006-02-12',
                              '2007-07-20',
                              '2009-05-19']})

def convert_date(val):
    # Dates are formatted as year-month-day.
    y, m, d = val.split('-')
    return int(d), int(y), int(m)

# Attempt: fill three new columns from a single map call.
df[['day', 'year', 'month']] = df.myDate.map(convert_date)

2 answers:

Answer 0 (score: 2)

You can use .str.split():

In [11]: df[['year', 'month', 'day']] = df.myDate.str.split('-', expand=True).astype(int)

In [12]: df
Out[12]:
       myDate  year  month  day
0  2006-02-12  2006      2   12
1  2007-07-20  2007      7   20
2  2009-05-19  2009      5   19

Or use .str.extract() with named groups, which become the column names:

In [21]: df.myDate.str.extract(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})', expand=True).astype(int)
Out[21]:
   year  month  day
0  2006      2   12
1  2007      7   20
2  2009      5   19
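The extract call above returns a new DataFrame and does not modify df. A minimal sketch of attaching the named groups as columns, assuming the same sample data; the names pattern, parts, and result are chosen only for this illustration:

```python
import pandas as pd

df = pd.DataFrame({'myDate': ['2006-02-12', '2007-07-20', '2009-05-19']})

# Named groups become the column names of the extracted frame.
pattern = r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
parts = df['myDate'].str.extract(pattern, expand=True).astype(int)

# join aligns on the index, so each row keeps its own date parts.
result = df.join(parts)
print(result)
```

Because the column names come from the regular expression, there is no separate list of labels that can drift out of order.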

Answer 1 (score: 2)

I think you can convert the myDate column with to_datetime and then use dt.year, dt.month, and dt.day:

df['myDate'] = pd.to_datetime(df.myDate)

df['year'] = df.myDate.dt.year
df['month'] = df.myDate.dt.month
df['day'] = df.myDate.dt.day

print(df)
      myDate  year  month  day
0 2006-02-12  2006      2   12
1 2007-07-20  2007      7   20
2 2009-05-19  2009      5   19
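If the original string column should stay as it is, the same idea can be written without overwriting myDate. This is only a sketch under the assumption that every value follows the year-month-day layout; dates and result are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({'myDate': ['2006-02-12', '2007-07-20', '2009-05-19']})

# An explicit format avoids any guessing about the field order.
dates = pd.to_datetime(df['myDate'], format='%Y-%m-%d')

# assign returns a new DataFrame and leaves df untouched.
result = df.assign(year=dates.dt.year,
                   month=dates.dt.month,
                   day=dates.dt.day)
print(result)
```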

If you want to keep your own approach, the function needs to return a pd.Series, otherwise you get tuples back. You also need to change map to apply:

def convert_date(val):
    # Dates are formatted as year-month-day.
    y, m, d = val.split('-')
    return pd.Series([int(d), int(y), int(m)])

df[['day', 'year', 'month']] = df.myDate.apply(convert_date)

print(df)
       myDate  day  year  month
0  2006-02-12   12  2006      2
1  2007-07-20   20  2007      7
2  2009-05-19   19  2009      5
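Building a pd.Series for every row can be slow on larger frames. One possible alternative, shown here only as a sketch, keeps the function returning plain tuples and builds the three columns in a single step from the list of tuples; tuples, parts, and result are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({'myDate': ['2006-02-12', '2007-07-20', '2009-05-19']})

def convert_date(val):
    # Dates are formatted as year-month-day.
    y, m, d = val.split('-')
    return int(y), int(m), int(d)

# One tuple per row, in the same order as df.
tuples = df['myDate'].map(convert_date).tolist()

# Reuse df's index so the new columns line up row by row.
parts = pd.DataFrame(tuples, columns=['year', 'month', 'day'], index=df.index)
result = pd.concat([df, parts], axis=1)
print(result)
```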

I tried using map, but this is the result:

def convert_date(val):
    # Dates are formatted as year-month-day.
    y, m, d = val.split('-')
    return int(d), int(y), int(m)

df['a'], df['b'], df['c'] = df.myDate.map(convert_date)
print(df)
       myDate     a     b     c
0  2006-02-12    12    20    19
1  2007-07-20  2006  2007  2009
2  2009-05-19     2     7     5
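In the output above, each of the columns a, b, and c holds the three values from one row, because unpacking a Series of tuples iterates row by row. A common way around this is to transpose the tuples with zip(*...) before unpacking. A minimal sketch with the same sample data, where the column names year, month, and day are chosen for the illustration:

```python
import pandas as pd

df = pd.DataFrame({'myDate': ['2006-02-12', '2007-07-20', '2009-05-19']})

def convert_date(val):
    # Dates are formatted as year-month-day.
    y, m, d = val.split('-')
    return int(y), int(m), int(d)

# zip(*...) turns one tuple per row into one tuple per field,
# so each target column receives a full column of values.
df['year'], df['month'], df['day'] = zip(*df['myDate'].map(convert_date))
print(df)
```

This keeps map and the tuple-returning function, at the cost of relying on the order of the assignment targets matching the order of the returned fields.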