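Deriving an extra column from pandas DataFrame columns

Time: 2016-09-19 06:04:08

Tags: python pandas

Given a DataFrame, how can I add an extra column that is derived from existing columns in the DataFrame, i.e.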

import pandas as pd

data = {'date': ['2016-01-01', '2016-01-01', '2016-01-02'],
        'number': [10, 21, 20],
        'location': ['CA', 'NY', 'NJ']
        }

print(pd.DataFrame(data))

  location  number        date
0       CA      10  2016-01-01
1       NY      21  2016-01-01
2       NJ      20  2016-01-02

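I want to generate an extra column from location and date: take the date, then generate key-value pairs for extra_column, where each key is date + i and each value is some random string, with i = random.randint(1,3). The expected result: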
  location  number       date     extra_column
0       CA      10  2016-01-01    {{2016-01-01, CA}, {2016-01-02, something}, {2016-01-03, something else}}
1       NY      21  2016-01-01    {{2016-01-01, NY}, {2016-01-02, someplace}}
2       NJ      20  2016-01-02    {{2016-01-02, NJ}, {2016-01-03, anything}}
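
The date + i keys in the description above can be read as "the date string moved forward by i days". A minimal sketch of that arithmetic using only the standard library follows. The helper name shift_date and the reading of i as a day offset are assumptions for illustration; the question does not spell them out.

```python
import random
from datetime import datetime, timedelta


def shift_date(date_str, days):
    """Return date_str (YYYY-MM-DD) moved forward by the given number of days."""
    shifted = datetime.strptime(date_str, "%Y-%m-%d") + timedelta(days=days)
    return shifted.strftime("%Y-%m-%d")


# random.randint is inclusive on both ends, so i is 1, 2, or 3.
i = random.randint(1, 3)
print(shift_date("2016-01-01", i))

# Month boundaries roll over correctly because real dates are used,
# not string manipulation.
print(shift_date("2016-01-31", 1))
```

Parsing to a real date before adding the offset avoids invalid keys such as 2016-01-32.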

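1 answer:

Answer 0 (score: 1)

You can write a function that operates on the existing columns, then simply assign its result to the DataFrame as a new column. See the following code: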

import pandas as pd

data = {'date': ['2016-01-01', '2016-01-01', '2016-01-02'],
        'number': [10, 21, 20],
        'location': ['CA', 'NY', 'NJ']
        }

df = pd.DataFrame(data)

def somefunc(date, location):
    # some code to generate the extra column; it must return one value per row
    # placeholder only: pair each date with its location
    return [{d: loc} for d, loc in zip(date, location)]


# pull the source columns out as arrays
date_vals = df['date'].values
loc_vals = df['location'].values

new_col_vals = somefunc(date_vals, loc_vals)

# add the column by doing the following
df['new_col'] = new_col_vals
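
One way the somefunc placeholder above might be filled in for the question's extra_column is sketched below. This is an illustration under stated assumptions, not the answerer's code. The names build_extra_column and random_word are hypothetical. i = random.randint(1, 3) is assumed to be the number of extra days added after the row's own date, and the "random string" is assumed to be six random lowercase letters. The seed parameter is only there to make runs repeatable.

```python
import random
import string

import pandas as pd


def random_word(rng, length=6):
    """Return random lowercase letters (stand-in for 'some random string')."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def build_extra_column(dates, locations, seed=None):
    """Build one dict per row: {date: location, date + 1 day: random string, ...}."""
    rng = random.Random(seed)
    result = []
    for date_str, location in zip(dates, locations):
        start = pd.Timestamp(date_str)
        pairs = {date_str: location}
        # Assumption: i = randint(1, 3) is how many later days get an entry.
        for offset in range(1, rng.randint(1, 3) + 1):
            key = (start + pd.Timedelta(days=offset)).strftime("%Y-%m-%d")
            pairs[key] = random_word(rng)
        result.append(pairs)
    return result


data = {
    "date": ["2016-01-01", "2016-01-01", "2016-01-02"],
    "number": [10, 21, 20],
    "location": ["CA", "NY", "NJ"],
}
df = pd.DataFrame(data)

# A list with one element per row can be assigned directly as a new column.
df["extra_column"] = build_extra_column(df["date"].values, df["location"].values, seed=0)
print(df)
```

Each cell of extra_column then holds a Python dict, so the column has object dtype; if the pairs need to be filtered or joined later, one row per key-value pair may be easier to work with.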

Hope it helps.