Given a DataFrame, how can I add an extra column that is derived from existing columns in the DataFrame? For example:
import pandas as pd

data = {'date': ['2016-01-01', '2016-01-01', '2016-01-02'],
        'number': [10, 21, 20],
        'location': ['CA', 'NY', 'NJ']
        }
print(pd.DataFrame(data))
  location  number        date
0       CA      10  2016-01-01
1       NY      21  2016-01-01
2       NJ      20  2016-01-02
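I want to generate an extra column from location and date: take the date, then build key-value pairs for extra_column, where each key is date + i and each value is some random string, with i = random.randint(1,3). The desired result looks like this: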
  location  number        date  extra_column
0       CA      10  2016-01-01  {{2016-01-01, CA}, {2016-01-02, something}, {2016-01-03, something else}}
1       NY      21  2016-01-01  {{2016-01-01, NY}, {2016-01-02, someplace}}
2       NJ      20  2016-01-02  {{2016-01-02, NJ}, {2016-01-03, anything}}
Answer 0 (score: 1)
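You can write a function that performs the operation on the existing columns, then simply assign its result to the DataFrame as a new column. See the following code: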
import pandas as pd

data = {'date': ['2016-01-01', '2016-01-01', '2016-01-02'],
        'number': [10, 21, 20],
        'location': ['CA', 'NY', 'NJ']
        }
df = pd.DataFrame(data)


def somefunc(dates, locations):
    # some code to generate extra column
    # Placeholder body (an assumption, not part of the original answer):
    # pair each date with its location so the example runs as written.
    return list(zip(dates, locations))


date_vals = df['date'].values
loc_vals = df['location'].values
new_col_vals = somefunc(date_vals, loc_vals)

# add the column by doing the following
df['new_col'] = new_col_vals
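As a rough sketch of what the generating function could look like for the question's extra_column, the code below builds one dict per row. It assumes that i = random.randint(1, 3) is the number of entries, that the keys are the row's date plus 0, 1, ... days, and that the first value is the location while the rest are random strings. The names random_word and make_extra are hypothetical helpers introduced only for illustration, and the seed is there purely to make runs repeatable.

```python
import random
import string

import pandas as pd


def random_word(rng, length=6):
    # Hypothetical helper: a random lowercase string standing in for
    # the "something" / "anything" values shown in the question.
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def make_extra(date, location, rng):
    # Assumption: i (1 to 3) is the number of key-value pairs for this row.
    n = rng.randint(1, 3)
    start = pd.Timestamp(date)
    extra = {}
    for offset in range(n):
        # Key is the row's date shifted forward by `offset` days.
        key = (start + pd.Timedelta(days=offset)).strftime("%Y-%m-%d")
        # First entry keeps the location; later entries get random strings.
        extra[key] = location if offset == 0 else random_word(rng)
    return extra


data = {'date': ['2016-01-01', '2016-01-01', '2016-01-02'],
        'number': [10, 21, 20],
        'location': ['CA', 'NY', 'NJ']}
df = pd.DataFrame(data)

rng = random.Random(0)  # seeded only so repeated runs give the same values
df['extra_column'] = [make_extra(d, loc, rng)
                      for d, loc in zip(df['date'], df['location'])]
print(df)
```

A list comprehension over zip(df['date'], df['location']) is used here because each row needs its own random draw; df.apply(..., axis=1) would work as well if you prefer to pass whole rows to the function.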
Hope this helps.