I have a SAS proc transpose that I am trying to replicate in pandas.
Here is an example:
import pandas as pd
ID = ['ID1', 'ID1', 'ID1', 'ID1', 'ID1']
obs_week = [201701,201701,201701,201701,201701]
weeks_id = [1,2,3,4,5]
spend = [100,200,300,400,500]
df = pd.DataFrame(zip(ID, obs_week, weeks_id, spend ), columns = ['id', 'obs_week', 'weeks_id', 'spend'])
df
This gives a table like this:
id obs_week weeks_id spend
0 ID1 201701 1 100
1 ID1 201701 2 200
2 ID1 201701 3 300
3 ID1 201701 4 400
4 ID1 201701 5 500
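I am trying to transpose this so that each id and obs_week combination becomes a single row, with the weeks_id values turned into new columns that carry a prefix.
The SAS code looks like this: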
proc transpose data=spend out=spend_hh (drop = _label_ _name_) prefix=spend_;
by id obs_week;
id weeks_id;
var spend;
run;
I have had some success using df.pivot_table:
df.pivot_table(index=['id','obs_week'], columns='weeks_id', aggfunc=sum, fill_value=0)
which gives a table like this:
spend
weeks_id 1 2 3 4 5
id obs_week
ID1 201701 100 200 300 400 500
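My problem is that I want to rename 1 2 3 4 5 to spend_1, spend_2 and so on.
I also want to do this for several different variables in the file, but I assume I can restrict the selection to the fields I want.
My result should look like this: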
id obs_week spend_1 spend_2 spend_3 spend_4 spend_5
0 ID1 201701 100 200 300 400 500
Is this just a matter of collapsing the headers somehow?
I would also like id and obs_week not to be part of the index.
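Answer 0 (score: 1)
You first need to build the column names with a list comprehension, then use reset_index to turn the index levels back into columns and rename_axis to remove the weeks_id text: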
df = df.pivot_table(index=['id','obs_week'], columns='weeks_id', aggfunc='sum', fill_value=0)
df.columns = ['{}_{}'.format(x[0], x[1]) for x in df.columns]
df = df.reset_index().rename_axis(None, axis=1)
print (df)
id obs_week spend_1 spend_2 spend_3 spend_4 spend_5
0 ID1 201701 100 200 300 400 500
Or:
df.columns = ['_'.join((x[0], str(x[1]))) for x in df.columns]
df = df.reset_index().rename_axis(None, axis=1)
print (df)
id obs_week spend_1 spend_2 spend_3 spend_4 spend_5
0 ID1 201701 100 200 300 400 500
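The question also asks about handling several variables at once. The sketch below suggests the same list comprehension should carry over, since each column label is then a (variable, weeks_id) pair. The visits column is a hypothetical second variable invented for illustration; it is not in the original data.

```python
import pandas as pd

# Sample data from the question, plus a hypothetical "visits" column
# (an assumption, added only to illustrate the multi-variable case).
df = pd.DataFrame({
    'id': ['ID1'] * 5,
    'obs_week': [201701] * 5,
    'weeks_id': [1, 2, 3, 4, 5],
    'spend': [100, 200, 300, 400, 500],
    'visits': [1, 2, 3, 4, 5],
})

# Restrict the pivot to the variables of interest via `values`.
wide = df.pivot_table(index=['id', 'obs_week'], columns='weeks_id',
                      values=['spend', 'visits'], aggfunc='sum', fill_value=0)

# The columns are now a MultiIndex of (variable, weeks_id) pairs;
# join each pair into a single flat name such as "spend_1".
wide.columns = ['{}_{}'.format(var, week) for var, week in wide.columns]

# Move id and obs_week out of the index and back into ordinary columns.
wide = wide.reset_index()
print(wide)
```

Passing the list of variables through values is one way to limit the selection to the fields you want; the prefix comes from the variable name itself, much like running proc transpose once per variable with a matching prefix.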
Answer 1 (score: 1)
Here is a one-liner:
In [1446]: (df.pivot_table(index=['id', 'obs_week'], columns=['weeks_id'], values='spend')
.add_prefix('spend_')
.reset_index())
Out[1446]:
weeks_id id obs_week spend_1 spend_2 spend_3 spend_4 spend_5
0 ID1 201701 100 200 300 400 500
Or,
In [1449]: (df.pivot_table(index=['id', 'obs_week'], columns=['weeks_id'], values='spend')
.add_prefix('spend_')
.reset_index()
.rename_axis(None, axis=1))
Out[1449]:
id obs_week spend_1 spend_2 spend_3 spend_4 spend_5
0 ID1 201701 100 200 300 400 500
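Note that pivot_table aggregates, whereas proc transpose only reshapes. If each (id, obs_week, weeks_id) combination occurs exactly once, a plain reshape with set_index and unstack may be a closer match. This is a minimal sketch under that uniqueness assumption, not a drop-in replacement for the answers above; unstack raises an error when the index contains duplicate entries.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'id': ['ID1'] * 5,
    'obs_week': [201701] * 5,
    'weeks_id': [1, 2, 3, 4, 5],
    'spend': [100, 200, 300, 400, 500],
})

wide = (df.set_index(['id', 'obs_week', 'weeks_id'])['spend']
          .unstack('weeks_id')          # weeks_id values become columns
          .add_prefix('spend_')         # 1 -> spend_1, 2 -> spend_2, ...
          .reset_index()                # id and obs_week back to columns
          .rename_axis(None, axis=1))   # drop the leftover "weeks_id" label
print(wide)
```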