Transposing a DataFrame in pandas/Python, but not all columns

Time: 2017-10-13 16:13:19

Tags: python pandas dataframe transactions

I am working with the DataFrame below. I tried to transpose it with pivot, but the result is not what I want. This is what I am trying to accomplish:

import pandas as pd

df=pd.DataFrame({'ID_Patient':[11132,2755,9753,8453,4872],'Name_Patient':['Jim','Jack','Sue','Tom','James'],'Visits_Jan':[2,1,0,4,2],'Visits_Feb':[5,0,0,1,1],'Visits_Mar':[0,0,4,1,2]})
df=df[['ID_Patient','Name_Patient','Visits_Jan','Visits_Feb','Visits_Mar']]

df  # The data set I wish to convert
Out[318]: 
   ID_Patient Name_Patient  Visits_Jan  Visits_Feb  Visits_Mar
0       11132          Jim           2           5           0
1        2755         Jack           1           0           0
2        9753          Sue           0           0           4
3        8453          Tom           4           1           1
4        4872        James           2           1           2

I would like to convert it to:

df_altered
Out[317]: 
    ID_Patient Name_Patient Month_of_visit  Col1
0        11132          Jim     Visits_Jan     2
1        11132          Jim     Visits_Feb     5
2        11132          Jim     Visits_Mar     0
3         2755         Jack     Visits_Jan     1
4         2755         Jack     Visits_Feb     0
5         2755         Jack     Visits_Mar     0
6         9753          Sue     Visits_Jan     0
7         9753          Sue     Visits_Feb     0
8         9753          Sue     Visits_Mar     4
9         8453          Tom     Visits_Jan     4
10        8453          Tom     Visits_Feb     1
11        8453          Tom     Visits_Mar     1
12        4872        James     Visits_Jan     2
13        4872        James     Visits_Feb     1
14        4872        James     Visits_Mar     2

2 Answers:

Answer 0 (score: 3)

Use stack:

df.set_index(['ID_Patient','Name_Patient']).stack().reset_index()
Out[254]: 
    ID_Patient Name_Patient     level_2  0
0        11132          Jim  Visits_Jan  2
1        11132          Jim  Visits_Feb  5
2        11132          Jim  Visits_Mar  0
3         2755         Jack  Visits_Jan  1
4         2755         Jack  Visits_Feb  0
5         2755         Jack  Visits_Mar  0
6         9753          Sue  Visits_Jan  0
7         9753          Sue  Visits_Feb  0
8         9753          Sue  Visits_Mar  4
9         8453          Tom  Visits_Jan  4
10        8453          Tom  Visits_Feb  1
11        8453          Tom  Visits_Mar  1
12        4872        James  Visits_Jan  2
13        4872        James  Visits_Feb  1
14        4872        James  Visits_Mar  2

PS: use .rename(columns={}) to change the column names.
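A minimal sketch of the stack approach combined with the rename step, so that the result carries the column names asked for in the question. The variable name `df_altered` is taken from the question; the keys `'level_2'` and `0` are the default names that appear in the output above, and would differ if the index or the stacked Series were named.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'ID_Patient': [11132, 2755, 9753, 8453, 4872],
    'Name_Patient': ['Jim', 'Jack', 'Sue', 'Tom', 'James'],
    'Visits_Jan': [2, 1, 0, 4, 2],
    'Visits_Feb': [5, 0, 0, 1, 1],
    'Visits_Mar': [0, 0, 4, 1, 2],
})

df_altered = (
    df.set_index(['ID_Patient', 'Name_Patient'])  # keep these two columns fixed
      .stack()                                    # move the visit columns into rows
      .reset_index()                              # turn the index levels back into columns
      .rename(columns={'level_2': 'Month_of_visit', 0: 'Col1'})
)

print(df_altered)
```

Because stack works row by row, the three months of each patient stay together, which matches the row order requested in the question.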

Answer 1 (score: 2)

Use df.melt:

df.melt(id_vars=['ID_Patient', 'Name_Patient'],
        var_name='Month_of_visit', value_name='Col1')
#     ID_Patient Name_Patient Month_of_visit  Col1
# 0        11132          Jim     Visits_Jan     2
# 1         2755         Jack     Visits_Jan     1
# 2         9753          Sue     Visits_Jan     0
# 3         8453          Tom     Visits_Jan     4
# 4         4872        James     Visits_Jan     2
# 5        11132          Jim     Visits_Feb     5
# 6         2755         Jack     Visits_Feb     0
# 7         9753          Sue     Visits_Feb     0
# 8         8453          Tom     Visits_Feb     1
# 9         4872        James     Visits_Feb     1
# 10       11132          Jim     Visits_Mar     0
# 11        2755         Jack     Visits_Mar     0
# 12        9753          Sue     Visits_Mar     4
# 13        8453          Tom     Visits_Mar     1
# 14        4872        James     Visits_Mar     2
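Note that melt returns the rows grouped by month column rather than by patient, so the order differs from the one requested in the question. One possible way to restore the per-patient order is sketched below; it assumes pandas 1.1 or later (for the `ignore_index` parameter of melt), and the name `long_df` is only illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'ID_Patient': [11132, 2755, 9753, 8453, 4872],
    'Name_Patient': ['Jim', 'Jack', 'Sue', 'Tom', 'James'],
    'Visits_Jan': [2, 1, 0, 4, 2],
    'Visits_Feb': [5, 0, 0, 1, 1],
    'Visits_Mar': [0, 0, 4, 1, 2],
})

long_df = (
    df.melt(id_vars=['ID_Patient', 'Name_Patient'],
            var_name='Month_of_visit', value_name='Col1',
            ignore_index=False)        # keep the original row labels 0..4
      .sort_index(kind='mergesort')    # stable sort: groups rows by patient, months stay in column order
      .reset_index(drop=True)          # renumber the rows 0..14
)

print(long_df)
```

The stable sort matters here: it keeps the Jan, Feb, Mar sequence within each patient, whereas an unstable sort on the repeated index labels would not guarantee that.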