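I have a dataframe. Each row is a unique person, and each column is a type of action taken. I need to restructure the data so that each row shows a single event. Below are my current format and the format I am trying to produce.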
import pandas as pd
current = pd.DataFrame({'name': {0: 'ross', 1: 'allen', 2: 'jon'},'action a': {0:'2017-10-04', 1:'2017-10-04', 2:'2017-10-04'},'action b': {0:'2017-10-05', 1:'2017-10-05', 2:'2017-10-05'},'action c': {0:'2017-10-06', 1:'2017-10-06', 2:'2017-10-06'}})
desired = pd.DataFrame({'name':['ross','ross','ross','allen','allen','allen','jon','jon','jon'],'action':['action a','action b','action c','action a','action b','action c','action a','action b','action c'],'date':['2017-10-04','2017-10-05','2017-10-06','2017-10-04','2017-10-05','2017-10-06','2017-10-04','2017-10-05','2017-10-06']})
Answer 0: (score: 1)
Use df.melt (v0.20+):
df
     action a    action b    action c   name
0  2017-10-04  2017-10-05  2017-10-06   ross
1  2017-10-04  2017-10-05  2017-10-06  allen
2  2017-10-04  2017-10-05  2017-10-06    jon
df = df.melt('name').sort_values('name')
df.columns = ['name', 'action', 'date']
df
    name    action        date
1  allen  action a  2017-10-04
4  allen  action b  2017-10-05
7  allen  action c  2017-10-06
2    jon  action a  2017-10-04
5    jon  action b  2017-10-05
8    jon  action c  2017-10-06
0   ross  action a  2017-10-04
3   ross  action b  2017-10-05
6   ross  action c  2017-10-06
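The renaming step can also be folded into the melt call itself. The following is a minimal sketch, assuming the `current` frame from the question; `var_name` and `value_name` are standard `melt` parameters, while `long_df` is only an illustrative name. A stable sort is requested so that each person's actions stay in their original column order.

```python
import pandas as pd

# Same data as the `current` frame in the question, written with lists.
current = pd.DataFrame({
    'name': ['ross', 'allen', 'jon'],
    'action a': ['2017-10-04'] * 3,
    'action b': ['2017-10-05'] * 3,
    'action c': ['2017-10-06'] * 3,
})

long_df = (
    current
    # Name the two generated columns directly instead of renaming afterwards.
    .melt(id_vars='name', var_name='action', value_name='date')
    # mergesort is stable, so rows for one name keep the a, b, c order.
    .sort_values('name', kind='mergesort')
    .reset_index(drop=True)
)
print(long_df)
```

Dropping `reset_index(drop=True)` keeps the original melt index, as in the output above.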
Answer 1: (score: 1)
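# roles holds comma-separated strings, one per row
r = df.roles
# number of roles in each row = number of commas + 1
c = df.roles.str.count(',') + 1
i = df.index
# repeat each row once per role, then overwrite roles with the flattened list
df.loc[i.repeat(c)].assign(roles=','.join(r).split(','))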
  company  employer_id                roles
0       a            1             engineer
0       a            1       data_scientist
0       a            1            architect
1       b            2             engineer
1       b            2  front_end_developer
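The snippet in this answer does not show its input frame. The following is a self-contained sketch of the same repeat-and-assign approach, assuming a hypothetical input reconstructed from the output above (two rows with comma-separated `roles`). The name `out` and the `to_numpy()` call are additions for illustration, not part of the original answer.

```python
import pandas as pd

# Hypothetical input, inferred from the printed result; not given in the answer.
df = pd.DataFrame({
    'company': ['a', 'b'],
    'employer_id': [1, 2],
    'roles': ['engineer,data_scientist,architect',
              'engineer,front_end_developer'],
})

r = df.roles
# Roles per row: number of commas plus one.
c = r.str.count(',') + 1
i = df.index

# Repeat each index label once per role, then replace the roles column
# with the flat list obtained by joining and re-splitting all strings.
out = df.loc[i.repeat(c.to_numpy())].assign(roles=','.join(r).split(','))
print(out)
```

This relies on no role name containing a comma itself, since the join and split treat every comma as a separator.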