How can I use pandas to transform this DataFrame, named df1:
index,client1,client2
name,bob,erika
email,gmail,yahoo
house_A,Paris,London
house_B,London,Milan
house_C,Berlin,Paris
code_name_A,Vaugirard,Windsor
code_name_B,Great,Brera
code_name_C,Mauer,Elysee
visa_id_num_A,FR001B,UK001E
visa_id_num_B,UK001B,IT001E
visa_id_num_C,GE001B,FR001E
food_A,apples,burgers
food_B,bananas,fries
food_C,burgers,pizzas
food_D,fries,oranges
food_E,pizzas,pears
into this DataFrame, named df2:
index,FR001B,UK001B,GE001B,UK001E,IT001E,FR001E
client_number,client1,client1,client1,client2,client2,client2
name,bob,bob,bob,erika,erika,erika
email,gmail,gmail,gmail,yahoo,yahoo,yahoo
house,Paris,London,Berlin,London,Milan,Paris
code_name,Vaugirard,Great,Mauer,Windsor,Brera,Elysee
visa_id_num,FR001B,UK001B,GE001B,UK001E,IT001E,FR001E
food_A,apples,apples,apples,burgers,burgers,burgers
food_B,bananas,bananas,bananas,fries,fries,fries
food_C,burgers,burgers,burgers,pizzas,pizzas,pizzas
food_D,fries,fries,fries,oranges,oranges,oranges
food_E,pizzas,pizzas,pizzas,pears,pears,pears
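I need to split the index values and replace specific values with new ones. I tried using stack, unstack and groupby, but it got messy.
Thanks very much in advance.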
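For anyone who wants to reproduce the problem, here is a minimal sketch of loading df1 from the comma-separated text above. It assumes the first column, `index`, holds the row labels; the variable name `csv_text` is only illustrative.

```python
import io

import pandas as pd

# The df1 data from the question, pasted as CSV text (csv_text is a hypothetical name).
csv_text = """index,client1,client2
name,bob,erika
email,gmail,yahoo
house_A,Paris,London
house_B,London,Milan
house_C,Berlin,Paris
code_name_A,Vaugirard,Windsor
code_name_B,Great,Brera
code_name_C,Mauer,Elysee
visa_id_num_A,FR001B,UK001E
visa_id_num_B,UK001B,IT001E
visa_id_num_C,GE001B,FR001E
food_A,apples,burgers
food_B,bananas,fries
food_C,burgers,pizzas
food_D,fries,oranges
food_E,pizzas,pears
"""

# Assumption: the 'index' column is used as the row index of df1.
df1 = pd.read_csv(io.StringIO(csv_text), index_col='index')
print(df1.shape)
```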
Answer 0 (score: 2)
Let's try using T, pd.wide_to_long to handle the multiple "melts", and set_index:
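import pandas as pd

# Transpose so each client is a row, and turn the old index into a column
df1T = df1.T.reset_index().rename(columns={'index': 'client_number'})
# Melt the three lettered column groups at once; the letter suffix goes into 'code'
df1w = pd.wide_to_long(df1T,
                       stubnames=['house', 'code_name', 'visa_id_num'],
                       i=['client_number', 'name', 'email',
                          'food_A', 'food_B',
                          'food_C', 'food_D', 'food_E'],
                       j='code', sep='_', suffix=r'\w+')
# Use the visa id as the column label, then transpose back
df2 = df1w.reset_index().set_index('visa_id_num').T
print(df2)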
Output:
visa_id_num       FR001B   UK001B   GE001B   UK001E   IT001E   FR001E
client_number    client1  client1  client1  client2  client2  client2
name                 bob      bob      bob    erika    erika    erika
email              gmail    gmail    gmail    yahoo    yahoo    yahoo
food_A            apples   apples   apples  burgers  burgers  burgers
food_B           bananas  bananas  bananas    fries    fries    fries
food_C           burgers  burgers  burgers   pizzas   pizzas   pizzas
food_D             fries    fries    fries  oranges  oranges  oranges
food_E            pizzas   pizzas   pizzas    pears    pears    pears
code                   A        B        C        A        B        C
house              Paris   London   Berlin   London    Milan    Paris
code_name      Vaugirard    Great    Mauer  Windsor    Brera   Elysee
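The result above carries an extra `code` row and has no `visa_id_num` row, so it is close to the requested df2 but not identical. The following is one possible sketch for matching the requested layout; it rebuilds a trimmed df1 inline so that it runs on its own. The names `csv_text`, `id_cols`, `long_df` and `row_order` are illustrative, and the explicit sort is there because the row order returned by `pd.wide_to_long` may differ between pandas versions.

```python
import io

import pandas as pd

# df1 from the question, rebuilt from CSV text so the sketch is self-contained.
csv_text = """index,client1,client2
name,bob,erika
email,gmail,yahoo
house_A,Paris,London
house_B,London,Milan
house_C,Berlin,Paris
code_name_A,Vaugirard,Windsor
code_name_B,Great,Brera
code_name_C,Mauer,Elysee
visa_id_num_A,FR001B,UK001E
visa_id_num_B,UK001B,IT001E
visa_id_num_C,GE001B,FR001E
food_A,apples,burgers
food_B,bananas,fries
food_C,burgers,pizzas
food_D,fries,oranges
food_E,pizzas,pears
"""
df1 = pd.read_csv(io.StringIO(csv_text), index_col='index')

# One row per client; the old column labels become 'client_number'.
df1T = df1.T.reset_index().rename(columns={'index': 'client_number'})

# Columns that stay fixed per client (everything that is not melted).
id_cols = ['client_number', 'name', 'email'] + [
    c for c in df1T.columns if c.startswith('food_')
]

long_df = pd.wide_to_long(
    df1T,
    stubnames=['house', 'code_name', 'visa_id_num'],
    i=id_cols, j='code', sep='_', suffix=r'\w+',
).reset_index()

# Sort explicitly so the columns come out client by client, A to C.
long_df = long_df.sort_values(['client_number', 'code'])

# Row order requested in the question; 'code' is left out on purpose.
row_order = ['client_number', 'name', 'email', 'house', 'code_name',
             'visa_id_num', 'food_A', 'food_B', 'food_C', 'food_D', 'food_E']

# drop=False keeps visa_id_num as a row as well as the column labels.
df2 = long_df.set_index('visa_id_num', drop=False)[row_order].T
df2 = df2.rename_axis(index=None, columns=None)
print(df2)
```

Keeping `visa_id_num` with `drop=False` is a design choice: the visa id then serves as both the column header and a data row, as in the requested df2.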