Pandas: converting repeated columns to rows

Time: 2020-10-11 12:52:57

Tags: python pandas dataframe

I have a DataFrame like the one below, with repeated column names. ID is loaded as the index:

          JANUARY         FEBRUARY        MARCH 
  ID    Sales   Revenue Sales   Revenue Sales   Revenue
  03    10.00   5.00    0.00    0.00    10.00   19.00
  05    20.00   20.00   20.00   20.00   20.00   20.00
  06    30.00   30.00   30.00   30.00   30.00   30.00
  07    30.00   30.00   30.00   30.00   30.00   30.00

I want to display it like this:

  ID    Sales   Revenue
  03    10.00   5.00
  05    20.00   20.00
  06    30.00   30.00
  07    30.00   30.00
  03    0.00    0.00
  05    20.00   20.00
  06    30.00   30.00
  07    30.00   30.00
  03    10.00   19.00
  05    20.00   20.00
  06    30.00   30.00
  07    30.00   30.00

This is what I am currently using, but I expect there is a better way. I tried melt, but that only works for a single column:

import pandas as pd

# Assumes ID is the first column, followed by one (Sales, Revenue) pair per month
frames = []
for i in range(1, df.shape[1], 2):  # load each month's data
    sub_df = df.iloc[:, [0, i, i + 1]].copy()  # select by position, since the names repeat
    sub_df.columns = ['ID', 'Sales', 'Revenue']  # same order as the source columns
    frames.append(sub_df)
final_df = pd.concat(frames, ignore_index=True)  # DataFrame.append was removed in pandas 2.0

2 answers:

Answer 0 (score: 0)

Here is another way to stack the columns. I don't know whether it is more efficient, but it needs less code.

#        JANUARY         FEBRUARY        MARCH 
#  ID    Sales   Revenue Sales   Revenue Sales   Revenue
#  03    10.00   5.00    0.00    0.00    10.00   19.00
#  05    20.00   20.00   20.00   20.00   20.00   20.00
#  06    30.00   30.00   30.00   30.00   30.00   30.00
#  07    30.00   30.00   30.00   30.00   30.00   30.00

import pandas as pd
dd = {
'ID':['03','05','06','07'],
'Sales1':[10,20,30,30],
'Rev1':[5,20,30,30],
'Sales2':[0,20,30,30],
'Rev2':[0,20,30,30],
'Sales3':[10,20,30,30],
'Rev3':[19,20,30,30]
}

df = pd.DataFrame(dd)
print(df.to_string(index=False),'\n') # source dataframe

####################

frames = []  # one renamed frame per month
for c in range(1, len(df.columns), 2):
   dftmp = df[['ID', df.columns[c], df.columns[c+1]]]  # create df for each month
   dftmp.columns = ['ID', 'Sales', 'Revenue']  # must rename columns so the frames line up
   frames.append(dftmp)
dfnew = pd.concat(frames)  # stacked df; DataFrame.append was removed in pandas 2.0

print(dfnew.to_string(index=False))

Output

 ID  Sales1  Rev1  Sales2  Rev2  Sales3  Rev3
 03      10     5       0     0      10    19
 05      20    20      20    20      20    20
 06      30    30      30    30      30    30
 07      30    30      30    30      30    30

 ID Sales Revenue
 03    10       5
 05    20      20
 06    30      30
 07    30      30
 03     0       0
 05    20      20
 06    30      30
 07    30      30
 03    10      19
 05    20      20
 06    30      30
 07    30      30
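
If the month names are the top level of a two-level column header, as the question's table suggests, the same stacking can be sketched without a positional loop. This is only a sketch: the frame below is a hypothetical reconstruction of the question's data, with ID as the index and a (month, measure) MultiIndex on the columns.

```python
import pandas as pd

# Hypothetical reconstruction of the question's frame (an assumption):
# ID is the index and the columns are a (month, measure) MultiIndex.
columns = pd.MultiIndex.from_product(
    [['JANUARY', 'FEBRUARY', 'MARCH'], ['Sales', 'Revenue']]
)
data = [
    [10, 5, 0, 0, 10, 19],
    [20, 20, 20, 20, 20, 20],
    [30, 30, 30, 30, 30, 30],
    [30, 30, 30, 30, 30, 30],
]
df = pd.DataFrame(
    data,
    index=pd.Index(['03', '05', '06', '07'], name='ID'),
    columns=columns,
)

# Select each month's block by its top-level label, then stack the blocks.
months = df.columns.get_level_values(0).unique()
stacked = pd.concat([df[month] for month in months]).reset_index()

print(stacked.to_string(index=False))
```

Selecting `df[month]` drops the month level, so every block already has the columns Sales and Revenue and no renaming is needed before `pd.concat`.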

Answer 1 (score: 0)

pandas lreshape did the trick for me.
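
The answer's own code is not shown here, so the following is only a minimal sketch of how `pd.lreshape` could be applied. It assumes the uniquely named columns from Answer 0 (`Sales1`, `Rev1`, and so on), because `lreshape` refers to columns by name.

```python
import pandas as pd

# Sample frame borrowed from Answer 0 (column names are an assumption)
df = pd.DataFrame({
    'ID': ['03', '05', '06', '07'],
    'Sales1': [10, 20, 30, 30],
    'Rev1': [5, 20, 30, 30],
    'Sales2': [0, 20, 30, 30],
    'Rev2': [0, 20, 30, 30],
    'Sales3': [10, 20, 30, 30],
    'Rev3': [19, 20, 30, 30],
})

# Map each output column to the wide columns that feed it, in month order.
groups = {
    'Sales': ['Sales1', 'Sales2', 'Sales3'],
    'Revenue': ['Rev1', 'Rev2', 'Rev3'],
}

# Columns not named in `groups` (here ID) are repeated once per month.
long_df = pd.lreshape(df, groups)

print(long_df.to_string(index=False))
```

Note that `lreshape` drops rows containing missing values by default (`dropna=True`); pass `dropna=False` to keep them.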
