Create m n-column DataFrames from n m-column DataFrames

Time: 2019-08-03 13:55:18

Tags: python pandas

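What is the cleanest way to turn a list of n DataFrames with m columns each into a list of m DataFrames with n columns each? Specifically, I want the first n-column DataFrame to contain the first column of every m-column DataFrame, the second n-column DataFrame to contain all the second columns, and so on. I also want to assign new names to the columns.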

import numpy as np
import pandas as pd

m, n = 3, 2
# pd.np was removed in pandas 2.0, so call NumPy directly
dfs = [
    pd.DataFrame(np.random.randint(1, 10, (4, m)), columns=["a", "b", "c"])
    for _ in range(n)
]
# dfi1
   a  b  c
0  2  7  9
1  9  4  3
2  1  6  1
3  7  7  2
# dfi2
   a  b  c
0  5  6  2
1  8  7  1
2  2  8  5
3  9  6  1

Goal:

# dfo1
  foo bar
0  2  5
1  9  8
2  1  2
3  7  9
# dfo2
  foo bar
0  7  6
1  4  7
2  6  8
3  7  6
# dfo3
  foo bar
0  9  2
1  3  1
2  1  5
3  2  1

Is there a better way than two verbose nested loops?

1 answer:

Answer 0 (score: 1)

Short answer

df1 = pd.concat(dfs, keys=('foo','bar')).unstack(0)
dfs1 = [df1.xs(i, axis=1, level=0) for i in df1.columns.levels[0]]
# or (note: groupby(axis=1) is deprecated in recent pandas versions)
dfs1 = [df.droplevel(0, axis=1) for i, df in df1.groupby(axis=1, level=0)]

Step by step

import numpy as np
import pandas as pd

np.random.seed(2019)
m, n = 3, 2
dfs = [
    pd.DataFrame(np.random.randint(1, 10, (4, m)), columns=["a", "b", "c"])
    for _ in range(n)
]
print (dfs)
[   a  b  c
0  9  3  6
1  9  7  9
2  1  1  8
3  9  6  4,    a  b  c
0  1  3  6
1  8  9  6
2  5  1  2
3  7  1  3]
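The concat/unstack line does two reshapes at once. A minimal sketch with small hand-written frames (the values are made up for illustration, and the names `stacked` and `wide` are not from the answer) makes the intermediate shapes visible:

```python
import pandas as pd

# Two small 3-column frames with hypothetical values, easy to follow by eye.
dfs = [
    pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]}),
    pd.DataFrame({"a": [7, 8], "b": [9, 10], "c": [11, 12]}),
]

# Step 1: stack the frames vertically; the keys become the outer row-index level.
stacked = pd.concat(dfs, keys=("foo", "bar"))
print(stacked.index.tolist())
# [('foo', 0), ('foo', 1), ('bar', 0), ('bar', 1)]

# Step 2: move that outer row level into the columns, beneath each original column.
wide = stacked.unstack(0)
print(wide.shape)
# (2, 6)

# Each (original column, key) pair is now one column of the wide frame.
print(wide[("a", "bar")].tolist())
# [7, 8]
```

After step 2 the columns form a two-level MultiIndex, which is why selecting on the first level yields one small frame per original column.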

First use concat with the keys parameter, passing a tuple of new names whose length equals n (the number of input DataFrames), then reshape with DataFrame.unstack to get a MultiIndex in the columns:

df1 = pd.concat(dfs, keys=('foo','bar')).unstack(0)
print (df1)
    a       b       c
  foo bar foo bar foo bar
0   9   1   3   3   6   6
1   9   8   7   9   9   6
2   1   5   1   1   8   2
3   9   7   6   1   4   3

Then create a list of DataFrames:

dfs1 = [df1.xs(i, axis=1, level=0) for i in df1.columns.levels[0]]
print (dfs1)
[   foo  bar
0    9    1
1    9    8
2    1    5
3    9    7,    foo  bar
0    3    3
1    7    9
2    1    1
3    6    1,    foo  bar
0    6    6
1    9    6
2    8    2
3    4    3]

Or:

# note: groupby(axis=1) is deprecated in recent pandas versions
dfs1 = [df.droplevel(0, axis=1) for i, df in df1.groupby(axis=1, level=0)]
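As a cross-check, the same result can be built column by column with `pd.concat(..., axis=1, keys=...)`. This is an alternative sketch, not the answer author's method: it is essentially the nested iteration written as a comprehension. The helper name `regroup_by_column` and its `names` parameter are hypothetical, and the sketch assumes every input frame shares the same columns and index.

```python
import numpy as np
import pandas as pd


def regroup_by_column(dfs, names):
    """Return one DataFrame per original column, with one column per input frame.

    Hypothetical helper for illustration; not part of pandas.
    """
    if len(dfs) != len(names):
        raise ValueError("need exactly one new column name per input DataFrame")
    # Assumes all frames share the columns and index of the first frame.
    return [
        pd.concat([df[col] for df in dfs], axis=1, keys=names)
        for col in dfs[0].columns
    ]


rng = np.random.default_rng(0)  # arbitrary seed, only for the demo
dfs = [
    pd.DataFrame(rng.integers(1, 10, (4, 3)), columns=["a", "b", "c"])
    for _ in range(2)
]
out = regroup_by_column(dfs, ["foo", "bar"])

# Compare with the concat/unstack approach, selecting columns by name
# so the check does not depend on column order.
df1 = pd.concat(dfs, keys=("foo", "bar")).unstack(0)
for col, frame in zip(dfs[0].columns, out):
    expected = df1[col][["foo", "bar"]]
    assert np.array_equal(frame.to_numpy(), expected.to_numpy())
```

The comprehension avoids building the intermediate MultiIndex, while the concat/unstack version keeps everything in one wide frame that can be sliced later.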