Stacking column levels when writing a pandas DataFrame to CSV

Time: 2013-09-05 00:43:27

Tags: python csv pandas

I have a data frame that is the result of a pivot table, with the following columns:

(best buy, count)                                       753  non-null values
(best buy, mean)                                        753  non-null values
(best buy, min)                                         753  non-null values
(best buy, max)                                         753  non-null values
(best buy, std)                                         750  non-null values
(amazon, count)                                         662  non-null values
(amazon, mean)                                          662  non-null values
(amazon, min)                                           662  non-null values
(amazon, max)                                           662  non-null values
(amazon, std)                                           661  non-null values

If I write it to a CSV file, I end up with something like this (truncated):

        (best buy, count) (best buy, mean) (best buy, max)
laptop         5                 10               12
tv             10                23               34

And so on.

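Is there a way to manipulate the data frame so that the resulting CSV looks like the following, with each column level on its own header row?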

             best buy          best buy         best buy
              count             mean             max
laptop         5                 10               12
tv             10                23               34

1 Answer:

Answer 0: (score: 2)

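You can pass tupleize_cols=False to DataFrame.to_csv(), which writes each level of the column MultiIndex as its own header row instead of writing each column label as a tuple: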

# Assumes: import numpy as np; from numpy.random import poisson; from pandas import DataFrame
In [60]: df = DataFrame(poisson(50, size=(10, 2)), columns=['laptop', 'tv'])

In [61]: df
Out[61]:
   laptop  tv
0      48  57
1      48  45
2      48  49
3      61  47
4      49  47
5      45  65
6      49  40
7      58  39
8      46  65
9      43  53

In [62]: df['store'] = np.random.choice(['best_buy', 'amazon'], len(df))

In [63]: df
Out[63]:
   laptop  tv     store
0      48  57  best_buy
1      48  45  best_buy
2      48  49  best_buy
3      61  47  best_buy
4      49  47    amazon
5      45  65    amazon
6      49  40    amazon
7      58  39  best_buy
8      46  65    amazon
9      43  53  best_buy

In [64]: res = df.groupby('store').agg(['mean', 'std', 'min', 'max']).T

In [65]: res
Out[65]:
store        amazon  best_buy
laptop mean  47.250    51.000
       std    2.062     6.928
       min   45.000    43.000
       max   49.000    61.000
tv     mean  54.250    48.333
       std   12.738     6.282
       min   40.000    39.000
       max   65.000    57.000

In [66]: u = res.unstack()

In [67]: u
Out[67]:
store   amazon                    best_buy
          mean     std  min  max      mean    std  min  max
laptop   47.25   2.062   45   49    51.000  6.928   43   61
tv       54.25  12.738   40   65    48.333  6.282   39   57

In [68]: u.to_csv('the_csv.csv', tupleize_cols=False, sep='\t')

In [69]: cat the_csv.csv
store   amazon  amazon  amazon  amazon  best_buy        best_buy        best_buy        best_buy
        mean    std     min     max     mean    std     min     max

laptop  47.25   2.0615528128088303      45.0    49.0    51.0    6.928203230275509       43.0    61.0
tv      54.25   12.737739202856996      40.0    65.0    48.333333333333336      6.282250127674532       39.0    57.0
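
For readers on a newer pandas, here is a minimal sketch of the same idea. It assumes pandas 1.0 or later, where, as far as I know, the tupleize_cols argument is no longer accepted and to_csv() writes one header row per column level by default. The seeded sample data and the fixed store assignment are hypothetical stand-ins for the random data in the session above, and the CSV is written to an in-memory buffer rather than to the_csv.csv.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample data, loosely mirroring the session above.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.poisson(50, size=(10, 2)), columns=["laptop", "tv"])
df["store"] = ["best_buy", "amazon"] * 5

# Per-store summary statistics, reshaped so that the columns form a
# two-level MultiIndex of (store, statistic).
u = df.groupby("store").agg(["mean", "std", "min", "max"]).T.unstack()

# No tupleize_cols here: each column level gets its own header row.
buf = io.StringIO()
u.to_csv(buf, sep="\t")
csv_text = buf.getvalue()
print(csv_text)

# Reading the file back: header=[0, 1] says the first two rows are
# column levels, and index_col=0 restores the row labels.
buf.seek(0)
roundtrip = pd.read_csv(buf, sep="\t", header=[0, 1], index_col=0)
print(roundtrip.columns.nlevels)
```

Passing header=[0, 1] when reading is what keeps the two header rows from being parsed as data, so the frame read back has the same two-level columns as the one that was written.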