I have a DataFrame that is the result of a pivot table, with the following columns:
(best buy, count) 753 non-null values
(best buy, mean) 753 non-null values
(best buy, min) 753 non-null values
(best buy, max) 753 non-null values
(best buy, std) 750 non-null values
(amazon, count) 662 non-null values
(amazon, mean) 662 non-null values
(amazon, min) 662 non-null values
(amazon, max) 662 non-null values
(amazon, std) 661 non-null values
If I write it to a CSV file, I end up with something like this (truncated):
        (best buy, count)  (best buy, mean)  (best buy, max)
laptop                  5                10               12
tv                     10                23               34
And so on.
Is there a way to manipulate the DataFrame so that the resulting CSV looks like this instead?
        best buy  best buy  best buy
        count     mean      max
laptop  5         10        12
tv      10        23        34
Answer 0 (score: 2)
You can pass tupleize_cols=False to DataFrame.to_csv():
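In [57]: import numpy as np
In [58]: from numpy.random import poisson
In [59]: from pandas import DataFrame
In [60]: df = DataFrame(poisson(50, size=(10, 2)), columns=['laptop', 'tv'])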
In [61]: df
Out[61]:
   laptop  tv
0      48  57
1      48  45
2      48  49
3      61  47
4      49  47
5      45  65
6      49  40
7      58  39
8      46  65
9      43  53
In [62]: df['store'] = np.random.choice(['best_buy', 'amazon'], len(df))
In [63]: df
Out[63]:
   laptop  tv     store
0      48  57  best_buy
1      48  45  best_buy
2      48  49  best_buy
3      61  47  best_buy
4      49  47    amazon
5      45  65    amazon
6      49  40    amazon
7      58  39  best_buy
8      46  65    amazon
9      43  53  best_buy
In [64]: res = df.groupby('store').agg(['mean', 'std', 'min', 'max']).T
In [65]: res
Out[65]:
store        amazon  best_buy
laptop mean  47.250    51.000
       std    2.062     6.928
       min   45.000    43.000
       max   49.000    61.000
tv     mean  54.250    48.333
       std   12.738     6.282
       min   40.000    39.000
       max   65.000    57.000
In [66]: u = res.unstack()
In [67]: u
Out[67]:
store   amazon                 best_buy
          mean     std min max     mean    std min max
laptop   47.25   2.062  45  49   51.000  6.928  43  61
tv       54.25  12.738  40  65   48.333  6.282  39  57
In [68]: u.to_csv('the_csv.csv', tupleize_cols=False, sep='\t')
In [69]: cat the_csv.csv
store amazon amazon amazon amazon best_buy best_buy best_buy best_buy
mean std min max mean std min max
laptop 47.25 2.0615528128088303 45.0 49.0 51.0 6.928203230275509 43.0 61.0
tv 54.25 12.737739202856996 40.0 65.0 48.333333333333336 6.282250127674532 39.0 57.0
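A note for readers on newer pandas: the tupleize_cols argument was deprecated in pandas 0.21 and removed in 1.0, so the call above will raise an error there. In those versions, to_csv() should write one header row per column level by default when the columns are a MultiIndex. The sketch below assumes a recent pandas and uses made-up numbers shaped like the question's data (the values and the in-memory buffer are illustrative, not from the original post). It also reads the CSV back with header=[0, 1] to show that the two header rows can be recovered as a MultiIndex.

```python
import io

import pandas as pd

# Hypothetical data laid out like the question's pivot table:
# top level = store, second level = statistic.
columns = pd.MultiIndex.from_product(
    [["best buy", "amazon"], ["count", "mean", "max"]]
)
df = pd.DataFrame(
    [[5, 10, 12, 4, 11, 13], [10, 23, 34, 8, 25, 30]],
    index=["laptop", "tv"],
    columns=columns,
)

# Write to an in-memory buffer instead of a file so the result is easy to inspect.
# No tupleize_cols argument is needed (or accepted) in pandas >= 1.0.
buf = io.StringIO()
df.to_csv(buf)
csv_text = buf.getvalue()
print(csv_text)

# Read it back, telling pandas that the first two rows are column levels
# and the first column is the row index.
round_trip = pd.read_csv(io.StringIO(csv_text), header=[0, 1], index_col=0)
print(round_trip)
```

If the columns are a flat Index of tuples rather than a MultiIndex, converting them first with pd.MultiIndex.from_tuples(df.columns) should give the same layout.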