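I have a pandas.DataFrame with 2 columns. The first column is a simple integer, the second is a numpy.array of length 50. I want to write these two columns to a CSV file, but when I use .to_csv() and open the file in Excel, only a subset of the values is shown and accessible. The length varies, and when I open it in Excel I seem to get a column of strings of (more or less) the same length.
Is pandas.to_csv() writing some kind of visual representation to the file, rather than the actual data in the DataFrame?
How do I write this to CSV properly so that I can use it in Excel?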
Answer 0 (score: 1)
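It seems that the 50 values of the second column end up in a single Excel cell for each row.
I think you can use apply with pd.Series to expand each numpy array into separate columns, then concat those new columns with the first column (the Series a). Finally, write the result with to_csv: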
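A small check of that diagnosis, as a sketch: the frame below mirrors the layout described in the question (the two-row data is made up for illustration), and the standard csv module is used to count how many fields each record has after to_csv.

```python
import csv
import io

import numpy as np
import pandas as pd

# Illustrative data: one integer column, one column holding NumPy arrays.
df = pd.DataFrame({'a': [0, 1], 'b': [np.arange(50), np.arange(50)]})

# With no path given, to_csv returns the CSV text as a string.
csv_text = df.to_csv(index=False)

# Parse the text back to see how many fields each record really has.
rows = list(csv.reader(io.StringIO(csv_text)))
header, first = rows[0], rows[1]

print(header)               # column names
print(len(first))           # number of fields in the first data record
print(repr(first[1][:30]))  # start of the text stored for the array
```

Each record has only two fields, and the second holds the array's text form (space-separated and wrapped over several lines) rather than 50 comma-separated values. That is consistent with Excel placing everything in one cell.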
import pandas as pd
import numpy as np
df = pd.DataFrame({'a': [0,1,5], 'b': [np.arange(50), np.arange(50), np.arange(50)]} )
print(df)
a b
0 0 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...
1 1 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...
2 5 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...
print(df.b.apply(pd.Series))
0 1 2 3 4 5 6 7 8 9 ... 40 41 42 43 44 45 46 47 \
0 0 1 2 3 4 5 6 7 8 9 ... 40 41 42 43 44 45 46 47
1 0 1 2 3 4 5 6 7 8 9 ... 40 41 42 43 44 45 46 47
2 0 1 2 3 4 5 6 7 8 9 ... 40 41 42 43 44 45 46 47
48 49
0 48 49
1 48 49
2 48 49
df = pd.concat([df['a'], df.b.apply(pd.Series)], axis=1)
print(df)
a 0 1 2 3 4 5 6 7 8 ... 40 41 42 43 44 45 46 47 48 49
0 0 0 1 2 3 4 5 6 7 8 ... 40 41 42 43 44 45 46 47 48 49
1 1 0 1 2 3 4 5 6 7 8 ... 40 41 42 43 44 45 46 47 48 49
2 5 0 1 2 3 4 5 6 7 8 ... 40 41 42 43 44 45 46 47 48 49
[3 rows x 51 columns]
# for testing: print the CSV text
print(df.to_csv())
# to write to a file instead:
# df.to_csv('filename')
,a,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49
0,0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49
1,1,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49
2,5,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49
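If the arrays do not all have the same length (an assumption here, since the example above uses a fixed length of 50), one alternative sketch is to build the wide frame from plain Python lists; pandas then pads the shorter rows with NaN, which to_csv writes as empty fields. The sample data and the names expanded and wide are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the arrays in column 'b' have different lengths.
df = pd.DataFrame({'a': [0, 1, 5],
                   'b': [np.arange(3), np.arange(5), np.arange(2)]})

# Turn each array into a plain list, then let the DataFrame constructor
# spread the values over columns; missing positions become NaN.
expanded = pd.DataFrame([np.asarray(v).tolist() for v in df['b']],
                        index=df.index)

# Put the integer column back in front of the expanded values.
wide = pd.concat([df['a'], expanded], axis=1)

csv_text = wide.to_csv(index=False)
print(csv_text)
```

Building the frame from lists in one call avoids creating one Series per row, which apply(pd.Series) does, so it may also be faster on large frames.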
Edit:
If you need to write the DataFrame to an Excel file, use to_excel:
#write to excel, omit index of DataFrame
df.to_excel('test.xlsx', index=False)