Fast conversion from an array of strings to a pandas DataFrame

Date: 2016-03-23 10:49:42

Tags: pandas

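I have an array of strings in which each element is one line of a CSV file (comma-separated). I want to convert it into a pandas DataFrame, but doing it line by line is slow. Is there a faster alternative to writelines() followed by pandas.read_csv()?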

1 answer:

Answer 0: (score: 0)

CSV import

In pandas you can read an entire CSV at once, without looping over its lines.

Call read_csv with the filename as its argument:

import pandas as pd
from io import StringIO

# Set up fake csv data as test for example only
fake_csv = '''
Col_0,Col_1,Col_2,Col_3
0,0.5,A,123
1,0.2,J,234
2,1.4,F,345
3,0.7,E,456
4,0.4,G,576
5,0.8,T,678
6,1.6,A,789
'''

# Read in whole csv to DataFrame at once
# StringIO is for example only
# Normally you would load your file with
# df = pd.read_csv('/path/to/your/file.csv')
# The leading blank line in fake_csv is skipped by read_csv by default
df = pd.read_csv(StringIO(fake_csv))

print('DataFrame from CSV:')
print(df)
DataFrame from CSV:
   Col_0  Col_1 Col_2  Col_3
0      0    0.5     A    123
1      1    0.2     J    234
2      2    1.4     F    345
3      3    0.7     E    456
4      4    0.4     G    576
5      5    0.8     T    678
6      6    1.6     A    789
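
For the case in the question, where the rows are already in memory as a list of strings, the same idea can probably be applied without writing a temporary file: join the rows into one string and hand read_csv an in-memory buffer. The following is only a minimal sketch under assumptions that are not in the original post: the list name `lines`, the helper `lines_to_dataframe`, and the sample rows are hypothetical, the first element is assumed to be the header row, and the rows are assumed not to end with a newline already.

```python
import io

import pandas as pd

# Hypothetical input: each element is one comma-separated CSV row,
# with the header row first (an assumption, not stated in the question).
lines = [
    "Col_0,Col_1,Col_2,Col_3",
    "0,0.5,A,123",
    "1,0.2,J,234",
    "2,1.4,F,345",
]


def lines_to_dataframe(rows):
    """Parse a list of CSV row strings with a single read_csv call."""
    # Join the rows into one CSV document and wrap it in a text buffer,
    # so pandas parses everything in one pass instead of row by row.
    buffer = io.StringIO("\n".join(rows))
    return pd.read_csv(buffer)


df = lines_to_dataframe(lines)
print(df.dtypes)
```

If the strings already end with a newline (for example, because they came from readlines()), joining with "".join(rows) instead should avoid blank lines between rows. Splitting each row on commas and passing the pieces to pd.DataFrame is another option, but every column then stays a string until it is converted explicitly, whereas read_csv infers numeric types.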