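I have an array of strings, where each element is one line of a CSV file (comma-separated). I want to convert it into a pandas DataFrame, but doing it line by line is slow. Is there a faster alternative, other than writelines() followed by pandas.read_csv()?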
Answer 0 (score: 0)
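When importing a CSV in pandas, you can read the whole file at once, without looping over its lines.
Call read_csv with the filename (or any file-like object) as its argument: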
from io import StringIO  # Python 3 replacement for cStringIO

import pandas as pd

# Set up fake csv data as test for example only
fake_csv = '''
Col_0,Col_1,Col_2,Col_3
0,0.5,A,123
1,0.2,J,234
2,1.4,F,345
3,0.7,E,456
4,0.4,G,576
5,0.8,T,678
6,1.6,A,789
'''
# Read in whole csv to DataFrame at once
# StringIO is for example only
# Normally you would load your file with
# df = pd.read_csv('/path/to/your/file.csv')
df = pd.read_csv(StringIO(fake_csv))
print('DataFrame from CSV:')
print(df)
DataFrame from CSV:
   Col_0  Col_1 Col_2  Col_3
0      0    0.5     A    123
1      1    0.2     J    234
2      2    1.4     F    345
3      3    0.7     E    456
4      4    0.4     G    576
5      5    0.8     T    678
6      6    1.6     A    789
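For the situation in the question, where the rows are already in memory as a list of strings, the same idea can probably be applied without writing a file first. The following is a minimal sketch, not a benchmarked solution: the `lines` list is made-up sample data, and it assumes the first element is the header row and that the elements do not already end with a newline.

```python
from io import StringIO

import pandas as pd

# Hypothetical input: each element is one comma-separated line of a CSV file,
# with the header as the first element (an assumption, not from the question)
lines = [
    "Col_0,Col_1,Col_2,Col_3",
    "0,0.5,A,123",
    "1,0.2,J,234",
    "2,1.4,F,345",
]

# Join the rows into a single in-memory text buffer and parse it in one call,
# so there is no temporary file and no Python-level loop over the rows
df = pd.read_csv(StringIO("\n".join(lines)))

print(df)
```

If the strings already carry their trailing newline, `"".join(lines)` would be the join to use instead. If there is no header row, passing `header=None` to read_csv keeps the first line as data.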