Creating a pandas DataFrame from a list with an unknown number of columns

Time: 2017-09-19 07:09:06

Tags: python pandas dataframe

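For illustration, I have limited my requirement to 5 columns and 3 rows. My column headers arrive as a single string, and each row arrives as a string as well. I want to add all of the rows to a DataFrame. This is what I tried: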

import pandas as pd

Column_Header = "Col1,Col2,Col3,Col4,Col5" # We have up to 500 columns
df = pd.DataFrame(columns=Column_Header.split(","))


# We will get up to 100000 rows from a server response
Row1 = "Val11,Val12,Val13,Val14,Val15"
Row2 = "Val21,Val22,Val23,Val124,Val25"
Row3 = "Val31,Val32,Val33,Val34,Val35"
df_temp = pd.DataFrame(data = Row1.split(",") , columns = Column_Header.split(","))
pd.concat(df,df_temp)
print(pd)

2 Answers:

Answer 0 (score: 2)

The best and fastest approach is to build a list of all the data with a list comprehension and call the DataFrame constructor only once:

import pandas as pd

Column_Header = "Col1,Col2,Col3,Col4,Col5"
Row1 = "Val11,Val12,Val13,Val14,Val15"
Row2 = "Val21,Val22,Val23,Val124,Val25"
Row3 = "Val31,Val32,Val33,Val34,Val35"

rows = [Row1,Row2,Row3]
L = [x.split(',') for x in rows]

print(L)
[['Val11', 'Val12', 'Val13', 'Val14', 'Val15'],
 ['Val21', 'Val22', 'Val23', 'Val124', 'Val25'],
 ['Val31', 'Val32', 'Val33', 'Val34', 'Val35']]


df = pd.DataFrame(data=L, columns=Column_Header.split(","))
print(df)
    Col1   Col2   Col3    Col4   Col5
0  Val11  Val12  Val13   Val14  Val15
1  Val21  Val22  Val23  Val124  Val25
2  Val31  Val32  Val33   Val34  Val35
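
The same idea can be wrapped in a small helper so it works for any number of columns and rows. This is only a minimal sketch: the function name rows_to_frame, its sep parameter, and the split into two batches are assumptions made for illustration, not part of the original answer. It also shows that pd.concat expects a list of frames, which is one of the things that goes wrong in the question's attempt.

```python
import pandas as pd


def rows_to_frame(header, rows, sep=","):
    """Build one DataFrame from a header string and an iterable of row strings.

    Hypothetical helper: the name and signature are illustrative only.
    """
    columns = header.split(sep)
    # Split every row once, then call the DataFrame constructor a single time.
    data = [row.split(sep) for row in rows]
    return pd.DataFrame(data=data, columns=columns)


header = "Col1,Col2,Col3,Col4,Col5"
rows = [
    "Val11,Val12,Val13,Val14,Val15",
    "Val21,Val22,Val23,Val124,Val25",
    "Val31,Val32,Val33,Val34,Val35",
]

df = rows_to_frame(header, rows)

# If the rows arrive in several batches (an assumption here), build one frame
# per batch and combine them once; pd.concat takes a list of frames.
batches = [rows[:2], rows[2:]]
frames = [rows_to_frame(header, batch) for batch in batches]
combined = pd.concat(frames, ignore_index=True)
```

Collecting the pieces in a list and concatenating once avoids copying the accumulated frame each time a row is added, which matters at the row counts mentioned in the question.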

Answer 1 (score: 1)

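If it is a viable option, it would be simpler to leave all of the work to pd.read_csv. Join all of your strings into a single multiline string and pass it to read_csv through a StringIO buffer: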

import io
import pandas as pd

# Column_Header, Row1, Row2 and Row3 are the strings defined in the question
data = '\n'.join([Column_Header, Row1, Row2, Row3])
df = pd.read_csv(io.StringIO(data))
df

    Col1   Col2   Col3    Col4   Col5
0  Val11  Val12  Val13   Val14  Val15
1  Val21  Val22  Val23  Val124  Val25
2  Val31  Val32  Val33   Val34  Val35
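
One difference from splitting the strings by hand is that read_csv infers column types, so numeric-looking values become numbers. A minimal sketch of this, using made-up sample values (the header, rows, and variable names below are assumptions, not data from the question), with dtype=str passed to keep every value as text:

```python
import io

import pandas as pd

# Illustrative sample data, chosen so that type inference is visible.
header = "Col1,Col2,Col3"
rows = ["1,2.5,abc", "4,5.5,def"]
data = "\n".join([header] + rows)

# Default behaviour: read_csv infers a type for each column.
inferred = pd.read_csv(io.StringIO(data))

# dtype=str keeps every value as a string, like the split-based approach.
as_text = pd.read_csv(io.StringIO(data), dtype=str)
```

Which of the two is preferable depends on whether the server response should be treated as typed data or as raw text.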

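If you are using Python 2.x, the equivalent buffer lives in the cStringIO module (io.StringIO there accepts only unicode text), so you have to import it as: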

import cStringIO as io
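
If the same script has to run on both Python 2 and Python 3, one common pattern is to try the Python 2 import first and fall back to io. This is a sketch under that assumption rather than part of the original answer, and the sample strings are illustrative:

```python
import pandas as pd

try:
    # Python 2: byte-string buffer
    from cStringIO import StringIO
except ImportError:
    # Python 3: cStringIO no longer exists, use io instead
    from io import StringIO

data = "\n".join(["Col1,Col2", "Val11,Val12", "Val21,Val22"])
df = pd.read_csv(StringIO(data))
```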