Can a pandas.DataFrame have a list-type column?

Asked: 2015-12-10 09:01:22

Tags: python pandas

Is it possible to create a pandas.DataFrame that contains a field of list type?

For example, I would like to load the following CSV into a pandas.DataFrame:

id,scores
1,"[1,2,3,4]"
2,"[1,2]"
3,"[0,2,4]"

2 answers:

Answer 0 (score: 4)

Strip the double quotes:

id,scores
1, [1,2,3,4]
2, [1,2]
3, [0,2,4]

You should then be able to do this:

import pandas

query = [[1, [1,2,3,4]], [2, [1,2]], [3, [0,2,4]]]
df = pandas.DataFrame(query, columns=['id', 'scores'])
print(df)
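
As a quick check that a column built this way really holds list objects, here is a minimal sketch that constructs the same frame and inspects its cells. The names `rows` and `lengths` are illustrative only and do not come from the answer.

```python
import pandas as pd

# Same data as above: each row carries an id and a Python list.
rows = [[1, [1, 2, 3, 4]], [2, [1, 2]], [3, [0, 2, 4]]]
df = pd.DataFrame(rows, columns=['id', 'scores'])

# The column dtype is the generic 'object'; each cell is a real list.
print(df['scores'].dtype)
print(type(df['scores'].iloc[0]))

# List operations can be applied element-wise, e.g. the length of each list.
lengths = df['scores'].map(len).tolist()
print(lengths)
```

pandas has no dedicated list dtype, so a list column is reported as object even though every cell is a list.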

Answer 1 (score: 1)

You can use:

import pandas as pd
import io

temp=u'''id,scores  
1,"[1,2,3,4]"
2,"[1,2]"
3,"[0,2,4]"'''

df = pd.read_csv(io.StringIO(temp), sep=',', index_col=[0] )
print(df)
     scores  
id           
1   [1,2,3,4]
2       [1,2]
3     [0,2,4]

However, the dtype of the scores column is object, and its values are strings such as "[1,2,3,4]", not lists.
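
A minimal sketch that confirms the cells are plain strings. It assumes a header line without trailing whitespace, and the names `csv_text` and `first` are illustrative:

```python
import io
import pandas as pd

csv_text = '''id,scores
1,"[1,2,3,4]"
2,"[1,2]"
3,"[0,2,4]"'''

df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Take the first cell of the scores column and inspect it.
first = df['scores'].iloc[0]
print(type(first))  # the cell is a str, not a list
print(first[0])     # so indexing returns a character, not a number
```

Because the cell is text, a test such as `2 in first` would raise a TypeError instead of checking list membership.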

One approach is to use literal_eval from ast together with the converters argument of read_csv:

import pandas as pd
import io
from ast import literal_eval

temp=u'''id,scores
1,"[1,2,3,4]"
2,"[1,2]"
3,"[0,2,4]"'''

def converter(x):
    # parse the string "[1,2,3,4]" into a Python list
    return literal_eval(x)

# map each column name to its converter function
converters={'scores': converter}

df = pd.read_csv(io.StringIO(temp), sep=',', converters=converters)
print(df)
   id        scores
0   1  [1, 2, 3, 4]
1   2        [1, 2]
2   3     [0, 2, 4]

# check list membership:
print(2 in df.scores[2])
# True

print(1 in df.scores[2])
# False
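
If the frame has already been loaded with string cells, a similar conversion can be applied afterwards. This is a sketch of an alternative, not part of the answer above. It assumes every cell is a valid Python literal, since literal_eval raises an error otherwise. The names `csv_text` and `sums` are illustrative.

```python
import io
from ast import literal_eval

import pandas as pd

csv_text = '''id,scores
1,"[1,2,3,4]"
2,"[1,2]"
3,"[0,2,4]"'''

# Load first: the scores cells are strings at this point.
df = pd.read_csv(io.StringIO(csv_text))

# Convert each string cell into a list after loading.
df['scores'] = df['scores'].apply(literal_eval)

# The cells now behave like lists, e.g. they can be summed per row.
sums = df['scores'].map(sum).tolist()
print(df['scores'].tolist())
print(sums)
```

Converting after loading can be convenient when the frame comes from code you do not control, while the converters argument keeps the parsing in a single read_csv call.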