Creating a DataFrame from a list of tuples

Time: 2016-03-14 11:20:58

Tags: pandas tuples dataframe

I have seen several similar posts, but they did not really help me, hence this new post.

I want to create the following df from a list of tuples:

Values         Total  extra
label                      
Pictionary  0.000000     12
Chess       4.609929     12
Cluedo      8.421986     12

Here are all the components needed to achieve this:

columns = ['Total', 'extra']

tups = [(u'Pictionary', 0.0, 12),
        (u'Chess', 4.6099290780141837, 12),
        (u'Cluedo', 8.4219858156028362, 12)]

My failed attempt:

pd.DataFrame(tups, columns=columns)

Error message:

AssertionError: 2 columns passed, passed data had 3 columns

2 answers:

Answer 0: (score: 8)

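I think you need to add one more name to the columns list, because each tuple holds three values. Then you can try a list comprehension and, if you want the first column as the index, call set_index on it: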

import pandas as pd

columns = ['label', 'Total', 'extra']

tups = [(u'Pictionary', 0.0, 12),
        (u'Chess', 4.6099290780141837, 12),
        (u'Cluedo', 8.4219858156028362, 12)]

df = pd.DataFrame([x for x in tups], columns=columns)

print(df)
        label     Total  extra
0  Pictionary  0.000000     12
1       Chess  4.609929     12
2      Cluedo  8.421986     12

df = df.set_index('label')
# optional: give the columns axis a name
df.columns.name = 'Values'

print(df)
Values         Total  extra
label                      
Pictionary  0.000000     12
Chess       4.609929     12
Cluedo      8.421986     12

Or you can use Colonel Beauvel's solution:

import pandas as pd

columns = ['Total', 'extra']

tups = [(u'Pictionary', 0.0, 12),
        (u'Chess', 4.6099290780141837, 12),
        (u'Cluedo', 8.4219858156028362, 12)]

df = pd.DataFrame(tups, columns=['label']+columns)
print(df)
        label     Total  extra
0  Pictionary  0.000000     12
1       Chess  4.609929     12
2      Cluedo  8.421986     12

df = df.set_index('label')
df.columns.name = 'Values'
print(df)
Values         Total  extra
label                      
Pictionary  0.000000     12
Chess       4.609929     12
Cluedo      8.421986     12
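
As a variation on the list comprehension idea above, the labels can be split off from the values before the DataFrame is built, so set_index is not needed afterwards. This is only a sketch, not part of the original answer; the names labels and values are made up for illustration.

```python
import pandas as pd

columns = ['Total', 'extra']

tups = [('Pictionary', 0.0, 12),
        ('Chess', 4.6099290780141837, 12),
        ('Cluedo', 8.4219858156028362, 12)]

# Hypothetical helper names: the first element of each tuple is the row label,
# the remaining two elements are the values for 'Total' and 'extra'.
labels = [t[0] for t in tups]
values = [t[1:] for t in tups]

df = pd.DataFrame(values,
                  index=pd.Index(labels, name='label'),
                  columns=columns)
# optional: give the columns axis a name
df.columns.name = 'Values'

print(df)
```

Because each row now has two values, the original two-element columns list can be passed unchanged.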

Answer 1: (score: 2)

You can use pandas.DataFrame.from_records():

import pandas as pd

data = [(1,2,3),
        (4,5,6),
        (7,8,9)]

col_names = ['Col0', 'Col1', 'Col2']
row_names = ['Row0', 'Row1', 'Row2']

df = pd.DataFrame.from_records(data, columns=col_names, index=row_names)

print(df)

      Col0  Col1  Col2
Row0     1     2     3
Row1     4     5     6
Row2     7     8     9
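
Applied to the tuples from the question, the same call might look like the sketch below. It is not part of the original answer. It assumes that the first element of each tuple should become the index, and it passes the column name 'label' as index so that from_records uses that field as the row labels.

```python
import pandas as pd

tups = [('Pictionary', 0.0, 12),
        ('Chess', 4.6099290780141837, 12),
        ('Cluedo', 8.4219858156028362, 12)]

# Name all three fields, then tell from_records which field is the index.
df = pd.DataFrame.from_records(tups,
                               columns=['label', 'Total', 'extra'],
                               index='label')
# optional: give the columns axis a name, as in the desired output
df.columns.name = 'Values'

print(df)
```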