How do I convert a list into a DataFrame with multiple columns?

Time: 2019-08-12 21:09:19

Tags: python pandas list dataframe

I have a list, result1, that looks like this:

[[("tt0241527-Harry Potter and the Philosopher's Stone", 1.0),
  ('tt0330373-Harry Potter and the Goblet of Fire', 0.9699),
  ('tt1843230-Once Upon a Time', 0.9384),
  ('tt0485601-The Secret of Kells', 0.9347)]]

I want to convert it into a three-column DataFrame. I tried:

pd.DataFrame(result1)

but that does not give me what I want.

Expected result:

Number     Title                                     Value
tt0241527  Harry Potter and the Philosopher's Stone  1.0
tt0330373  Harry Potter and the Goblet of Fire       0.9699

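3 answers:

Answer 0 (score: 4)

You can reshape the input first, splitting each string on the hyphen and appending the value:

[elt1.split('-') + [elt2] for elt1, elt2 in result1[0]]

Full example: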

import pandas as pd

result1 = [[("tt0241527-Harry Potter and the Philosopher's Stone", 1.0),
  ('tt0330373-Harry Potter and the Goblet of Fire', 0.9699),
  ('tt1843230-Once Upon a Time', 0.9384),
  ('tt0485601-The Secret of Kells', 0.9347)]]
# Column names for the DataFrame
columns_name = ["Number", "Title", "Value"]

# Split each "id-title" string and append the value to form one row
data = [elt1.split('-') + [elt2] for elt1, elt2 in result1[0]]
print(data)
# [['tt0241527', "Harry Potter and the Philosopher's Stone", 1.0],
#  ['tt0330373', 'Harry Potter and the Goblet of Fire', 0.9699],
#  ['tt1843230', 'Once Upon a Time', 0.9384],
#  ['tt0485601', 'The Secret of Kells', 0.9347]]

df = pd.DataFrame(data, columns=columns_name)
print(df)
#       Number                                     Title   Value
# 0  tt0241527  Harry Potter and the Philosopher's Stone  1.0000
# 1  tt0330373       Harry Potter and the Goblet of Fire  0.9699
# 2  tt1843230                          Once Upon a Time  0.9384
# 3  tt0485601                       The Secret of Kells  0.9347
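
One caveat with splitting on every hyphen: a title that itself contains a hyphen would produce more than two pieces, and the rows would no longer fit three columns. A minimal sketch of a more defensive variant, assuming the ID part never contains a hyphen, limits the split to the first hyphen. The second sample entry and the helper name split_entry are made up for illustration and are not part of the original data.

```python
# Sketch: split only on the first hyphen so hyphenated titles stay intact.
# The second entry is a made-up example, not taken from the original list.
sample = [
    ("tt0241527-Harry Potter and the Philosopher's Stone", 1.0),
    ("tt0000000-Spider-Man", 0.5),
]


def split_entry(entry, value):
    """Return [number, title, value]; hypothetical helper for illustration."""
    number, title = entry.split("-", 1)  # maxsplit=1: stop after the first hyphen
    return [number, title, value]


rows = [split_entry(entry, value) for entry, value in sample]
print(rows)
```

The resulting rows can be passed to pd.DataFrame(rows, columns=columns_name) in the same way as data in the full example.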

Answer 1 (score: 3)

Try:

import pandas as pd

results = [[("tt0241527-Harry Potter and the Philosopher's Stone", 1.0),
  ('tt0330373-Harry Potter and the Goblet of Fire', 0.9699),
  ('tt1843230-Once Upon a Time', 0.9384),
  ('tt0485601-The Secret of Kells', 0.9347)]]

# Build a two-column frame (0 = "id-title" string, 1 = value)
df = pd.DataFrame.from_records(results[0])

# Split column 0 on the hyphen into two new columns, 3 and 4
df[[3, 4]] = df[0].str.split('-', expand=True)

print(df)

Output:

                                                   0       1          3                                         4
0  tt0241527-Harry Potter and the Philosopher's S...  1.0000  tt0241527  Harry Potter and the Philosopher's Stone
1      tt0330373-Harry Potter and the Goblet of Fire  0.9699  tt0330373       Harry Potter and the Goblet of Fire
2                         tt1843230-Once Upon a Time  0.9384  tt1843230                          Once Upon a Time
3                      tt0485601-The Secret of Kells  0.9347  tt0485601                       The Secret of Kells
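
The result above still carries the original combined column and uses numeric column labels. A possible follow-up, sketched here on the assumption that the target names are Number, Title and Value as in the question, names the raw columns up front, splits on the first hyphen only, and keeps just the three requested columns. The intermediate names Raw and parts are illustrative.

```python
import pandas as pd

results = [[("tt0241527-Harry Potter and the Philosopher's Stone", 1.0),
            ('tt0330373-Harry Potter and the Goblet of Fire', 0.9699),
            ('tt1843230-Once Upon a Time', 0.9384),
            ('tt0485601-The Secret of Kells', 0.9347)]]

# Name the two raw columns instead of relying on the default labels 0 and 1.
df = pd.DataFrame.from_records(results[0], columns=["Raw", "Value"])

# n=1 splits on the first hyphen only; expand=True returns a two-column DataFrame.
parts = df["Raw"].str.split("-", n=1, expand=True)
df["Number"] = parts[0]
df["Title"] = parts[1]

# Keep only the three requested columns, in the requested order.
df = df[["Number", "Title", "Value"]]
print(df)
```

Selecting the columns explicitly at the end also drops the combined Raw column, so no separate drop call is needed.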

Answer 2 (score: 1)

You can use the following loop. It handles any number of entries in the list without changes to the code:

import pandas as pd

result1 = [[("tt0241527-Harry Potter and the Philosopher's Stone", 1.0),
  ('tt0330373-Harry Potter and the Goblet of Fire', 0.9699),
  ('tt1843230-Once Upon a Time', 0.9384),
  ('tt0485601-The Secret of Kells', 0.9347)]]

d = []
for i in range(len(result1[0])):
    # Split "id-title" into [id, title], then append the numeric value
    c = result1[0][i][0].split('-')
    c.append(result1[0][i][1])
    d.append(c)

df = pd.DataFrame(d)
print(df.head())

Output:

           0                                         1       2
0  tt0241527  Harry Potter and the Philosopher's Stone  1.0000
1  tt0330373       Harry Potter and the Goblet of Fire  0.9699
2  tt1843230                          Once Upon a Time  0.9384
3  tt0485601                       The Secret of Kells  0.9347

Finally, to rename the columns, add:

df.columns = ['Number','Title','Value']
print(df.head())

This gives:

      Number                                     Title   Value
0  tt0241527  Harry Potter and the Philosopher's Stone  1.0000
1  tt0330373       Harry Potter and the Goblet of Fire  0.9699
2  tt1843230                          Once Upon a Time  0.9384
3  tt0485601                       The Secret of Kells  0.9347