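Converting a list without tuples into a data frame

Date: 2020-09-02 21:05:53

Tags: python pandas list

Usually, when you want to turn a set of data into a data frame, you create one list per column, build a dictionary from those lists, and then create the data frame from that dictionary.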
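As a minimal sketch of that usual approach, the snippet below builds a two-column frame from two lists. The list contents and the names `a`, `b`, `data`, and `df_small` are illustrative assumptions, not part of my real data.

```python
import pandas as pd

# Hypothetical column data: one list per column.
a = [0, 1]
b = [2, 3]

# Collect the lists in a dictionary keyed by column name...
data = {"a": a, "b": b}

# ...and build the data frame from that dictionary.
df_small = pd.DataFrame(data)
print(df_small)
```

This is manageable for a handful of columns, but it means writing out one list per column by hand.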

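The data frame I want to create has 75 columns, all with the same number of rows. Defining the lists one by one is not practical. Instead, I decided to build a single list and iteratively put a specific chunk of it into each column of the data frame. Here is an example of the list-to-data-frame conversion I have in mind: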

lst = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Example list

df = 
   a b c d e
0  0 2 4 6 8
1  1 3 5 7 9

# Result I want from the example list

Here is my test code:

import pandas as pd
import numpy as np

dict = {'a':[], 'b':[], 'c':[], 'd':[], 'e':[]}
df = pd.DataFrame(dict)

# Here is my test data frame, it contains 5 columns and no rows.

lst = np.arange(10).tolist()

# This is my test list; it looks like this: lst = [0, 1, …, 9]

for i in range(len(lst)):
    df.iloc[:, i] = df.iloc[:, i]\
    .append(pd.Series(lst[2 * i:2 * i + 2]))

# This code is supposed to put two entries per column for the whole data frame.
# For the first column, i = 0, so [2 * (0):2 * (0) + 2] = [0:2]
# df.iloc[:, 0] = lst[0:2], so df.iloc[:, 0] = [0, 1]
# Second column i = 1, so [2 * (1):2 * (1) + 2] = [2:4]
# df.iloc[:, 1] = lst[2:4], so df.iloc[:, 1] = [2, 3]
# This is how the code was supposed to allocate lst to df.
# However it outputs an error.

Running this code produces this error:

ValueError: cannot reindex from a duplicate axis

When I add ignore_index = True, so that the loop becomes

for i in range(len(lst)):
    df.iloc[:, i] = df.iloc[:, i]\
    .append(pd.Series(lst[2 * i:2 * i + 2]), ignore_index = True)

I get this error:

IndexError: single positional indexer is out-of-bounds

After running the code, I inspect df. The output is the same with or without ignore_index.

In: df
Out:
   a   b   c   d   e
0  0 NaN NaN NaN NaN
1  1 NaN NaN NaN NaN

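It seems the first iteration of the loop runs fine, but the error occurs when it tries to fill the second column.

Does anyone know how to make this work? Thanks.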

1 answer:

Answer 0 (score: 0)

IIUC:

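import numpy as np
import pandas as pd

lst = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
alst = np.array(lst)
# order='F' fills the 2-row array column by column, so each column gets consecutive list entries
df = pd.DataFrame(alst.reshape(2, -1, order='F'), columns=[*'abcde'])
print(df)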

Output:

   a  b  c  d  e
0  0  2  4  6  8
1  1  3  5  7  9