Question

I have the following dictionary:

print(d)

{1: ([4, 3, 2], [10.0, 6.666666666666667, 7.5]),
 2: ([4, 3, 2], [6.0, 6.666666666666667, 8.5]),
 3: ([4, 3, 2], [26.0, 29.666666666666668, 7.5])}

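I want to convert it to a pandas DataFrame and specify the column names. col1 should hold the dictionary keys. The output should look like this (no rounding needed):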

col1  col2  col3  
 1     4     10
 1     3     6.6
 1     2     7.5
 2     4     6
 2     3     6.6
 2     2     8.5
 3     4     26
 3     3     29.6
 3     2     7.5

I tried:

pd.DataFrame.from_dict(d, orient='index')

But this produces a DataFrame whose column values are lists:

    0                                       1
1   [4, 3, 2]   [10.0, 6.666666666666667, 7.5]
2   [4, 3, 2]   [6.0, 6.666666666666667, 8.5]
3   [4, 3, 2]   [26.0, 29.666666666666668, 7.5]
4   [4, 3, 2]   [5.25, 5.333333333333333, 6.0]

Answer 1

We can flatten the dictionary into triples, each of which represents one row of the DataFrame:

df = pd.DataFrame([(k, *t) for k, v in d.items() for t in zip(*v)])

   0  1          2
0  1  4  10.000000
1  1  3   6.666667
2  1  2   7.500000
3  2  4   6.000000
4  2  3   6.666667
5  2  2   8.500000
6  3  4  26.000000
7  3  3  29.666667
8  3  2   7.500000
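The one-liner above leaves the columns labelled 0, 1, 2. As a minimal sketch of the same flattening with the column names the question asks for (assuming the sample dictionary from the question, and that both lists under each key have the same length):

```python
import pandas as pd

d = {1: ([4, 3, 2], [10.0, 6.666666666666667, 7.5]),
     2: ([4, 3, 2], [6.0, 6.666666666666667, 8.5]),
     3: ([4, 3, 2], [26.0, 29.666666666666668, 7.5])}

# zip(*v) pairs the i-th item of the first list with the i-th item of the
# second list; the dictionary key is prepended to every pair.
rows = [(k, *t) for k, v in d.items() for t in zip(*v)]

# Passing columns= names the result directly, so no rename step is needed.
df = pd.DataFrame(rows, columns=["col1", "col2", "col3"])
print(df)
```

Note that zip stops at the shorter list, so if the two lists under a key ever differ in length the extra items would be dropped silently.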

Answer 2

You need to explode the DataFrame:

df = pd.DataFrame(d).T.apply(pd.Series.explode).reset_index()

Output:

   index  0          1
0      1  4       10.0
1      1  3   6.666667
2      1  2        7.5
3      2  4        6.0
4      2  3   6.666667
5      2  2        8.5
6      3  4       26.0
7      3  3  29.666667
8      3  2        7.5

Then rename the columns with:

df.columns = ['col1','col2','col3']
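For reference, here is a sketch of the same idea as one chain that names the columns up front. It is a variation, not the code above: it assumes pandas 1.3 or later, where DataFrame.explode accepts a list of columns, and it assumes the two lists under each key have equal length (otherwise explode raises an error). The final astype call is my addition, since exploded columns come back with object dtype.

```python
import pandas as pd

d = {1: ([4, 3, 2], [10.0, 6.666666666666667, 7.5]),
     2: ([4, 3, 2], [6.0, 6.666666666666667, 8.5]),
     3: ([4, 3, 2], [26.0, 29.666666666666668, 7.5])}

df = (
    # One row per key, with each list stored as a single cell.
    pd.DataFrame.from_dict(d, orient="index", columns=["col2", "col3"])
    # Explode both list columns together so their items stay paired.
    .explode(["col2", "col3"])
    # Name the index so reset_index() produces a column called col1.
    .rename_axis("col1")
    .reset_index()
    # Exploded columns are object dtype; convert them to numeric types.
    .astype({"col2": int, "col3": float})
)
print(df)
```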

Answer 3

You can "explode" both lists at once by concatenating them and wrapping the result in a pd.Series:

import numpy as np
import pandas as pd

d = {1: ([4, 3, 2], [10.0, 6.666666666666667, 7.5]),
 2: ([4, 3, 2], [6.0, 6.666666666666667, 8.5]),
 3: ([4, 3, 2], [26.0, 29.666666666666668, 7.5])}

df = pd.DataFrame.from_dict(d, orient='index')

df.apply(lambda r: pd.Series(np.concatenate(list(r)), index=np.repeat(r.index,len(r))))


	0	1
1	4	10
1	3	6.66667
1	2	7.5
2	4	6
2	3	6.66667
2	2	8.5
3	4	26
3	3	29.6667
3	2	7.5
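One caveat with the apply call above: np.repeat(r.index, len(r)) repeats each key by the number of rows, which matches the concatenated length here only because there are three keys and three items per list. A sketch of the same concatenate-and-repeat idea that repeats each key by its own list length instead is shown below. The names keys and lengths are mine, and it assumes the two lists under a given key are equally long.

```python
import numpy as np
import pandas as pd

d = {1: ([4, 3, 2], [10.0, 6.666666666666667, 7.5]),
     2: ([4, 3, 2], [6.0, 6.666666666666667, 8.5]),
     3: ([4, 3, 2], [26.0, 29.666666666666668, 7.5])}

keys = list(d)
# Number of items stored under each key (taken from the first list).
lengths = [len(d[k][0]) for k in keys]

df = pd.DataFrame({
    # Repeat every key once per item in its lists.
    "col1": np.repeat(keys, lengths),
    # Join the per-key lists end to end, in the same key order.
    "col2": np.concatenate([d[k][0] for k in keys]),
    "col3": np.concatenate([d[k][1] for k in keys]),
})
print(df)
```

This also gives the three named columns directly, with the keys as a regular column rather than as a repeated index.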

Convert a dictionary with lists of items to a pandas DataFrame
