I have a Python list of lists that I want to convert to a pandas DataFrame. I want the DataFrame to look like this:
   table_id            created     Mb       (etc.)
1  NetworkClicks       2018-10-26  0.22
2  NetworkImpressions  2018-10-26  1519.24
(6 rows in total, based on the example list below)
The column names are inside each list, e.g. Mb, created, modified, table_id.
Example list:
import datetime
import pandas as pd
ls_all = [
[(u'Mb', u'928.11'), (u'created', datetime.date(2018, 10, 25)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'4,378'), (u'table_id', u'NetworkActiveViews'), (u'Tb', u'0.91')],
[(u'Mb', u'800.67'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'3,577'), (u'table_id', u'NetworkBackfillActiveViews'), (u'Tb', u'0.78')],
[(u'Mb', u'2.44'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'11'), (u'table_id', u'NetworkBackfillClicks'), (u'Tb', u'0.00')],
[(u'Mb', u'1190.52'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'5,269'), (u'table_id', u'NetworkBackfillImpressions'), (u'Tb', u'1.16')],
[(u'Mb', u'0.22'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'1'), (u'table_id', u'NetworkClicks'), (u'Tb', u'0.00')],
[(u'Mb', u'1519.24'), (u'created', datetime.date(2018, 10, 26)), (u'modified', datetime.date(2019, 4, 18)), (u'Rows_Mil', u'7,089'), (u'table_id', u'NetworkImpressions'), (u'Tb', u'1.48')]
]
I tried:
df = pd.DataFrame(ls_all, columns=ls_all[0])
But it gave me this DataFrame, where both the headers and the cells are still (name, value) tuples:
(Mb, 928.11) ... (Tb, 0.91)
0 (Mb, 928.11) ... (Tb, 0.91)
1 (Mb, 800.67) ... (Tb, 0.78)
2 (Mb, 2.44) ... (Tb, 0.00)
3 (Mb, 1190.52) ... (Tb, 1.16)
4 (Mb, 0.22) ... (Tb, 0.00)
5 (Mb, 1519.24) ... (Tb, 1.48)
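Answer 0 (score: 3)
Convert each inner list of (name, value) tuples to a dictionary, then build the DataFrame from the list of dictionaries instead of the list of tuples.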
list_of_dicts = [dict(x) for x in ls_all]
df = pd.DataFrame(list_of_dicts)
        Mb  Rows_Mil    Tb     created    modified                    table_id
0   928.11     4,378  0.91  2018-10-25  2019-04-18          NetworkActiveViews
1   800.67     3,577  0.78  2018-10-26  2019-04-18  NetworkBackfillActiveViews
2     2.44        11  0.00  2018-10-26  2019-04-18       NetworkBackfillClicks
3  1190.52     5,269  1.16  2018-10-26  2019-04-18  NetworkBackfillImpressions
4     0.22         1  0.00  2018-10-26  2019-04-18               NetworkClicks
5  1519.24     7,089  1.48  2018-10-26  2019-04-18          NetworkImpressions
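As a minimal sketch of the same idea, the snippet below uses a shortened, two-record version of the question's ls_all. It also reorders the columns to match the layout the question asks for and converts the string sizes to numbers. The column subset and the type conversions are my own assumptions, not something the question requires.

```python
import datetime
import pandas as pd

# Two records shortened from the question's ls_all, same (name, value) shape.
ls_all = [
    [("Mb", "0.22"), ("created", datetime.date(2018, 10, 26)),
     ("Rows_Mil", "1"), ("table_id", "NetworkClicks")],
    [("Mb", "1519.24"), ("created", datetime.date(2018, 10, 26)),
     ("Rows_Mil", "7,089"), ("table_id", "NetworkImpressions")],
]

# dict() turns each list of (name, value) pairs into one record.
df = pd.DataFrame([dict(record) for record in ls_all])

# Select the columns in the order the question shows (assumed subset).
df = df[["table_id", "created", "Mb", "Rows_Mil"]]

# The sizes arrive as strings; convert them so they can be sorted or summed.
df["Mb"] = df["Mb"].astype(float)
df["Rows_Mil"] = df["Rows_Mil"].str.replace(",", "", regex=False).astype(int)
```

Selecting the columns explicitly means the result does not depend on how a given Python or pandas version orders dictionary keys.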
Answer 1 (score: 0)
I like the list-of-dictionaries approach above. Here is another way:
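rows = []
for record in ls_all:
    # keep only the value from each (name, value) pair
    rows.append([value for _, value in record])
# the column names come from the first record
columns = [name for name, _ in ls_all[0]]
df = pd.DataFrame(rows, columns=columns)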
        Mb     created    modified  Rows_Mil                    table_id    Tb
0   928.11  2018-10-25  2019-04-18     4,378          NetworkActiveViews  0.91
1   800.67  2018-10-26  2019-04-18     3,577  NetworkBackfillActiveViews  0.78
2     2.44  2018-10-26  2019-04-18        11       NetworkBackfillClicks  0.00
3  1190.52  2018-10-26  2019-04-18     5,269  NetworkBackfillImpressions  1.16
4     0.22  2018-10-26  2019-04-18         1               NetworkClicks  0.00
5  1519.24  2018-10-26  2019-04-18     7,089          NetworkImpressions  1.48
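One caveat with this positional approach: it only lines up if every record lists its fields in the same order as the first one. A small sketch of a guard for that, using a shortened stand-in for ls_all (the check itself is my addition, not part of the answer above):

```python
import pandas as pd

# Shortened stand-in for the question's ls_all.
ls_all = [
    [("Mb", "0.22"), ("table_id", "NetworkClicks"), ("Tb", "0.00")],
    [("Mb", "1519.24"), ("table_id", "NetworkImpressions"), ("Tb", "1.48")],
]

# Column names taken from the first record.
columns = [name for name, _ in ls_all[0]]

# Guard: taking values by position is only valid if all records share one field order.
same_order = all([name for name, _ in record] == columns for record in ls_all)
if not same_order:
    raise ValueError("records list their fields in different orders")

# Keep only the values, row by row.
rows = [[value for _, value in record] for record in ls_all]
df = pd.DataFrame(rows, columns=columns)
```

If the field order can vary between records, the dictionary-based approach is probably the safer choice, since it matches values by name rather than by position.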