Create a DataFrame from a Python list of lists

Time: 2020-11-02 14:37:12

Tags: python pandas list

I have a list (t):

[[{'CreationDate': b"D:20191125142104+05'00'",
   'Creator': b'PDF-XChange Editor 7.0.325.1',
   'ModDate': b"D:20191125142754+05'00'",
   'Producer': b'PDF-XChange Core API SDK (7.0.325.1)'}],
 [{'CreationDate': b"D:20200215153643+05'00'",
   'Creator': b'Adobe Acrobat 11.0.23',
   'ModDate': b"D:20200215191411+05'00'",
   'Producer': b'Adobe Acrobat Pro 11.0.23 Paper Capture Plug-in'}],
 [{'CreationDate': b"D:20200215153532+05'00'",
   'Creator': b'Adobe Acrobat 11.0.23',
   'ModDate': b"D:20200215191426+05'00'",
   'Producer': b'Adobe Acrobat Pro 11.0.23 Paper Capture Plug-in'}]]

I need to create a DataFrame with columns = ['CreationDate', 'Creator', 'ModDate', 'Producer']. I tried pd.DataFrame(t, columns=['CreationDate', 'Creator', 'ModDate', 'Producer']) and got this error:

 4 columns passed, passed data had 1 columns

If I run pd.DataFrame(t[0], columns=['CreationDate', 'Creator', 'ModDate', 'Producer']), I get a single-row DataFrame. How can I build a DataFrame that contains all the rows? Thanks.

2 answers:

Answer 0 (score: 1)

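Each inner list holds a single dictionary, so you can pick the first element of every inner list with a list comprehension: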

df = pd.DataFrame([x[0] for x in t])

Or flatten the nested lists, which keeps every dictionary from every inner list:

df = pd.DataFrame([y for x in t for y in x])

print(df)
                 CreationDate                          Creator  \
0  b"D:20191125142104+05'00'"  b'PDF-XChange Editor 7.0.325.1'   
1  b"D:20200215153643+05'00'"         b'Adobe Acrobat 11.0.23'   
2  b"D:20200215153532+05'00'"         b'Adobe Acrobat 11.0.23'   

                      ModDate  \
0  b"D:20191125142754+05'00'"   
1  b"D:20200215191411+05'00'"   
2  b"D:20200215191426+05'00'"   

                                            Producer  
0            b'PDF-XChange Core API SDK (7.0.325.1)'  
1  b'Adobe Acrobat Pro 11.0.23 Paper Capture Plug...  
2  b'Adobe Acrobat Pro 11.0.23 Paper Capture Plug...  
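
The two approaches give the same result for t, but they would differ if an inner list held more than one dictionary. A minimal sketch of that difference, assuming hypothetical shortened data (the name sample and its values are made up for illustration and are not from the question):

```python
import pandas as pd

# Hypothetical data: same nesting as t, but the second inner list has two dicts.
sample = [
    [{'Creator': b'PDF-XChange Editor 7.0.325.1'}],
    [{'Creator': b'Adobe Acrobat 11.0.23'}, {'Creator': b'Another tool'}],
]

# Keeps only the first dictionary of each inner list.
first_only = pd.DataFrame([x[0] for x in sample])

# Keeps every dictionary from every inner list.
flattened = pd.DataFrame([y for x in sample for y in x])

print(len(first_only))  # 2
print(len(flattened))   # 3
```

If every inner list is known to contain exactly one dictionary, as in the question, either form should work; flattening is the safer choice when that is not guaranteed.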

Answer 1 (score: 1)

Use concat and from_dict:
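
The snippet for this approach is not shown here. As a rough sketch of one way it could look, and not necessarily what the answerer wrote, each dictionary can be turned into a one-row frame with pd.DataFrame.from_dict(..., orient='index') plus a transpose, and the rows then stacked with pd.concat. The choice of orient='index' with .T and ignore_index=True is an assumption made for this illustration:

```python
import pandas as pd

# The list t from the question.
t = [[{'CreationDate': b"D:20191125142104+05'00'",
       'Creator': b'PDF-XChange Editor 7.0.325.1',
       'ModDate': b"D:20191125142754+05'00'",
       'Producer': b'PDF-XChange Core API SDK (7.0.325.1)'}],
     [{'CreationDate': b"D:20200215153643+05'00'",
       'Creator': b'Adobe Acrobat 11.0.23',
       'ModDate': b"D:20200215191411+05'00'",
       'Producer': b'Adobe Acrobat Pro 11.0.23 Paper Capture Plug-in'}],
     [{'CreationDate': b"D:20200215153532+05'00'",
       'Creator': b'Adobe Acrobat 11.0.23',
       'ModDate': b"D:20200215191426+05'00'",
       'Producer': b'Adobe Acrobat Pro 11.0.23 Paper Capture Plug-in'}]]

# Assumed approach: from_dict(orient='index') yields one column per dict,
# with the keys as the index; transposing turns that into a single row.
rows = [pd.DataFrame.from_dict(d, orient='index').T
        for inner in t for d in inner]

# Stack the one-row frames and renumber the index 0..n-1.
df = pd.concat(rows, ignore_index=True)

print(df.shape)
```

This builds many small frames before concatenating, so for large inputs the list-comprehension form in the other answer is likely the cheaper option.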
