I have a list (t):
[[{'CreationDate': b"D:20191125142104+05'00'",
'Creator': b'PDF-XChange Editor 7.0.325.1',
'ModDate': b"D:20191125142754+05'00'",
'Producer': b'PDF-XChange Core API SDK (7.0.325.1)'}],
[{'CreationDate': b"D:20200215153643+05'00'",
'Creator': b'Adobe Acrobat 11.0.23',
'ModDate': b"D:20200215191411+05'00'",
'Producer': b'Adobe Acrobat Pro 11.0.23 Paper Capture Plug-in'}],
[{'CreationDate': b"D:20200215153532+05'00'",
'Creator': b'Adobe Acrobat 11.0.23',
'ModDate': b"D:20200215191426+05'00'",
'Producer': b'Adobe Acrobat Pro 11.0.23 Paper Capture Plug-in'}]]
I need to create a DataFrame with columns = ['CreationDate','Creator','ModDate','Producer']. I tried pd.DataFrame(t, columns=['CreationDate','Creator','ModDate','Producer']) and got this error:
4 columns passed, passed data had 1 columns
If I run pd.DataFrame(t[0], columns=['CreationDate','Creator','ModDate','Producer']), I get a single-row DataFrame. How can I build a proper DataFrame with one row per record? Thanks.
Answer 0 (score: 1)
You can select the first element of each nested list with a list comprehension:
df = pd.DataFrame([x[0] for x in t])
Or flatten the nested lists, which keeps every dictionary from every inner list:
df = pd.DataFrame([y for x in t for y in x])
print (df)
CreationDate Creator \
0 b"D:20191125142104+05'00'" b'PDF-XChange Editor 7.0.325.1'
1 b"D:20200215153643+05'00'" b'Adobe Acrobat 11.0.23'
2 b"D:20200215153532+05'00'" b'Adobe Acrobat 11.0.23'
ModDate \
0 b"D:20191125142754+05'00'"
1 b"D:20200215191411+05'00'"
2 b"D:20200215191426+05'00'"
Producer
0 b'PDF-XChange Core API SDK (7.0.325.1)'
1 b'Adobe Acrobat Pro 11.0.23 Paper Capture Plug...
2 b'Adobe Acrobat Pro 11.0.23 Paper Capture Plug...
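For context on why the original call fails: each element of t is itself a one-item list, so pandas treats it as a row with a single column, which is where "passed data had 1 columns" comes from. Below is a minimal, self-contained sketch of the flattening approach with the column order from the question made explicit. The names `records` and `wanted` are illustrative and not part of the original post.

```python
import pandas as pd

# Sample data copied from the question: a list of one-item lists of dicts.
t = [
    [{'CreationDate': b"D:20191125142104+05'00'",
      'Creator': b'PDF-XChange Editor 7.0.325.1',
      'ModDate': b"D:20191125142754+05'00'",
      'Producer': b'PDF-XChange Core API SDK (7.0.325.1)'}],
    [{'CreationDate': b"D:20200215153643+05'00'",
      'Creator': b'Adobe Acrobat 11.0.23',
      'ModDate': b"D:20200215191411+05'00'",
      'Producer': b'Adobe Acrobat Pro 11.0.23 Paper Capture Plug-in'}],
    [{'CreationDate': b"D:20200215153532+05'00'",
      'Creator': b'Adobe Acrobat 11.0.23',
      'ModDate': b"D:20200215191426+05'00'",
      'Producer': b'Adobe Acrobat Pro 11.0.23 Paper Capture Plug-in'}],
]

# Illustrative names (assumptions, not from the post).
wanted = ['CreationDate', 'Creator', 'ModDate', 'Producer']

# Flatten one nesting level so pandas receives a plain list of dicts.
records = [record for inner in t for record in inner]

# With a list of dicts, `columns` selects and orders the dict keys.
df = pd.DataFrame(records, columns=wanted)
print(df.shape)
```

Flattening rather than taking x[0] also covers the case where an inner list holds more than one dictionary.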
Answer 1 (score: 1)
Use concat and from_dict:
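A minimal sketch of how pd.concat and pd.DataFrame.from_dict might be combined for this data. It is one possible reading of the suggestion, not necessarily the answerer's exact code. It assumes each record is first wrapped into the {column: [value]} form that from_dict documents; the names `frames` and `df_concat` are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
t = [
    [{'CreationDate': b"D:20191125142104+05'00'",
      'Creator': b'PDF-XChange Editor 7.0.325.1',
      'ModDate': b"D:20191125142754+05'00'",
      'Producer': b'PDF-XChange Core API SDK (7.0.325.1)'}],
    [{'CreationDate': b"D:20200215153643+05'00'",
      'Creator': b'Adobe Acrobat 11.0.23',
      'ModDate': b"D:20200215191411+05'00'",
      'Producer': b'Adobe Acrobat Pro 11.0.23 Paper Capture Plug-in'}],
    [{'CreationDate': b"D:20200215153532+05'00'",
      'Creator': b'Adobe Acrobat 11.0.23',
      'ModDate': b"D:20200215191426+05'00'",
      'Producer': b'Adobe Acrobat Pro 11.0.23 Paper Capture Plug-in'}],
]

# Build one single-row frame per dictionary. Wrapping each value in a
# list gives from_dict the {column: array-like} shape it expects.
frames = [
    pd.DataFrame.from_dict({key: [value] for key, value in record.items()})
    for inner in t
    for record in inner
]

# Stack the single-row frames and renumber the index 0..n-1.
df_concat = pd.concat(frames, ignore_index=True)
print(df_concat.shape)
```

For a list this small the result matches the list-comprehension approach; building many one-row frames is typically slower on large inputs.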