Question

I have many lists given in this format:

[{"@context":"ABC","entity":"PQR","URL":"abc@yahoo.com"}]
[{"@context":"RST","entity":"UVW","URL":"efg@gmail.com"}]
.............
............
............

I want to convert them into a pandas DataFrame:

@context    entity     URL
ABC         PQR        abc@yahoo.com
RST         UVW        efg@gmail.com
...         ...        .......
...         ...        .......

Answer 1

If you have nested lists, flatten them first:

from itertools import chain

import pandas as pd

L = [[{"@context":"ABC","entity":"PQR","URL":"abc@yahoo.com"}],
     [{"@context":"RST","entity":"UVW","URL":"efg@gmail.com"}]]

df = pd.DataFrame(list(chain.from_iterable(L)))

Or:

df = pd.DataFrame([y for x in L for y in x])

print (df)
  @context            URL entity
0      ABC  abc@yahoo.com    PQR
1      RST  efg@gmail.com    UVW
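
The output above lists the columns in a different order from the one the question asks for. If the order matters, one option is to pass the `columns` argument to the DataFrame constructor. This is a minimal sketch that reuses the sample data from above; the variable name `rows` is only illustrative:

```python
from itertools import chain

import pandas as pd

L = [[{"@context": "ABC", "entity": "PQR", "URL": "abc@yahoo.com"}],
     [{"@context": "RST", "entity": "UVW", "URL": "efg@gmail.com"}]]

# Flatten the nested lists into one flat list of dictionaries.
rows = list(chain.from_iterable(L))

# `columns` selects the listed keys and fixes their order.
df = pd.DataFrame(rows, columns=["@context", "entity", "URL"])
print(df.columns.tolist())
```

Any key named in `columns` that is missing from a dictionary ends up as NaN in that row, so this also works as a quick check for incomplete records.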

Edit:

If the data is generated by another script, it is better to build one list of all the dictionaries and pass it to the DataFrame constructor:

L = [[{"@context":"ABC","entity":"PQR","URL":"abc@yahoo.com"}],
     [{"@context":"RST","entity":"UVW","URL":"efg@gmail.com"}]]

L1 = []
for i in L:
    print (i[0])
    # simulate dictionaries being generated one at a time
    L1.append(i[0])

print (L1)
[{'@context': 'ABC', 'entity': 'PQR', 'URL': 'abc@yahoo.com'}, 
 {'@context': 'RST', 'entity': 'UVW', 'URL': 'efg@gmail.com'}]


df = pd.DataFrame(L1)
print (df)
  @context            URL entity
0      ABC  abc@yahoo.com    PQR
1      RST  efg@gmail.com    UVW
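
The same idea can be sketched with a generator standing in for the other script. `generate_records` is a hypothetical placeholder, not part of pandas or of the original code; replace it with whatever actually produces the dictionaries:

```python
import pandas as pd


def generate_records():
    # Hypothetical stand-in for the other script: yields one dictionary at a time.
    yield {"@context": "ABC", "entity": "PQR", "URL": "abc@yahoo.com"}
    yield {"@context": "RST", "entity": "UVW", "URL": "efg@gmail.com"}


# Collect all dictionaries first, then build the DataFrame once.
records = list(generate_records())
df = pd.DataFrame(records)
print(df.shape)
```

Building the DataFrame once from the full list is usually preferable to growing a DataFrame row by row, since each append-style operation copies the existing data.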

Edit:

The problem is that your data are strings, so you first need to convert them to lists of dictionaries:

import ast

L = ['[{"@context":"ABC","entity":"PQR","URL":"abc@yahoo.com"}]',
     '[{"@context":"RST","entity":"UVW","URL":"efg@gmail.com"}]']

df = pd.DataFrame([y for x in L for y in ast.literal_eval(x)])
print (df)
  @context            URL entity
0      ABC  abc@yahoo.com    PQR
1      RST  efg@gmail.com    UVW
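
If the strings are valid JSON, as the sample ones happen to be, `json.loads` from the standard library is a possible alternative to `ast.literal_eval`. This is only a sketch under that assumption; whether the real data is strict JSON is not stated in the question:

```python
import json

import pandas as pd

L = ['[{"@context":"ABC","entity":"PQR","URL":"abc@yahoo.com"}]',
     '[{"@context":"RST","entity":"UVW","URL":"efg@gmail.com"}]']

# Parse each string into a list of dictionaries, then flatten.
records = [rec for s in L for rec in json.loads(s)]
df = pd.DataFrame(records)
print(df.shape)
```

The two parsers differ on edge cases: `json.loads` accepts `true`, `false` and `null` but rejects single-quoted strings, while `ast.literal_eval` accepts Python literals such as single-quoted strings but rejects those JSON keywords.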
