I have this:
[['COMPANY:', [('U S News & World Report Inc', 63)]],
['ORGANIZATION:',
[('Ashoka', 0),
('Innovators For The Public', 91),
('Us Environmental Protection Agency', 55)]]]
I would like to turn it into a pandas DataFrame like this:
NAME          ORGS                         PERCENT
Company       US News & World Report       63
Organization  Ashoka                       0
Organization  US Environmental Protection  55
Answer 0 (score: 2)
import pandas as pd

data = [['COMPANY:', [('U S News & World Report Inc', 63)]],
        ['ORGANIZATION:',
         [('Ashoka', 0),
          ('Innovators For The Public', 91),
          ('Us Environmental Protection Agency', 55)]]]

# Flatten the nested structure into one [name, org, percent] row per tuple.
results = []
for name, rest in data:
    # 'COMPANY:' -> 'Company'
    name = name.replace(":", "").capitalize()
    for orgs, percent in rest:
        results.append([name, orgs, percent])

df = pd.DataFrame(results, columns=['NAME', 'ORGS', 'PERCENT'])
print(df)
Result:
           NAME                                ORGS  PERCENT
0       Company         U S News & World Report Inc       63
1  Organization                              Ashoka        0
2  Organization           Innovators For The Public       91
3  Organization  Us Environmental Protection Agency       55
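The same flattening can also be written as a single list comprehension. This is only a compact sketch of the loop above, not a different technique; the variable names `rows` and `entries` are illustrative, and `rstrip(':')` is used here in place of `replace` on the assumption that the colon only ever appears at the end of the label.

```python
import pandas as pd

data = [['COMPANY:', [('U S News & World Report Inc', 63)]],
        ['ORGANIZATION:',
         [('Ashoka', 0),
          ('Innovators For The Public', 91),
          ('Us Environmental Protection Agency', 55)]]]

# One (name, org, percent) tuple per inner entry; the label loses its
# trailing colon and is capitalized, as in the loop version.
rows = [(name.rstrip(':').capitalize(), org, percent)
        for name, entries in data
        for org, percent in entries]

df = pd.DataFrame(rows, columns=['NAME', 'ORGS', 'PERCENT'])
print(df)
```

Building the full list of rows first and constructing the DataFrame once is generally preferable to appending to a DataFrame row by row.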
Answer 1 (score: 1)
There is a from_dict constructor you should look into. It happens to fit this case well; you only need to convert the list into a dictionary first:
L=[['COMPANY:', [('U S News & World Report Inc', 63)]],
['ORGANIZATION:',
[('Ashoka', 0),
('Innovators For The Public', 91),
('Us Environmental Protection Agency', 55)]]]
In [160]:
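import pandas as pd
# Keys become the row index; stack() then yields one (name, value) tuple per row.
df = pd.DataFrame.from_dict(dict(L), orient="index").stack().reset_index(level=0)
df['Name'] = df[0].apply(lambda x: x[0])    # first tuple element
df['Val'] = df[0].apply(lambda x: x[1])     # second tuple element
df['Type'] = df.level_0.str.slice(stop=-1)  # drop the trailing ':'
df = df.drop(columns=[0, 'level_0'])        # remove the helper columns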
In [161]:
print(df)
                                 Name  Val          Type
0                              Ashoka    0  ORGANIZATION
1           Innovators For The Public   91  ORGANIZATION
2  Us Environmental Protection Agency   55  ORGANIZATION
0         U S News & World Report Inc   63       COMPANY
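As a possible variation on the from_dict approach, the two `apply` calls can be replaced by unpacking the stacked tuples in one step. This is a sketch under a few assumptions: the names `stacked` and `df` are illustrative, the explicit `dropna()` is added defensively in case the padding cells for the shorter list survive `stack()` in the pandas version in use, and the row order may differ from the output shown above because it follows the dictionary's key order.

```python
import pandas as pd

L = [['COMPANY:', [('U S News & World Report Inc', 63)]],
     ['ORGANIZATION:',
      [('Ashoka', 0),
       ('Innovators For The Public', 91),
       ('Us Environmental Protection Agency', 55)]]]

# Rows are the labels, columns are positions; shorter lists are padded
# with missing values, which are dropped after stacking.
stacked = pd.DataFrame.from_dict(dict(L), orient="index").stack().dropna()

# Each remaining cell is a (name, value) tuple; unpack them into two columns.
df = pd.DataFrame(stacked.tolist(), columns=['Name', 'Val'])

# The first index level holds the original label; strip the trailing colon.
df['Type'] = stacked.index.get_level_values(0).str.rstrip(':')
print(df)
```

Unlike the version above, this yields a fresh 0..n-1 index rather than the repeated position index left behind by `reset_index(level=0)`.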