I am trying to create a dictionary from a DataFrame. The DataFrame and my code are below:
Code | Desc
XS | Train
XS | Car
SE | Cycle
SE | Train
Here is my code:
lst_code = 'NA'
comp_list = []
comp_dict = {}
for row in test_df:
    if str(row['code']) != lst_code:
        lst_code = row['code']
        if comp_list:
            comp_dict.update(lst_code, comp_list)
    else:
        comp_list.append(row['desc'])
With the code above, I get the following error:
if str(row['analyst_code']) != lst_code:
TypeError: string indices must be integers
I expect the following dictionary:
comp_dict = {'XS':['Train','Car'],
'SE':['Cycle','Train']}
Please advise: how can I fix this?
Answer 0 (score: 2)
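First filter with boolean indexing, then collect the Desc values of each group into a list with GroupBy.apply(list), and finally convert the resulting Series with to_dict: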
lst_code = 'NA'
comp_dict = df[df['Code'] != lst_code].groupby('Code')['Desc'].apply(list).to_dict()
print(comp_dict)
{'SE': ['Cycle', 'Train'], 'XS': ['Train', 'Car']}
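For a version that runs on its own, here is a minimal sketch that rebuilds the sample frame from the question. The variable name df, the capitalized column names taken from the table header, and the extra 'NA' row are assumptions added for illustration, so that the effect of the filter is visible.

```python
import pandas as pd

# Sample data from the question, plus one hypothetical 'NA' row
# (an assumption, not in the original data) to show what the filter drops.
df = pd.DataFrame({
    "Code": ["XS", "XS", "SE", "SE", "NA"],
    "Desc": ["Train", "Car", "Cycle", "Train", "Bus"],
})

lst_code = "NA"

# Boolean indexing keeps only the rows whose Code differs from lst_code.
filtered = df[df["Code"] != lst_code]

# Collect each group's Desc values into a list, then convert the
# resulting Series (indexed by Code) into a plain dict.
comp_dict = filtered.groupby("Code")["Desc"].apply(list).to_dict()
print(comp_dict)
```

Because groupby sorts the group keys by default, 'SE' comes before 'XS' in the result; passing sort=False to groupby should keep the order in which the codes first appear.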
If you do not need the filter:
comp_dict = df.groupby('Code')['Desc'].apply(list).to_dict()
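As for the TypeError in the question: iterating over a DataFrame directly yields its column labels, which are strings, so row['code'] ends up indexing a string with a string. If an explicit loop is preferred over groupby, the sketch below is one possible way to write it. It assumes the columns are named Code and Desc as in the table header, and it reuses the name test_df from the question.

```python
import pandas as pd

# Sample data from the question (column names assumed from the table header).
test_df = pd.DataFrame({
    "Code": ["XS", "XS", "SE", "SE"],
    "Desc": ["Train", "Car", "Cycle", "Train"],
})

# Iterating over a DataFrame yields column labels, not rows,
# which is why row['code'] raised "string indices must be integers".
labels = [col for col in test_df]
print(labels)

# Walk the two columns in parallel and append each Desc to the list
# stored under its Code, creating the list on first sight of a code.
comp_dict = {}
for code, desc in zip(test_df["Code"], test_df["Desc"]):
    comp_dict.setdefault(code, []).append(desc)
print(comp_dict)
```

Unlike the groupby version with its default sorting, this loop keeps the codes in the order they first appear in the frame.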