String index error when creating a custom dictionary from a DataFrame in Python

Time: 2018-10-16 06:50:13

Tags: python pandas dictionary dataframe

I am trying to create a dictionary from a DataFrame. The DataFrame and my code are shown below:

Code | Desc
XS   | Train
XS   | Car
SE   | Cycle
SE   | Train

Here is my code:

lst_code = 'NA'
comp_list=[]
comp_dict = {}
for row in test_df:
    if str(row['code']) != lst_code:
        lst_code = row['code']
        if comp_list:
            comp_dict.update(lst_code,comp_list)
    else:
        comp_list.append(row['desc'])

With the code above, I get the following error:

if str(row['analyst_code']) != lst_code:
TypeError: string indices must be integers

I expect the following dictionary:

comp_dict = {'XS':['Train','Car'],
          'SE':['Cycle','Train']}

Please advise: how can I solve this?

1 answer:

Answer 0 (score: 2)

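First filter with boolean indexing, then collect the Desc values of each group into a list with GroupBy.apply(list), and finally convert the resulting Series to a dictionary with Series.to_dict: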

lst_code = 'NA'
comp_dict = df[df['Code'] != lst_code].groupby('Code')['Desc'].apply(list).to_dict()
print(comp_dict)
{'SE': ['Cycle', 'Train'], 'XS': ['Train', 'Car']}
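
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample DataFrame is reconstructed from the table in the question and assumes the columns are named Code and Desc exactly as shown there; the intermediate names mask and grouped are only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the table in the question (assumed column names)
df = pd.DataFrame({
    'Code': ['XS', 'XS', 'SE', 'SE'],
    'Desc': ['Train', 'Car', 'Cycle', 'Train'],
})

lst_code = 'NA'

# Step 1: boolean mask that keeps every row whose Code differs from 'NA'
mask = df['Code'] != lst_code

# Step 2: group by Code and gather each group's Desc values into a list
grouped = df[mask].groupby('Code')['Desc'].apply(list)

# Step 3: turn the Series (index = Code, values = lists) into a dict
comp_dict = grouped.to_dict()
print(comp_dict)
```

Note that groupby sorts the group keys by default, which is why SE comes before XS in the printed result.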

If filtering is not needed:

comp_dict = df.groupby('Code')['Desc'].apply(list).to_dict()
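
As for the error itself: iterating over a DataFrame directly (for row in test_df) yields the column labels, which are strings, so row['code'] tries to index a string with a string and raises the TypeError. In addition, dict.update does not accept a key and a value as two positional arguments. If an explicit loop is preferred over groupby, the following is one possible sketch, assuming the columns are named Code and Desc as in the table; the sample DataFrame is reconstructed for illustration.

```python
import pandas as pd

# Sample data reconstructed from the table in the question (assumed column names)
df = pd.DataFrame({
    'Code': ['XS', 'XS', 'SE', 'SE'],
    'Desc': ['Train', 'Car', 'Cycle', 'Train'],
})

# Iterating a DataFrame directly gives column labels, not rows
labels = [col for col in df]
print(labels)

# itertuples yields one namedtuple per row; fields are accessed by attribute
comp_dict = {}
for row in df.itertuples(index=False):
    # setdefault creates an empty list the first time a code is seen
    comp_dict.setdefault(row.Code, []).append(row.Desc)

print(comp_dict)
```

Unlike the groupby version, this keeps the codes in the order they first appear in the DataFrame.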