Pandas: create a column from the inverse mapping of a dictionary

Time: 2019-05-10 14:25:55

Tags: python pandas

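I have a DataFrame of company names and a dictionary whose keys are official names and whose values are lists of all the variants of each name.

I would like to create a new column holding the official name, based on that dictionary. Is there a more concise way than looping over the dictionary's keys and values?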

import pandas as pd

df = pd.DataFrame({'name': ['company a', 'company a inc', 'a electronics', 'company a ltd', 'the company a', 'b enterprises', 'company b']})

name_dict = {'company a': ['company a', 'company a inc', 'a electronics', 'company a ltd', 'the company a'],
             'company b': ['b enterprises', 'company b']}

def get_company_name(name):
    for k, v in name_dict.items():
        if name in v:
            return k

df['official_name'] = df.name.apply(get_company_name)

4 answers:

Answer 0: (score: 2)

I would build the forward dictionary (variant to official name) and use replace:

forward_names = {v: k for k, val in name_dict.items() for v in val}
df['official_name'] = df['name'].replace(forward_names)
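
A possible variation on this answer, sketched under the assumption that names missing from the dictionary should be flagged rather than passed through: `Series.replace` leaves a value with no dictionary entry unchanged, whereas `Series.map` returns NaN for it. The name 'unknown corp' below is a made-up example and does not appear in the question.

```python
import pandas as pd

name_dict = {
    'company a': ['company a', 'company a inc', 'a electronics'],
    'company b': ['b enterprises', 'company b'],
}

# Invert the mapping: each variant points to its official name.
forward_names = {v: k for k, vals in name_dict.items() for v in vals}

# 'unknown corp' is a hypothetical name with no dictionary entry.
names = pd.Series(['company a inc', 'b enterprises', 'unknown corp'])

replaced = names.replace(forward_names)  # unmapped value is kept as-is
mapped = names.map(forward_names)        # unmapped value becomes NaN

print(replaced.tolist())
print(mapped.isna().tolist())
```

Which behaviour is preferable depends on whether an unmapped name should survive into the new column or stand out as missing.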

Answer 1: (score: 0)

I would simply iterate over name_dict to build the rows of the DataFrame:

df = pd.DataFrame([[v,k] for k in name_dict for v in name_dict[k]],
                  columns = ['name', 'official_name'])

Answer 2: (score: 0)

Solution 1:

def get_company_name(name):
    return [k for k, v in name_dict.items() if name in v][0]

df['official_name'] = df.name.apply(get_company_name)
print(df)

Solution 2:

df['official_name'] = df.name.apply(lambda name: list(k for k, v in name_dict.items() if name in v)[0])
print(df)

Output:

            name official_name
0      company a     company a
1  company a inc     company a
2  a electronics     company a
3  company a ltd     company a
4  the company a     company a
5  b enterprises     company b
6      company b     company b
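
Both solutions index the result with `[0]`, which raises an IndexError when a name appears in none of the lists. A minimal sketch of one way around that, assuming a missing name should map to a default value: `next` with a default argument stops at the first match and falls back otherwise. The `default` parameter and the name 'unknown corp' are illustrative additions, not part of the original answer.

```python
import pandas as pd

name_dict = {
    'company a': ['company a', 'company a inc', 'a electronics',
                  'company a ltd', 'the company a'],
    'company b': ['b enterprises', 'company b'],
}

def get_company_name(name, default=None):
    # Return the first official name whose variant list contains `name`,
    # or `default` when no list contains it.
    return next((k for k, v in name_dict.items() if name in v), default)

# 'unknown corp' is a hypothetical name with no dictionary entry.
df = pd.DataFrame({'name': ['company a ltd', 'company b', 'unknown corp']})
df['official_name'] = df['name'].apply(get_company_name)
print(df)
```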

Answer 3: (score: 0)

I would put name_dict into a DataFrame, then melt it and merge:

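# One row per official name; shorter variant lists are padded with missing values.
df2 = pd.DataFrame.from_dict(name_dict, orient='index')
df2 = df2.transpose()
# After melt, 'variable' holds the official name and 'value' holds the variant.
df2 = df2.melt()
df3 = df.merge(df2, how='left', left_on='name', right_on='value', sort=False)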
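
The same lookup-table idea can be sketched more compactly with `Series.explode`, assuming pandas 0.25 or later is available. This is an alternative to the transpose-and-melt steps, not the answer's own code; the variables `lookup` and `result` are names chosen for illustration.

```python
import pandas as pd

name_dict = {
    'company a': ['company a', 'company a inc', 'a electronics',
                  'company a ltd', 'the company a'],
    'company b': ['b enterprises', 'company b'],
}
df = pd.DataFrame({'name': ['company a inc', 'b enterprises', 'the company a']})

# Build one row per (official_name, name) pair from the dictionary.
lookup = (
    pd.Series(name_dict)
    .explode()                      # one variant per row, key repeated in the index
    .rename_axis('official_name')   # name the index so it becomes a column
    .reset_index(name='name')       # variants go into a column called 'name'
)

# A left merge keeps every row of df in its original order.
result = df.merge(lookup, on='name', how='left')
print(result)
```

Because the lookup columns are already named `name` and `official_name`, the merged frame needs no renaming afterwards.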