Fast lookup on a pandas Series using a dictionary keyed by tuples (dict map on Series)

Time: 2020-06-15 22:08:31

Tags: python pandas dictionary series lookup-tables

I have a distance matrix: A ==> direct ==> B ... Z

A ==> via ALPHA ==> B ... Z

B ==> direct ==> C ... Z

I created a dictionary that works as follows:

import pandas as pd

# distances is populated with the distance values described above
distances = pd.DataFrame.from_dict({'From': ['A','A','A','B','B','C','C'],
                                    'via': ['d','s','d','d','d','d','s'],
                                    'To': ['B','C','D','C','D','E','F'],
                                    'Distance': [10,5,12,4,3,22,21]})
# keys are (From, via, To) tuples; values are {'Distance': ...} dicts
distances_dict = distances.set_index(['From', 'via', 'To']).to_dict('index')
# flatten to {(From, via, To): distance}
new_distances = dict()
for key in distances_dict.keys():
    new_distances.update({key: distances_dict[key]['Distance']})
print(new_distances['A', 'd', 'B'])

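I have a pandas DataFrame (1,000,000 rows) in which I want to look up the distance for each row. To keep the example simple, I reuse the same frame as above.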

a = distances
a['map'] = "'"+a['From']+"'"+",'"+a['via']+"',"+"'"+a['To']+"'"
a['Check Distance'] = a['map'].map(new_distances)
#yields NaN

Is there a way to do this? I am looking for a relatively large-scale string lookup.

1 Answer:

Answer 0: (score: 0)

Could you try this?

a['Check Distance'] = a.apply(lambda x: distances_dict[(x['From'], x['via'], x['To'])]['Distance'],axis=1)
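
For context, the NaN in the question most likely comes from the key type: a['map'] holds strings such as "'A','d','B'", while the dictionary is keyed by real tuples such as ('A', 'd', 'B'), so no key ever matches. The apply call above fixes that, but it runs a Python function once per row, which may be slow on 1,000,000 rows. Below is a minimal sketch of an alternative that builds the tuple keys with zip and looks them up in a list comprehension. It assumes every (From, via, To) combination is unique in the lookup table, and it uses dict.get so that a missing key gives None instead of raising KeyError; both choices are my assumptions, not something the question states.

```python
import pandas as pd

distances = pd.DataFrame({
    'From': ['A', 'A', 'A', 'B', 'B', 'C', 'C'],
    'via': ['d', 's', 'd', 'd', 'd', 'd', 's'],
    'To': ['B', 'C', 'D', 'C', 'D', 'E', 'F'],
    'Distance': [10, 5, 12, 4, 3, 22, 21],
})

# Select the Distance column before to_dict() so the result is already
# {(From, via, To): distance}; no flattening loop is needed.
new_distances = distances.set_index(['From', 'via', 'To'])['Distance'].to_dict()

# Work on a copy so the lookup table itself is not modified.
a = distances.copy()

# zip yields real tuples, which match the dictionary keys.
# dict.get returns None for a key that is not present (assumed behaviour).
a['Check Distance'] = [
    new_distances.get(key) for key in zip(a['From'], a['via'], a['To'])
]

print(a)
```

If the rows to annotate live in a different DataFrame than the lookup table, merging the two frames on the three columns From, via and To with DataFrame.merge is another option worth timing against this one.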