I have a distance matrix:
A ==>direct==> B ... Z
A ==>via ALPHA==> B ... Z
B ==>direct==> C..Z
I built a dictionary from it that works like this:
import pandas as pd

# distances is populated with the distance values above
distances = pd.DataFrame.from_dict({'From': ['A', 'A', 'A', 'B', 'B', 'C', 'C'],
                                    'via': ['d', 's', 'd', 'd', 'd', 'd', 's'],
                                    'To': ['B', 'C', 'D', 'C', 'D', 'E', 'F'],
                                    'Distance': [10, 5, 12, 4, 3, 22, 21]})
# index by (From, via, To) so that every dictionary key is a tuple
distances_dict = distances.set_index(['From', 'via', 'To']).to_dict('index')
new_distances = dict()
for key in distances_dict.keys():
    new_distances.update({key: distances_dict[key]['Distance']})
print(new_distances['A', 'd', 'B'])
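I have a pandas DataFrame (1,000,000 rows) in which I want to look up the distance for every row, but to keep things simple I will reuse the frame from above.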
a = distances
a['map'] = "'"+a['From']+"'"+",'"+a['via']+"',"+"'"+a['To']+"'"
a['Check Distance'] = a['map'].map(new_distances)
#yields NaN
Is there a way to do this? I am looking for a string lookup that works at a relatively large scale.
Answer 0 (score: 0)
Could you try this?
a['Check Distance'] = a.apply(lambda x: distances_dict[(x['From'], x['via'], x['To'])]['Distance'],axis=1)
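As a possible follow-up to the line above: the `map` attempt in the question yields NaN because the `map` column holds strings such as `'A','d','B'`, while the dictionary is keyed by tuples such as `('A', 'd', 'B')`. The sketch below shows two ways to do the lookup without `apply(axis=1)`, which tends to be slow on a million rows. It assumes the same sample data as the question; the names `b`, `lookup` and `Merged Distance` are made up for illustration, and missing keys are assumed to map to NaN.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question
distances = pd.DataFrame({
    'From': ['A', 'A', 'A', 'B', 'B', 'C', 'C'],
    'via': ['d', 's', 'd', 'd', 'd', 'd', 's'],
    'To': ['B', 'C', 'D', 'C', 'D', 'E', 'F'],
    'Distance': [10, 5, 12, 4, 3, 22, 21],
})

# Tuple-keyed lookup table: {('A', 'd', 'B'): 10, ...}
new_distances = distances.set_index(['From', 'via', 'To'])['Distance'].to_dict()

a = distances.copy()

# Option 1: build real tuple keys with zip and look each one up in the dict.
# Keys that are not in the dict become NaN (an assumption about the desired behaviour).
a['Check Distance'] = [
    new_distances.get(key, np.nan)
    for key in zip(a['From'], a['via'], a['To'])
]

# Option 2: skip the dict and left-join on the three key columns.
# 'lookup' and 'Merged Distance' are hypothetical names for this example.
lookup = distances.rename(columns={'Distance': 'Merged Distance'})
b = distances[['From', 'via', 'To']].merge(
    lookup, on=['From', 'via', 'To'], how='left'
)

print(a[['From', 'via', 'To', 'Check Distance']])
print(b)
```

Option 1 keeps the dictionary from the question; Option 2 stays entirely in pandas and may be the better fit when the lookup table is itself a large DataFrame. Which is faster on real data is worth measuring rather than assuming.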