Two questions:
Example 1:
DistMatrix = [[ 'a', 'b', 'c', 'd'],
['a', 0, 0.3, 0.4, 0.1],
['b', 0.3, 0, 0.9, 0.2],
['c', 0.4, 0.9, 0, 0.7],
['d', 0.1, 0.2, 0.7, 0]]
The states are a, b, c, d. I want to find the value (threshold) that allows me to go from a to d (no matter which other states are visited along the way).
Naive approach:
- first loop: threshold 0.9, I discard all lower probabilities: I can only connect c and b
- second loop: threshold 0.7, I discard all lower probabilities: I can only connect c, b, d
- third loop: threshold 0.4, I discard all lower probabilities: I can connect a, c, b, d: here is my threshold: 0.4
-> Won't this become very expensive once my transition matrix has thousands of states? -> Can anyone suggest an algorithm?
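To make the naive loop above concrete, here is a minimal sketch of it in plain Python. It is not taken from the answers below: the function name `naive_threshold` and the use of a union-find structure to track which states are connected are my own assumptions. It treats the matrix as symmetric and a weight of 0 as "no connection".

```python
def naive_threshold(labels, matrix, start, goal):
    """Lower the threshold edge by edge until start and goal are connected."""
    index = {name: i for i, name in enumerate(labels)}
    parent = list(range(len(labels)))  # union-find forest, one tree per state

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Each undirected edge once (upper triangle), strongest first.
    # Zero weights are skipped: they are treated as "no connection".
    edges = sorted(
        ((matrix[i][j], i, j)
         for i in range(len(labels))
         for j in range(i + 1, len(labels))
         if matrix[i][j] > 0),
        reverse=True,
    )
    for weight, i, j in edges:
        parent[find(i)] = find(j)  # admit this edge at the current threshold
        if find(index[start]) == find(index[goal]):
            return weight  # first threshold at which start reaches goal
    return None  # start and goal are never connected


labels = ['a', 'b', 'c', 'd']
example1 = [[0, 0.3, 0.4, 0.1],
            [0.3, 0, 0.9, 0.2],
            [0.4, 0.9, 0, 0.7],
            [0.1, 0.2, 0.7, 0]]
print(naive_threshold(labels, example1, 'a', 'd'))  # prints 0.4
```

Sorting the edges dominates the cost, so for n states this is roughly O(n^2 log n), which should stay manageable for a few thousand states.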
Example 2:
DistMatrix = [[ 'a', 'b', 'c', 'd'],
['a', 0, 0.3, 0.4, 0.7],
['b', 0.3, 0, 0.9, 0.2],
['c', 0.4, 0.9, 0, 0.1],
['d', 0.7, 0.2, 0.1, 0] ]
The states are a, b, c, d. I want to find the value (threshold) that allows me to go from a to d (no matter which other states are visited along the way).
Naive approach:
- first loop: threshold 0.9, I discard everything lower: I can only connect c and b
- second loop: threshold 0.7, I discard the lower connections: I connect b and c, and a and d; because a and d are now connected, I have my threshold!
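The threshold in both examples is the weakest link on the best possible route from a to d, so one alternative worth trying is a "widest path" variant of Dijkstra's algorithm. The sketch below is only an illustration under that assumption: `widest_path_threshold` is a hypothetical name, nodes are referred to by row index (a = 0, d = 3), and a weight of 0 is treated as "no connection".

```python
import heapq


def widest_path_threshold(matrix, source, dest):
    """Return the largest threshold that still leaves source connected to dest."""
    best = [0.0] * len(matrix)        # best bottleneck found so far per node
    best[source] = float('inf')       # the empty path has no weakest edge
    heap = [(-best[source], source)]  # max-heap via negated bottleneck
    while heap:
        neg_width, node = heapq.heappop(heap)
        width = -neg_width
        if node == dest:
            return width
        if width < best[node]:
            continue  # stale heap entry, a wider route was already found
        for neighbor, weight in enumerate(matrix[node]):
            candidate = min(width, weight)  # weakest edge along this route
            if candidate > best[neighbor]:
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    return None  # dest is unreachable


example2 = [[0, 0.3, 0.4, 0.7],
            [0.3, 0, 0.9, 0.2],
            [0.4, 0.9, 0, 0.1],
            [0.7, 0.2, 0.1, 0]]
print(widest_path_threshold(example2, 0, 3))  # prints 0.7
```

Unlike the loop over thresholds, this never rebuilds the graph: each node is settled once, in order of decreasing bottleneck.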
Answer 0: (score: 6)
DistMatrix1 = np.array([[0, 0.3, 0.4, 0.1],
[0.3, 0, 0.9, 0.2],
[0.4, 0.9, 0, 0.7],
[0.1, 0.2, 0.7, 0]])
DistMatrix2 = np.array([[0, 0.3, 0.4, 0.7],
[0.3, 0, 0.9, 0.2],
[0.4, 0.9, 0, 0.1],
[0.7, 0.2, 0.1, 0]])
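Get a sorted array of all the probabilities in the distance matrix. Then perform a standard binary search over it, as suggested by Mr E. At each step of the binary search, replace the entries of the matrix with 0 if they are below the current probability. Run a breadth-first search on the resulting graph, starting at the first node, and see whether you reach the last node. If you do, the threshold is at least the current probability, so search higher; otherwise, search lower. The bfs code is actually adapted from the NetworkX version.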
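import numpy as np

def find_threshold_bfs(array):
    """Return the threshold value for adjacency matrix in array."""
    first_node = 0
    last_node = len(array) - 1
    # np.unique returns the distinct values in ascending order.
    probabilities = np.unique(array.ravel())
    low = 0
    high = len(probabilities)
    while high - low > 1:
        i = (high + low) // 2
        prob = probabilities[i]
        # Work on a copy so the caller's matrix is left untouched.
        copied_array = np.array(array)
        copied_array[copied_array < prob] = 0.0
        if bfs(copied_array, first_node, last_node):
            low = i   # still connected: the threshold is at least prob
        else:
            high = i  # disconnected: the threshold is below prob
    return probabilities[low]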

def bfs(graph, source, dest):
    """Perform breadth-first search starting at source. If dest is reached,
    return True, otherwise, return False."""
    # Based on http://www.ics.uci.edu/~eppstein/PADS/BFS.py
    # by D. Eppstein, July 2004.
    visited = set([source])
    nodes = np.arange(0, len(graph))
    # Despite its name, `stack` is used as a FIFO queue.
    stack = [(source, nodes[graph[source] > 0])]
    while stack:
        parent, children = stack[0]
        for child in children:
            if child == dest:
                return True
            if child not in visited:
                visited.add(child)  # mark now so each node is queued once
                stack.append((child, nodes[graph[child] > 0]))
        stack.pop(0)  # finished with this node; move on to the next one
    return False

import networkx as nx
import numpy as np

def find_threshold_nx(array):
    """Return the threshold value for adjacency matrix in array."""
    first_node = 0
    last_node = len(array) - 1
    probabilities = np.unique(array.ravel())
    low = 0
    high = len(probabilities)
    while high - low > 1:
        i = (high + low) // 2
        prob = probabilities[i]
        copied_array = np.array(array)
        copied_array[copied_array < prob] = 0.0
        # Originally nx.from_numpy_matrix, which was removed in NetworkX 3.0.
        graph = nx.from_numpy_array(copied_array)
        if nx.has_path(graph, first_node, last_node):
            low = i   # still connected: the threshold is at least prob
        else:
            high = i  # disconnected: the threshold is below prob
    return probabilities[low]
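The NetworkX version crashes on a graph with over a thousand nodes (on my laptop). The bfs version easily finds the threshold for a graph of several thousand nodes.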
In [5]: from percolation import *
In [6]: print('Threshold is {}'.format(find_threshold_bfs(DistMatrix1)))
Threshold is 0.4
In [7]: print('Threshold is {}'.format(find_threshold_bfs(DistMatrix2)))
Threshold is 0.7
In [10]: big = np.random.random((6000, 6000))
In [11]: print('Threshold is {}'.format(find_threshold_bfs(big)))
Threshold is 0.999766933071
For timings, I get (on a recent Macbook Pro):
In [5]: smaller = np.random.random((100, 100))
In [6]: larger = np.random.random((800, 800))
In [7]: %timeit find_threshold_bfs(smaller)
100 loops, best of 3: 11.3 ms per loop
In [8]: %timeit find_threshold_nx(smaller)
10 loops, best of 3: 94.9 ms per loop
In [9]: %timeit find_threshold_bfs(larger)
1 loops, best of 3: 207 ms per loop
In [10]: %timeit find_threshold_nx(larger)
1 loops, best of 3: 6 s per loop
Answer 1: (score: 0)