Representing binary strings as paths in a tree graph

Time: 2017-11-18 16:33:35

Tags: python algorithm list pandas networkx

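I am trying to write an algorithm that searches a graph for node paths that represent a binary string. Even-numbered nodes correspond to the digit '0' and odd-numbered nodes to the digit '1'. The code below is, for now, inelegant and unoptimized. The code comments explain what each step does.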

import networkx as nx
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("graph.csv", sep=';', encoding='utf-8')
df1=df.astype(int)

# Newer networkx releases call this from_pandas_edgelist;
# it was from_pandas_dataframe in the 1.x releases.
g = nx.from_pandas_edgelist(df1, 'nodes_1', 'nodes_2')

plt.show()
# Read an arbitrary binary string from the user.
# Example: '01'
z = input('Write a binary number. \n')

z1=list(z)
l1 = df1['nodes_2'].tolist()

# Prepend 0 to the list, because node 0 never appears in df1['nodes_2'].
l1[:0] = [0]

# Check whether the first digit of 'z' is 0 or 1,
# and collect the nodes with the matching parity in list 'a'.
a=[]

if int(z1[0])==0:
   for i in l1:
       if i%2==0:
           num1 = int(i)
           a.append(num1)

elif int(z1[0])==1: 
     for i in l1:
        if i%2 ==1:
           num1 = int(i)
           a.append(num1)

else: print('...')

# Build list 'b': the neighbor list of each node in 'a'.
b=[]
c=[]

for i in a:
    c.append(i)
    # neighbors() returns an iterator in networkx 2.x, so materialize it as a list.
    x4 = list(g.neighbors(i))
    b.append(x4)

# From each neighbor list, keep only the nodes whose parity matches the second digit of 'z'
# (odd in this example, because the second digit is 1),
# and collect the matching [node, neighbor] pairs in 'e' as the possible graph paths.
e=[]

if int(z1[1])==0:
   for j in range(len(b)):
       for k in range(len(b[j])):
          if b[j][k]%2==0:
             d = [a[j], b[j][k]]
             e.append(d)

elif int(z1[1])==1: 
     for j in range(len(b)):
         for k in range(len(b[j])):
             if b[j][k]%2==1:
                d = [a[j], b[j][k]]
                e.append(d)

print (a)
# Output: 
# [0, 2, 4, 6, 8, 10, 12, 14]
print (b)
# Output: 
# [[1, 2], [0, 5, 6], [1, 9, 10], [2, 13, 14], [3], [4], [5], [6]]
print (e)
# Output: 
# [[0, 1], [2, 5], [4, 1], [4, 9], [6, 13], [8, 3], [12, 5]]

CSV data format:

    nodes_1 nodes_2
0   0       1
1   0       2
2   1       3
3   1       4
4   2       5
5   2       6
6   3       7
7   3       8
8   4       9
9   4       10
10  5       11
11  5       12
12  6       13
13  6       14

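At the moment I am having trouble adapting the code to work with a binary string of any length, because the example above only handles 2-digit strings. I would therefore be very grateful for any tips on simplifying and generalizing the code.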

1 answer:

Answer 0 (score: 2)

All of this code can be reduced to a few lines. That is, it can be vectorized so that you get rid of the for loops:

a = pd.Series([0] + df['nodes_2'][df['nodes_2']%2==0].values.tolist())

# Create a Series so that apply can be used.
# neighbors() returns an iterator in networkx 2.x, so wrap it in list().
b = a.apply(lambda n: list(g.neighbors(n)))

n1e ,n2e  = df['nodes_1'] % 2 == 0, df['nodes_2'] % 2 == 0
n1o ,n2o = df['nodes_1'] % 2 == 1, df['nodes_2'] % 2 == 1

# Each pair needs exactly one odd node: either nodes_1 or nodes_2 is odd, but not both (likewise for even).
# Use that as a boolean mask to select the rows.
e = df[~((n1e == n2e) & (n1o == n2o))]

Output:

a.values.tolist()
[0, 2, 4, 6, 8, 10, 12, 14]

b.values.tolist()
[[1, 2], [0, 5, 6], [1, 10, 9], [2, 13, 14], [3], [4], [5], [6]]

e.values.tolist()
[[0, 1], [1, 4], [2, 5], [3, 8], [4, 9], [5, 12], [6, 13]]

You can take the vectorized code and place it under the corresponding condition (a boolean) supplied by the user.
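One way to read this advice is to derive the required parity from each input digit instead of hard-coding separate even and odd branches. The following is only a minimal sketch of that idea in plain Python (not vectorized). It assumes the edge list from the question is written inline rather than loaded from graph.csv, and `pairs_for_digits` is a hypothetical helper name, not part of pandas or networkx.

```python
# Edge list copied from the CSV data in the question (assumed inline here).
EDGES = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (2, 6), (3, 7),
         (3, 8), (4, 9), (4, 10), (5, 11), (5, 12), (6, 13), (6, 14)]


def pairs_for_digits(edges, first, second):
    """Return sorted [u, v] pairs of adjacent nodes where
    u % 2 == first and v % 2 == second."""
    pairs = []
    for u, v in edges:
        # The graph is undirected, so try each edge in both orientations.
        for a, b in ((u, v), (v, u)):
            if a % 2 == first and b % 2 == second:
                pairs.append([a, b])
    return sorted(pairs)


z = "01"
print(pairs_for_digits(EDGES, int(z[0]), int(z[1])))
# [[0, 1], [2, 5], [4, 1], [4, 9], [6, 13], [8, 3], [12, 5]]
```

Because the parity comes straight from the digits, the same function covers '00', '01', '10' and '11' without any if/elif branching.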

Update e according to the condition, so that the even node comes first and the odd node last:

# Iterate over the rows: looping over a DataFrame directly yields its column names.
e = [[i[0],i[1]] if i[0]%2 == 0 else [i[1],i[0]] for i in e.values.tolist()]
e = pd.DataFrame(e).sort_values(0).values.tolist()

[[0, 1], [2, 5], [4, 1], [4, 9], [6, 13], [8, 3], [12, 5]]
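The code above still handles only two digits, while the question asks for binary strings of any length. A possible extension, offered as a sketch rather than as the answerer's method, is to grow the paths one digit at a time: start from every node whose parity matches the first digit, then keep extending each partial path with the neighbors whose parity matches the next digit. The sketch uses a plain adjacency dict built from the question's edge list so that it runs without graph.csv; with a networkx graph `g`, `g.neighbors(n)` would play the role of `adj[n]`. The names `build_adjacency`, `paths_for_bits` and the `allow_revisit` flag are hypothetical, and whether a path may revisit a node is an assumption the question does not settle.

```python
from collections import defaultdict

# Edge list copied from the CSV data in the question (assumed inline here).
EDGES = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5), (2, 6), (3, 7),
         (3, 8), (4, 9), (4, 10), (5, 11), (5, 12), (6, 13), (6, 14)]


def build_adjacency(edges):
    """Undirected adjacency map: node -> sorted list of neighbors."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return {node: sorted(nbrs) for node, nbrs in adj.items()}


def paths_for_bits(adj, bits, allow_revisit=True):
    """Return every node path whose node parities spell out `bits`."""
    if not bits or set(bits) - {"0", "1"}:
        raise ValueError("expected a non-empty string of 0s and 1s")
    digits = [int(ch) for ch in bits]
    # Start from every node whose parity matches the first digit.
    paths = [[n] for n in sorted(adj) if n % 2 == digits[0]]
    # Extend each partial path with the neighbors matching the next digit.
    for d in digits[1:]:
        paths = [p + [nb]
                 for p in paths
                 for nb in adj[p[-1]]
                 if nb % 2 == d and (allow_revisit or nb not in p)]
    return paths


adj = build_adjacency(EDGES)
print(paths_for_bits(adj, "01"))
# [[0, 1], [2, 5], [4, 1], [4, 9], [6, 13], [8, 3], [12, 5]]
print(paths_for_bits(adj, "010", allow_revisit=False))
```

For '01' this reproduces the list e shown above. The number of partial paths can grow quickly on longer strings or denser graphs, so a generator-based variant may be preferable there.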