I am trying to build a new ordered list of characters that respects the order within every row of the following array:
import numpy as np

array = np.array([['t', 'c', 'k', 's', 'x', 'f', 'b'],
                  ['t', 'c', 'l', 'u', 's', 'z', 'f'],
                  ['w', 't', 'l', 'u', 'k', 's', 'n']])
I would like my new list to look like ['w' 't' 'c' 'l' 'u' 'k' 's'.....]
My approach was to write a list comprehension:
myprev=set()
newalpha = [elem for row in array for elem in row if elem not in myprev and (myprev.add(elem) or True)]
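But the order is not respected in my result: in the third row, w appears before t, and t starts the first two rows. So I want w to stay at the beginning of the list rather than near the end, where it lands in my result: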
['t', 'c', 'k', 's', 'x', 'f', 'b', 'l', 'u', 'z', 'w', 'n']
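The mismatch described above can be checked mechanically. The following is a minimal sketch, assuming that comparing neighbouring characters in each row is enough to detect a broken order; the helper name `order_violations` is made up for illustration and does not come from the original post.

```python
def order_violations(order, rows):
    """Return (row_number, before, after) for every neighbouring pair in a
    row whose relative order is reversed in `order`."""
    position = {ch: i for i, ch in enumerate(order)}
    violations = []
    for row_number, row in enumerate(rows, start=1):
        # zip(row, row[1:]) walks over neighbouring pairs of the row
        for before, after in zip(row, row[1:]):
            if position[before] > position[after]:
                violations.append((row_number, before, after))
    return violations


rows = [['t', 'c', 'k', 's', 'x', 'f', 'b'],
        ['t', 'c', 'l', 'u', 's', 'z', 'f'],
        ['w', 't', 'l', 'u', 'k', 's', 'n']]

# the list produced by the "first seen" comprehension
first_seen_result = ['t', 'c', 'k', 's', 'x', 'f', 'b', 'l', 'u', 'z', 'w', 'n']

violations = order_violations(first_seen_result, rows)
print(violations)
```

The printed list includes the pair (3, 'w', 't') mentioned above, along with the other neighbouring pairs that the first-seen order gets wrong.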
Answer 0 (score: 1)
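I believe what is wanted is a single list in which the characters keep the relative order they have in every row.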
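As I said in the comments, this is not always possible (for example, array = [['a','b'],['b','a']]), and when it is possible, the ordering is not necessarily unique. In the code below, we break ties according to which row a character first appears in. If there is no solution at all, I make no guarantees about how this code behaves.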
import numpy as np

array = [['t', 'c', 'k', 's', 'x', 'f', 'b'],
         ['t', 'c', 'l', 'u', 's', 'z', 'f'],
         ['w', 't', 'l', 'u', 'k', 's', 'n']]

# create the list of distinct characters
# initially, this is sorted by the first row that the character is found in
# (and then by order within the row)
chset = set()
chars = [ch for row in array for ch in row if ch not in chset and (chset.add(ch) or True)]

# array of comparisons
# a 1 in position i, j means chars[i] comes before chars[j]
# a -1 in position i, j means chars[j] comes before chars[i]
# a 0 in position i, j means we don't know yet, or i == j
# we should end with the only zeros being on the diagonal
comparisons = np.zeros((len(chars), len(chars)))
for row in array:
    for i in range(len(row)):
        i_index = chars.index(row[i])
        for j in range(i + 1, len(row)):
            j_index = chars.index(row[j])
            comparisons[i_index, j_index] = 1
            comparisons[j_index, i_index] = -1

changes_made = True
while changes_made:
    changes_made = False
    # extend through transitivity:
    # if we know chars[i] is before chars[k] is before chars[j], then chars[i] is before chars[j]
    for i in range(len(chars)):
        for j in range(i + 1, len(chars)):
            if comparisons[i, j] == 0:
                for k in range(len(chars)):
                    if comparisons[i, k] == 1 and comparisons[k, j] == 1:
                        comparisons[i, j] = 1
                        comparisons[j, i] = -1
                        changes_made = True
                        break
                    elif comparisons[i, k] == -1 and comparisons[k, j] == -1:
                        comparisons[i, j] = -1
                        comparisons[j, i] = 1
                        changes_made = True
                        break
    if not changes_made:
        # we've extended transitively as much as we can
        # as a tiebreaker, use the first rows that chars[i] and chars[j] were found in
        # which is the order chars is currently in
        # only one undecided pair is settled per pass, so transitivity is re-applied before the next tie-break
        for i in range(len(chars)):
            for j in range(i + 1, len(chars)):
                if comparisons[i, j] == 0:
                    comparisons[i, j] = 1
                    comparisons[j, i] = -1
                    changes_made = True
                    break
            if changes_made:
                break

# convert the table of comparisons into a single number per character (its column sum):
# the first character has -1s everywhere in its column except the diagonal, so gets the lowest score (-11, since there are 12 characters total)
# the second character has -1s everywhere except in the row corresponding to the first character, so gets score -9
# etc
scores = np.sum(comparisons, axis=0)

# sort chars by score
result = [pair[1] for pair in sorted(enumerate(chars), key=lambda pair: scores[pair[0]])]
print(result)
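
The same constraints can also be treated as a topological sort. The following is a minimal sketch, not a rewrite of the code above: it swaps the comparison matrix for Kahn's algorithm, uses only the standard library, and assumes that neighbouring pairs in each row are enough to capture that row's order. The function name `merge_row_orders` is made up for illustration. Among characters whose order is not forced, it picks the one that was seen first, in the same spirit as the tie-break above, and it raises a ValueError for contradictory input such as [['a','b'],['b','a']].

```python
import heapq


def merge_row_orders(rows):
    """Return one ordering of all characters that respects every row.

    Hypothetical helper: Kahn's algorithm with a heap, so that ties go to
    the character that was seen first (row by row, left to right).
    """
    # rank of each character by first appearance; used only to break ties
    first_seen = {}
    for row in rows:
        for ch in row:
            first_seen.setdefault(ch, len(first_seen))

    # each neighbouring pair in a row gives one "comes before" edge
    successors = {ch: set() for ch in first_seen}
    for row in rows:
        for before, after in zip(row, row[1:]):
            successors[before].add(after)

    # number of distinct characters that must come directly before each one
    indegree = {ch: 0 for ch in first_seen}
    for ch in successors:
        for nxt in successors[ch]:
            indegree[nxt] += 1

    # characters with no unplaced predecessor, ordered by first appearance
    ready = [(first_seen[ch], ch) for ch in first_seen if indegree[ch] == 0]
    heapq.heapify(ready)

    order = []
    while ready:
        _, ch = heapq.heappop(ready)
        order.append(ch)
        for nxt in successors[ch]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                heapq.heappush(ready, (first_seen[nxt], nxt))

    # characters left unplaced are stuck in a cycle of constraints
    if len(order) != len(first_seen):
        raise ValueError("rows impose contradictory orderings")
    return order


rows = [['t', 'c', 'k', 's', 'x', 'f', 'b'],
        ['t', 'c', 'l', 'u', 's', 'z', 'f'],
        ['w', 't', 'l', 'u', 'k', 's', 'n']]

result = merge_row_orders(rows)
print(result)
# ['w', 't', 'c', 'l', 'u', 'k', 's', 'x', 'z', 'f', 'b', 'n']
```

The explicit error for a cycle is a design choice of this sketch; it covers the case for which the code above makes no guarantees.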