Question

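I'm looking for a way to generate a de Bruijn sequence iteratively rather than recursively. My goal is to generate it character by character.

I found some example code in Python for generating de Bruijn sequences and translated it to Rust. I don't yet understand the technique well enough to come up with my own method.

Translated to Rust: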

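However, this cannot generate the sequence iteratively: it goes through a whole tangle of recursion and iteration that cannot be unwound into a single state.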

Answer 1

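I'm not familiar with Rust, so I wrote and tested this in Python. Since the poster translated the version in the question from a Python program, I hope that won't be a big problem.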

# The following function treats list a as a
# k-adic number with n digits
# and increments this number, returning
# the index of the leftmost digit changed
def increment_a(a, k, n):
    digit = n - 1
    a[digit] += 1
    while a[digit] >= k and digit > 0:
        #a[digit]= 0
        a[digit] = a[0] + 1
        a[digit - 1] += 1
        digit -= 1
    return digit

# The following function adds a to the sequence
# and takes into account that the beginning of a
# could overlap with the end of sequence;
# in that case, it just removes the overlapping digits
# from a before adding the remaining digits to sequence
def append_to_sequence(sequence, a, n):
    # here we can safely assume that a
    # does not overlap completely with sequence[-n:]
    i = -1
    for i in range(n - 1, -1, -1):
        found = True
        # check if the last i digits in sequence
        # overlap with the first i digits in a
        for j in range(i):
            if a[j] != sequence[-i + j]:
                # no, they don't overlap
                found = False
                break
        if found:
            # yes they overlap, so no need to
            # continue the check with a smaller i
            break
    # now we can just append everything from
    # digit i (digits 0 to i-1 are swallowed)
    sequence.extend(a[i:])
    return n - i

# During the operation we have to keep track of
# the k-adic numbers a that already occurred in
# the sequence. We store them in a set called used.
# Every time we add something to the sequence
# we have to update it and add one entry for each
# digit inserted
def update_used(sequence, used, n, num_inserted):
    l = len(sequence)
    for i in range(num_inserted):
        used.add(tuple(sequence[-n - i:l - i]))

# The main work is done in the following function;
# it creates and returns the generated sequence
def gen4(k, n):
    a = [0] * n
    sequence = a[:]
    used = set()
    # create a fake sequence to add the segments obtained by the cyclic nature
    fake = [k - 1] * (n - 1)
    for i in range(n - 1):
        fake.append(0)
        update_used(fake, used, n, 1)
    update_used(sequence, used, n, 1)
    valid = True
    while valid:
        # a is still a valid k-adic number;
        # this means the generation process
        # has not ended,
        # so construct a new number from the n-1
        # last digits of sequence
        # followed by a zero
        a = sequence[-n + 1:]
        a.append(0)
        while valid and tuple(a) in used:
            # the constructed k-adic number a
            # was already used, so increment it
            # and try again
            increment_a(a, k, n)
            valid = a[0] < k
        if valid:
            # great, the number is still valid
            # and is not yet part of the sequence,
            # so add it after removing the overlapping
            # digits and update the set with the segments
            # we already used
            num_inserted = append_to_sequence(sequence, a, n)
            update_used(sequence, used, n, num_inserted)
    return sequence

I tested the code above by generating some sequences with the original version and with this one, using the same parameters. For every parameter set I tested, both versions produced the same result.
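Besides comparing two generators against each other, a result can also be checked directly. The following is a minimal sketch of such a check, assuming the sequence is a list of digits in range(k) and is meant to be read cyclically; the name is_de_bruijn is hypothetical and not part of the answer's code.

```python
def is_de_bruijn(sequence, k, n):
    """Return True if `sequence` is a cyclic de Bruijn sequence B(k, n).

    Hypothetical helper: every n-digit word over range(k) must occur
    exactly once as a cyclic window of the sequence.
    """
    length = k ** n
    if len(sequence) != length:
        return False
    if any(not 0 <= d < k for d in sequence):
        return False
    # Append the first n-1 digits so that windows wrap around the end.
    wrapped = list(sequence) + list(sequence[:n - 1])
    windows = {tuple(wrapped[i:i + n]) for i in range(length)}
    # k**n windows, all distinct, means each word occurs exactly once.
    return len(windows) == length
```

For example, is_de_bruijn([0, 0, 0, 1, 0, 1, 1, 1], 2, 3) holds, while a sequence that repeats a window is rejected.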

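Note that this code is less efficient than the original version, especially for long sequences. I suspect the cost of the set operations has a nonlinear effect on the running time.

If you like, you can improve it further, for example by using a more efficient way of storing the used segments. Instead of operating on the k-adic representation (the list a), you could use a multidimensional array.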
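One possible way to store the used segments more cheaply is to encode each n-digit window as a base-k integer and index a flat bytearray with it, which plays the role of the multidimensional array. The sketch below is not the code above: it is the classical prefer-smallest greedy construction written as a generator, and the name greedy_de_bruijn_digits is hypothetical. It assumes k**n bytes of memory are acceptable.

```python
def greedy_de_bruijn_digits(k, n):
    """Yield the digits of a de Bruijn sequence B(k, n) one at a time.

    Sketch of the prefer-smallest greedy construction. Windows are
    stored as base-k integers in a flat bytearray instead of a set of
    tuples (an assumption about what "more efficient storage" means).
    """
    high = k ** (n - 1)        # number of distinct (n-1)-digit states
    used = bytearray(k ** n)   # used[w] == 1 once window w has appeared
    state = high - 1           # seed: the n-1 digits (k-1)(k-1)...(k-1)
    while True:
        for d in range(k):
            window = state * k + d
            if not used[window]:
                used[window] = 1
                yield d
                # Drop the oldest digit to get the next (n-1)-digit state.
                state = window % high
                break
        else:
            # No digit gives an unused window: the sequence is complete.
            return
```

Because it is a generator, the caller can pull one digit at a time with next(); list(greedy_de_bruijn_digits(2, 3)) gives [0, 0, 0, 1, 0, 1, 1, 1].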

Answer 2

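Consider this approach:

1. Choose the first (lexicographically) representative from every necklace class.
   Here is Python code for generating the representatives of (binary) necklaces containing d ones (this can be repeated for all values of d). Sawada article link
2. Sort the representatives in lexicographic order.
3. Apply periodic reduction to every representative (where possible): if the string is periodic, s = p^m (like 010101), choose p (here 01).
   To find the period, you can use string doubling or the z-algorithm (I expect the latter to be faster in compiled languages).
4. Concatenate the reductions.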

Example for n = 3, k = 2:
Sorted representatives: 000, 001, 011, 111
Reductions: 0, 001, 011, 1
Result: 00010111
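The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it finds the necklace representatives by brute force over all k**n words rather than with the linked generator, it finds the period by string doubling, and all function names are hypothetical.

```python
from itertools import product

def smallest_rotation(word):
    """Return the lexicographically smallest rotation of a tuple."""
    return min(word[i:] + word[:i] for i in range(len(word)))

def periodic_reduction(word):
    """Return the shortest p with word == p * m, using string doubling."""
    n = len(word)
    doubled = word + word
    # The smallest shift at which word reappears in word + word is its period.
    for p in range(1, n + 1):
        if doubled[p:p + n] == word:
            return word[:p]

def de_bruijn_from_necklaces(k, n):
    """Concatenate the reduced necklace representatives in sorted order."""
    # Steps 1 and 2: one representative per necklace class, sorted.
    representatives = sorted(
        {smallest_rotation(w) for w in product(range(k), repeat=n)}
    )
    # Steps 3 and 4: reduce each representative and concatenate.
    sequence = []
    for rep in representatives:
        sequence.extend(periodic_reduction(rep))
    return sequence
```

For n = 3, k = 2 this reproduces the example: de_bruijn_from_necklaces(2, 3) returns [0, 0, 0, 1, 0, 1, 1, 1]. The brute-force enumeration is only practical for small k and n.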

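The same basic method (with C code) is described in chapter 18 of Jörg Arndt's book "Matters Computational".

A similar approach is mentioned in the wiki:

An alternative construction involves concatenating together, in lexicographic order, all the Lyndon words whose length divides n

You might look for an efficient way to generate the appropriate Lyndon words.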
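As one possible way to do that, the sketch below uses Duval's algorithm to enumerate Lyndon words of length at most n in lexicographic order and emits only those whose length divides n. The generator name lyndon_de_bruijn_digits is hypothetical. It keeps only the current word as state, so it uses O(n) memory and yields one digit at a time.

```python
def lyndon_de_bruijn_digits(k, n):
    """Yield the digits of B(k, n) by concatenating Lyndon words.

    Sketch based on Duval's algorithm: Lyndon words over range(k) of
    length <= n are produced in lexicographic order, and those whose
    length divides n are emitted digit by digit.
    """
    word = [-1]
    while word:
        # Incrementing the last digit gives the next Lyndon word.
        word[-1] += 1
        if n % len(word) == 0:
            yield from word
        # Extend periodically to length n ...
        m = len(word)
        while len(word) < n:
            word.append(word[-m])
        # ... then strip trailing maximal digits, which cannot be incremented.
        while word and word[-1] == k - 1:
            word.pop()
```

For n = 3, k = 2 the emitted words are 0, 001, 011, 1, so list(lyndon_de_bruijn_digits(2, 3)) gives [0, 0, 0, 1, 0, 1, 1, 1], matching the example above.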

How do I iteratively generate a de Bruijn sequence?