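I want to generate binary strings of length n = 128 with the property that any two such strings are at least Hamming distance d = 10 apart.
To do this, I tried to use an error-correcting code (ECC) with minimum distance d = 10. However, I could not find any ECC whose codewords are 128 bits long. If the codeword length (n) and d are a bit smaller or larger than 128 and 10, that still works for me.
Is there an ECC with this (or a similar) property? Is there a Python implementation?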
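Answer 0 (score: 1)
The Reed-Muller code RM(3,7) has a block length of 128 bits, a minimum distance of 16, and a message size of 64 bits.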
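The parameters of a binary Reed-Muller code RM(r, m) follow from standard formulas: block length 2^m, message size equal to the sum of the binomial coefficients C(m, 0) through C(m, r), and minimum distance 2^(m - r). The helper below is only a sketch for checking candidate codes against a target length and distance; the function name `reed_muller_params` is made up for this illustration.

```python
from math import comb

def reed_muller_params(r, m):
    """Return (n, k, d) for the binary Reed-Muller code RM(r, m)."""
    n = 2 ** m                                  # block length in bits
    k = sum(comb(m, i) for i in range(r + 1))   # message size in bits
    d = 2 ** (m - r)                            # minimum Hamming distance
    return n, k, d

print(reed_muller_params(3, 7))  # (128, 64, 16)
```

RM(4,7) would give a minimum distance of only 8, so RM(3,7) is the highest-rate code in this family with length 128 and distance of at least 10.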
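First, build a basis like this:
def popcnt(x):
    # Number of set bits in x
    return bin(x).count("1")

basis = []
# Monomial masks 0..127, ordered by degree (number of variables used)
by_ones = list(range(128))
by_ones.sort(key=popcnt)
for i in by_ones:
    count = popcnt(i)
    if count > 3:
        # Only monomials of degree <= 3 belong to RM(3,7)
        break
    if count <= 1:
        # All-ones row (i == 0) or a single-variable row (alternating bit pattern)
        basis.append(((1 << 128) - 1) // ((1 << i) | 1))
    else:
        # Degree 2 or 3: AND together the single-variable rows
        p = ((1 << 128) - 1)
        for b in [basis[k + 1] for k in range(7) if ((i >> k) & 1) != 0]:
            p = p & b
        basis.append(p)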
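A quick sanity check on the basis can be sketched as follows: it should contain 64 rows, the rows should be linearly independent over GF(2), and the lightest row should have weight 16. The wrapper `build_basis` and the helper `gf2_rank` are names introduced here for illustration; the construction itself is the one shown above.

```python
def popcnt(x):
    return bin(x).count("1")

def build_basis():
    """Basis of RM(3,7) as 128-bit integers, same construction as above."""
    full = (1 << 128) - 1
    basis = []
    for i in sorted(range(128), key=popcnt):
        count = popcnt(i)
        if count > 3:
            break
        if count <= 1:
            basis.append(full // ((1 << i) | 1))
        else:
            p = full
            for k in range(7):
                if (i >> k) & 1:
                    p &= basis[k + 1]
            basis.append(p)
    return basis

def gf2_rank(rows):
    """Rank over GF(2) of integers viewed as bit vectors (XOR elimination)."""
    pivots = {}  # highest set bit -> reduced row with that leading bit
    for row in rows:
        while row:
            high = row.bit_length() - 1
            if high not in pivots:
                pivots[high] = row
                break
            row ^= pivots[high]  # cancel the leading bit and keep reducing
    return len(pivots)

basis = build_basis()
print(len(basis), gf2_rank(basis), min(popcnt(b) for b in basis))  # 64 64 16
```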
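You can then use any linear combination of them, created by XORing a subset of the basis rows, for example:
def encode(x, basis):
    # requires x < (1 << 64)
    r = 0
    for i in range(len(basis)):
        # Bit i of the message selects row i of the basis
        if ((x >> i) & 1) != 0:
            r = r ^ basis[i]
    return r
In some other implementations I found this done by taking dot products with the columns of the basis matrix and then reducing modulo 2. I don't know why they do it that way; it seems much more direct to sum (XOR) a subset of the rows.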
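The two formulations give the same codeword, which the following sketch checks on a few random messages while also confirming that the resulting codewords are at least 16 bits apart. The names `build_basis` and `encode_dot`, the fixed seed, and the sample size of 20 are choices made for this illustration only.

```python
import random
from itertools import combinations

def popcnt(x):
    return bin(x).count("1")

def build_basis():
    """Basis of RM(3,7) as 128-bit integers, same construction as above."""
    full = (1 << 128) - 1
    basis = []
    for i in sorted(range(128), key=popcnt):
        count = popcnt(i)
        if count > 3:
            break
        if count <= 1:
            basis.append(full // ((1 << i) | 1))
        else:
            p = full
            for k in range(7):
                if (i >> k) & 1:
                    p &= basis[k + 1]
            basis.append(p)
    return basis

def encode(x, basis):
    """XOR together the basis rows selected by the bits of x."""
    r = 0
    for i, row in enumerate(basis):
        if (x >> i) & 1:
            r ^= row
    return r

def encode_dot(x, basis, n=128):
    """Same codeword via one dot product per column, reduced modulo 2."""
    msg = [(x >> i) & 1 for i in range(len(basis))]
    r = 0
    for j in range(n):
        column = [(row >> j) & 1 for row in basis]
        bit = sum(m * c for m, c in zip(msg, column)) % 2
        r |= bit << j
    return r

basis = build_basis()
rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
msgs = list({rng.getrandbits(64) for _ in range(20)})  # distinct messages
codewords = [encode(x, basis) for x in msgs]

same = all(encode_dot(x, basis) == c for x, c in zip(msgs, codewords))
min_dist = min(popcnt(a ^ b) for a, b in combinations(codewords, 2))
print(same, min_dist >= 16)  # True True
```

The XOR version touches each basis row at most once, whereas the column-wise version does 128 separate dot products, which is why the row-subset form is the more direct one when rows are stored as integers.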
Answer 1 (score: 0)
I needed exactly the same thing. For me the naive approach worked very well! Just generate random bit strings and check the Hamming distances between them, gradually building up a list of strings that satisfy the requirement:
import numpy as np

def random_binary_array(width):
    """Generate random binary array of specific width"""
    # You can enforce additional array level constraints here
    return np.random.randint(2, size=width)

def hamming2(s1, s2):
    """Calculate the Hamming distance between two bit arrays"""
    assert len(s1) == len(s2)
    # return sum(c1 != c2 for c1, c2 in zip(s1, s2)) # Wikipedia solution
    return np.count_nonzero(s1 != s2) # a faster solution

def generate_hamm_arrays(n_values, size, min_hamming_dist=5):
    """
    Generate a list of binary arrays ensuring minimal hamming distance between the arrays.
    """
    hamm_list = []
    while len(hamm_list) < size:
        test_candidate = random_binary_array(n_values)
        valid = True
        for word in hamm_list:
            # Note: `<=` also rejects pairs at exactly min_hamming_dist, so accepted
            # arrays differ in more than min_hamming_dist positions
            if (word == test_candidate).all() or hamming2(word, test_candidate) <= min_hamming_dist:
                valid = False
                break
        if valid:
            hamm_list.append(test_candidate)
    return np.array(hamm_list)

print(generate_hamm_arrays(16, 10))
Output:
[[0 0 1 1 0 1 1 1 0 1 0 1 1 1 1 1]
[1 0 1 0 0 1 0 0 0 1 0 0 1 0 1 1]
[1 1 0 0 0 0 1 0 0 0 1 1 1 1 0 0]
[1 0 0 1 1 0 0 1 1 0 0 1 1 1 0 1]
[0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 1]
[1 1 0 0 0 0 0 1 0 1 1 1 0 1 1 1]
[1 1 0 1 0 1 0 1 1 1 1 0 0 1 0 0]
[0 1 1 1 1 1 1 0 0 0 1 1 0 0 0 0]
[1 1 0 0 0 0 1 1 1 0 0 1 0 0 0 1]
[0 1 0 1 1 0 1 1 1 1 1 1 1 1 1 0]]
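It is also not too slow, as long as you don't want a very dense list of strings (few bits per string combined with a large Hamming distance). Your specification (128-bit strings with Hamming distance 10) is no problem: the call below generates 1000 bit strings (10 runs of 100 strings each) in about 0.2 seconds on a very weak CPU: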
import timeit
timeit.timeit(lambda: generate_hamm_arrays(n_values=128, size=100, min_hamming_dist=10), number=10)
>> 0.19202665984630585
I hope this solution is sufficient for you as well.
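The same rejection-sampling idea can be sketched with plain Python integers instead of NumPy arrays, together with a check that the finished list really meets the distance requirement. This is a variant written for illustration, not the code from the answer: the names `generate_codes` and `min_pairwise_distance` and the `seed` parameter are assumptions, and unlike the answer's `<=` test it accepts pairs at exactly `min_dist`.

```python
import random
from itertools import combinations

def hamming(a, b):
    """Hamming distance between two integers viewed as bit strings."""
    return bin(a ^ b).count("1")

def generate_codes(n_bits, size, min_dist, seed=None):
    """Collect random n_bits-wide integers that are pairwise >= min_dist apart."""
    rng = random.Random(seed)
    codes = []
    while len(codes) < size:
        candidate = rng.getrandbits(n_bits)
        # Keep the candidate only if it is far enough from every accepted code
        if all(hamming(candidate, c) >= min_dist for c in codes):
            codes.append(candidate)
    return codes

def min_pairwise_distance(codes):
    """Smallest Hamming distance over all pairs; needs at least two codes."""
    return min(hamming(a, b) for a, b in combinations(codes, 2))

codes = generate_codes(n_bits=128, size=100, min_dist=10, seed=0)
print(len(codes), min_pairwise_distance(codes) >= 10)  # 100 True
print(format(codes[0], "0128b"))  # one code as a 128-character bit string
```

As with the NumPy version, the loop can take very long or never finish if the requested list is too dense for the chosen length and distance.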
Answer 2 (score: 0)
My O(n * n!) solution (works in reasonable time for N < 14):
import numpy as np
from tqdm import tqdm

def hammingDistance(n1, n2):
    return bin(np.bitwise_xor(n1, n2)).count("1")

N = 10 # binary code of length N
D = 6 # with minimum distance D
M = 2**N # number of unique codes in general

# construct hamming distance matrix
A = np.zeros((M, M), dtype=int)
for i in range(M):
    for j in range(i+1, M):
        A[i, j] = hammingDistance(i, j)
# Mirror the upper triangle to make the matrix symmetric
A = A + A.T

def recursivly_find_legit_numbers(nums, codes=None):
    # Greedy step: keep the first candidate, drop every candidate that is
    # closer than D to it, then recurse on the remaining candidates.
    if codes is None:
        codes = set()
    codes_to_probe = nums
    for num1 in nums:
        codes.add(num1)
        codes_to_probe = codes_to_probe - {num1}
        for num2 in nums - {num1}:
            if A[num1, num2] < D:
                # Distance isn't sufficient, remove this number from set
                codes_to_probe = codes_to_probe - {num2}
        if len(codes_to_probe):
            recursivly_find_legit_numbers(codes_to_probe, codes)
        # Return after handling the first element; the recursion covers the rest
        return codes

group_of_codes = {}
for i in tqdm(range(M)):
    # Candidates: numbers larger than i that are at least D away from i
    satisfying_numbers = np.where(A[i] >= D)[0]
    satisfying_numbers = satisfying_numbers[satisfying_numbers > i]
    nums = set(satisfying_numbers)
    if len(nums) == 0:
        continue
    group_of_codes[i] = recursivly_find_legit_numbers(nums, set())
    group_of_codes[i].add(i)

# Pick the starting number whose group is largest
largest_group = 0
for i, nums in group_of_codes.items():
    if len(nums) > largest_group:
        largest_group = len(nums)
        ind = i

print(f"largest group for N={N} and D={D}: {largest_group}")
print("Number of unique groups:", len(group_of_codes))
largest group for N=10 and D=6: 6
Number of unique groups: 992
# generate largest group of codes
[format(num, f"0{N}b") for num in group_of_codes[ind]]
['0110100001', '0001000010', '1100001100', '1010010111', '1111111010', '0001111101']
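The group printed above can be checked independently of the search code. This small sketch recomputes every pairwise Hamming distance between the six strings; the helper name `hamming` is introduced here for illustration.

```python
from itertools import combinations

codes = ['0110100001', '0001000010', '1100001100',
         '1010010111', '1111111010', '0001111101']

def hamming(s1, s2):
    """Hamming distance between two equal-length bit strings."""
    assert len(s1) == len(s2)
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))

# All 15 unordered pairs of the 6 codewords
distances = [hamming(a, b) for a, b in combinations(codes, 2)]
print(min(distances), max(distances))  # 6 6
```

Every pair is exactly 6 bits apart, so the group satisfies D = 6 with no slack.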