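Generator function for a randomly distributed bijective mapping, for string generation

Time: 2018-06-27 20:50:43

Tags: python random mapping generator

I'm trying to create a Python generator function that yields a bijective (injective and surjective) mapping between the consecutive numbers from 1 to 38^5 and randomly distributed products of alphanumerics (length 5, drawn from 28 lowercase letters + 10 digits).
For example, something like this: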

1 -> 4fde6
2 -> grt74
3 -> g7w33
...

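Is there a module for this?
Any ideas for an algorithm?


Edit: I want the mapping to be:

  • implementable with a generator (not taking up a lot of memory)
  • constant across different runs of the code
  • as uniformly distributed as possible

So, in one sentence, I want a: Uniformly distributed constant bijective mapping between indexes and products of n alpha-numerics

Thanks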

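1 Answer:

Answer 0: (score: 1)

OK, here is a bijective mapping from an index to a 5-byte string and back, based on a Linear Congruential Generator (LCG). I used the constants from drand48, which seem to work fine for a 40-bit LCG: with a seed of 1 I tested all 2^40 values, and the full range was exhausted.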
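
As a side note on why the full range is expected to be covered: for a power-of-two modulus, the Hull-Dobell theorem says an LCG has full period when the increment is odd and the multiplier is congruent to 1 modulo 4. The sketch below checks those two conditions for the constants used here and, as a sanity check, walks the whole cycle for a reduced 16-bit modulus. The function names `satisfies_hull_dobell_pow2` and `cycle_length` are illustrative and not part of the answer's code.

```python
MULT = 25214903917  # drand48 multiplier, as used in the answer
INCR = 11           # drand48 increment, as used in the answer

def satisfies_hull_dobell_pow2(mult: int, incr: int) -> bool:
    """Full-period conditions for an LCG whose modulus is a power of two (>= 4)."""
    return incr % 2 == 1 and mult % 4 == 1

def cycle_length(mult: int, incr: int, bits: int, seed: int = 1) -> int:
    """Count the LCG steps needed to return to `seed` modulo 2**bits."""
    mask = (1 << bits) - 1
    x = (mult * seed + incr) & mask
    steps = 1
    while x != seed:
        x = (mult * x + incr) & mask
        steps += 1
    return steps

print(satisfies_hull_dobell_pow2(MULT, INCR))
# Exhaustive walk for a small modulus only; 2**40 steps would take far longer.
print(cycle_length(MULT, INCR, bits=16) == 1 << 16)
```

The same two conditions hold for any power-of-two modulus, which is consistent with the exhaustive 40-bit test described above.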

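The code below relies on unsigned 64-bit math with masked overflow, hence the heavy use of NumPy. Packing and unpacking are done in a fairly crude way, and a lot of things could be improved. Don't hesitate to ask questions.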

import numpy as np

# unsigned constants
ZERO  = np.uint64(0)
ONE   = np.uint64(1)

# signed constants
SZERO = np.int64(0)
SONE  = np.int64(1)
MONE  = np.int64(-1)

# LCG parameters
bits = np.uint64(40)
mult = np.uint64(25214903917)
incr = np.uint64(11)

mod  = np.uint64( np.left_shift(ONE, bits) )
mask = np.uint64(mod - ONE)

def rang(seed: np.uint64) -> np.uint64:
    """
    LCG mapping from one 40bit integer to another
    """
    return np.uint64(np.bitwise_and(mult*np.uint64(seed) + incr, mask))

def compute_nskip(nskp: np.int64) -> np.int64:
    nskip: np.int64 = nskp

    while nskip < SZERO:
        t: np.uint64 = np.uint64(nskip) + mod
        nskip = np.int64(t)

    return np.int64( np.bitwise_and(np.uint64(nskip), mask) )

def skip(nskp: np.int64, seed: np.uint64) -> np.uint64: # with nskp = -1 this is the inverse mapping
    """
    Jump from given seed by number of skips
    """
    nskip: np.int64 = compute_nskip(nskp)

    m: np.uint64 = mult # original multiplicative constant
    c: np.uint64 = incr # original additive constant

    m_next: np.uint64 = ONE  #  new effective multiplicative constant
    c_next: np.uint64 = ZERO # new effective additive constant

    while nskip > SZERO:
        if np.bitwise_and(nskip, SONE) != SZERO: # check least significant bit for being 1
            m_next = np.bitwise_and(m_next * m, mask)
            c_next = np.bitwise_and(c_next * m + c, mask)

        c = np.bitwise_and((m + ONE) * c, mask)
        m = np.bitwise_and(m * m, mask)

        nskip = np.right_shift(nskip, SONE) # shift right, dropping least significant bit

    # with m_next and c_next, we can now find the new seed
    return np.bitwise_and(m_next * seed + c_next, mask)

def index2bytes(i: np.uint64) -> bytes:
    bbb: np.uint64 = rang(i)

    rc = bytearray()
    rc.append( np.uint8( np.bitwise_and(bbb, np.uint64(0xFF)) ) )
    bbb = np.right_shift(bbb, np.uint64(8))
    rc.append( np.uint8( np.bitwise_and(bbb, np.uint64(0xFF)) ) )
    bbb = np.right_shift(bbb, np.uint64(8))
    rc.append( np.uint8( np.bitwise_and(bbb, np.uint64(0xFF)) ) )
    bbb = np.right_shift(bbb, np.uint64(8))
    rc.append( np.uint8( np.bitwise_and(bbb, np.uint64(0xFF)) ) )
    bbb = np.right_shift(bbb, np.uint64(8))
    rc.append( np.uint8( np.bitwise_and(bbb, np.uint64(0xFF)) ) )

    return bytes(rc) # convert so the result matches the bytes annotation

def bytes2index(a: bytes) -> np.uint64:
    seed: np.uint64 = ZERO
    seed += np.left_shift( np.uint64(a[0]), np.uint64(0))
    seed += np.left_shift( np.uint64(a[1]), np.uint64(8))
    seed += np.left_shift( np.uint64(a[2]), np.uint64(16))
    seed += np.left_shift( np.uint64(a[3]), np.uint64(24))
    seed += np.left_shift( np.uint64(a[4]), np.uint64(32))

    return skip(MONE, seed)

# main part, silence overflow warnings first
# (np.warnings was removed in NumPy 1.24, so use the standard warnings module)
import warnings
warnings.filterwarnings('ignore')

bbb = index2bytes(ONE)
print(bbb)
idx = bytes2index(bbb)
print(idx)

bbb = index2bytes(999999)
print(bbb)
idx = bytes2index(bbb)
print(idx)

bbb = b'\xa4\x3c\xb1\xfc\x79'
idx = bytes2index(bbb)
print(idx)
bbb = index2bytes(idx)
print(bbb)
print(bbb == b'\xa4\x3c\xb1\xfc\x79')
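
The code above maps indexes to raw 5-byte strings. To get closer to the alphanumeric strings asked about in the question, the same idea might be adapted as in the sketch below. This is only a sketch under several assumptions: it uses plain Python integers instead of NumPy, a 36-symbol alphabet (`string.ascii_lowercase + string.digits`) rather than the 38 symbols mentioned in the question, and the hypothetical constants `MULT` and `INCR`. It also inverts the affine step with a modular inverse instead of the `skip(MONE, seed)` jump. A single step `x -> MULT * x + INCR (mod MOD)` is a bijection whenever `MULT` is coprime to `MOD`, which the assertion checks.

```python
import string
from math import gcd

ALPHABET = string.ascii_lowercase + string.digits  # assumption: 36 symbols
LENGTH = 5
MOD = len(ALPHABET) ** LENGTH
MULT = 1_000_003  # hypothetical multiplier; must be coprime to MOD
INCR = 11         # hypothetical increment
assert gcd(MULT, MOD) == 1
MULT_INV = pow(MULT, -1, MOD)  # modular inverse, needs Python 3.8+

def index_to_string(i: int) -> str:
    """Scramble index i (0 <= i < MOD) and write it in base len(ALPHABET)."""
    x = (MULT * i + INCR) % MOD
    chars = []
    for _ in range(LENGTH):
        x, digit = divmod(x, len(ALPHABET))
        chars.append(ALPHABET[digit])  # least significant digit first
    return "".join(chars)

def string_to_index(s: str) -> int:
    """Inverse of index_to_string."""
    x = 0
    for ch in reversed(s):
        x = x * len(ALPHABET) + ALPHABET.index(ch)
    return ((x - INCR) * MULT_INV) % MOD

def mapping(start: int = 0, stop: int = MOD):
    """Lazily yield (index, string) pairs without storing the whole table."""
    for i in range(start, stop):
        yield i, index_to_string(i)

for i, s in mapping(0, 3):
    print(i, s, string_to_index(s))
```

Indexes here are 0-based; for the 1-based numbering in the question, pass `i - 1`. Because the constants are fixed, the mapping stays the same across runs. One caveat of any single affine step is that the first character depends only on the index modulo the alphabet size, so the output is less random-looking than a proper shuffle.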