Converting from a string to a number

Time: 2017-10-15 03:39:32

Tags: python string base number-theory

So, I am trying to write a program that decodes a 6-character base-64 number.

Here is the problem statement:

  

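Given a 6-character string s, return the 36-bit number it represents as a base-64 number in reverse order (least significant digit first), where the 64 digits are, in order: 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+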

i.e.

decode('000000') → 0

decode('gR1iC9') → 9876543210

decode('++++++') → 68719476735
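As a worked illustration of the problem statement: because the digits are stored in reverse order, the character at position i contributes its digit value times 64**i. The sketch below spells that out directly. It deliberately uses a lookup string for brevity, and the names ALPHABET and decode_by_powers are illustrative, not part of the original problem.

```python
# The 64 digits in the order given by the problem statement.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+"

def decode_by_powers(s):
    """Sum digit * 64**position, with position 0 being the first character."""
    return sum(ALPHABET.index(c) * 64 ** i for i, c in enumerate(s))

# 'gR1iC9' has digit values 42, 27, 1, 44, 12, 9 at positions 0..5.
print([ALPHABET.index(c) for c in "gR1iC9"])  # [42, 27, 1, 44, 12, 9]
print(decode_by_powers("gR1iC9"))             # 9876543210
```

The largest value, '++++++', is 64**6 - 1 = 68719476735, which is why the result fits in 36 bits.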

I want to do this without strings.

The easiest way would be to create the inverse of the following functions:

def get_digit(d):
    ''' Convert a base 64 digit to the desired character '''
    if 0 <= d <= 9:
        # 0 - 9
        c = 48 + d
    elif 10 <= d <= 35:
        # A - Z
        c = 55 + d
    elif 36 <= d <= 61:
        # a - z
        c = 61 + d
    elif d == 62:
        # -
        c = 45
    elif d == 63:
        # +
        c = 43
    else:
        # We should never get here
        raise ValueError('Invalid digit for base 64: ' + str(d)) 
    return chr(c)

# Test `get_digit`
print(''.join([get_digit(d) for d in range(64)]))

def encode(n):
    ''' Convert integer n to base 64 '''
    out = []
    while n:
        n, r = n // 64, n % 64
        out.append(get_digit(r))
    while len(out) < 6:
        out.append('0')
    return ''.join(out)

# Test `encode`
for i in (0, 9876543210, 68719476735):
    print(i, encode(i))

Output

0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+
0 000000
9876543210 gR1iC9
68719476735 ++++++

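This code is actually by PM 2Ring, from this page.

How do I write the inverse of this program?

A start:

As mentioned above, the inverse of get_digit would be as follows: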

def inv_get_digit(c):

    if 0 <= c <= 9:
        d = ord(c) - 48
    elif 'A' <= c <= 'Z':
        d = ord(c) - 55
    elif 'a' <= c <= 'z'
        d = ord(c) - 61
    elif c == '+':
        d = 63
    elif c == '-':
        d = 62
    else:
        raise ValueError('Invalid Input' + str(c))
    return d


def decode(n):

    out = []
    while n:
        n, r= n % 10, n ** (6-len(str))
        out.append(get_digit(r))
    while len(out) < 10:
        out.append('0')
    return ''.join(out)

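2 answers:

Answer 0 (score: 1)

Here is a program that combines my old code with some new code to perform the inverse operation.

There is a syntax error in your inv_get_digit function: you left the colon off the end of one of the elif lines. Also, there is no need to call str(c), since c is already a string.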
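One further difference between the two versions of inv_get_digit is the first test: the question compares the character against the integers 0 and 9, while the corrected version compares against the characters '0' and '9'. The sketch below shows why that matters, assuming Python 3 (where ordering comparisons between str and int raise TypeError). The helper names are hypothetical and exist only for this illustration.

```python
def in_digit_range_broken(c):
    """Mimics the question's first test: a str compared with ints."""
    return 0 <= c <= 9

def in_digit_range_fixed(c):
    """Compares characters with characters, as the corrected version does."""
    return '0' <= c <= '9'

try:
    in_digit_range_broken('5')
except TypeError as exc:
    # Python 3 refuses to order an int against a str.
    print('TypeError:', exc)

print(in_digit_range_fixed('5'))  # True
print(in_digit_range_fixed('A'))  # False
```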

I'm afraid your decode function doesn't make much sense. It should take a string as input and return an integer. See the working version below.

def get_digit(d):
    ''' Convert a base 64 digit to the desired character '''
    if 0 <= d <= 9:
        # 0 - 9
        c = 48 + d
    elif 10 <= d <= 35:
        # A - Z
        c = 55 + d
    elif 36 <= d <= 61:
        # a - z
        c = 61 + d
    elif d == 62:
        # -
        c = 45
    elif d == 63:
        # +
        c = 43
    else:
        # We should never get here
        raise ValueError('Invalid digit for base 64: ' + str(d)) 
    return chr(c)

print('Testing get_digit') 
digits = ''.join([get_digit(d) for d in range(64)])
print(digits)

def inv_get_digit(c):
    if '0' <= c <= '9':
        d = ord(c) - 48
    elif 'A' <= c <= 'Z':
        d = ord(c) - 55
    elif 'a' <= c <= 'z':
        d = ord(c) - 61
    elif c == '-':
        d = 62
    elif c == '+':
        d = 63
    else:
        raise ValueError('Invalid input: ' + c)
    return d

print('\nTesting inv_get_digit') 
nums = [inv_get_digit(c) for c in digits]
print(nums == list(range(64)))

def encode(n):
    ''' Convert integer n to base 64 '''
    out = []
    while n:
        n, r = n // 64, n % 64
        out.append(get_digit(r))
    while len(out) < 6:
        out.append('0')
    return ''.join(out)

print('\nTesting encode')
numdata = (0, 9876543210, 68719476735)
strdata = []
for i in numdata:
    s = encode(i)
    print(i, s)
    strdata.append(s)

def decode(s):
    n = 0
    for c in reversed(s):
        d = inv_get_digit(c)
        n = 64 * n + d
    return n

print('\nTesting decode')
for s, oldn in zip(strdata, numdata):
    n = decode(s)
    print(s, n, n == oldn)

Output

Testing get_digit
0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+

Testing inv_get_digit
True

Testing encode
0 000000
9876543210 gR1iC9
68719476735 ++++++

Testing decode
000000 0 True
gR1iC9 9876543210 True
++++++ 68719476735 True
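The loop in decode above processes the most significant digit first (hence reversed(s)) and repeatedly computes n = 64 * n + d, which is usually called Horner's method. To make the intermediate values visible, here is a small tracing sketch. It uses a lookup string instead of inv_get_digit to stay short, and the names ALPHABET and decode_trace are illustrative only.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz-+"

def decode_trace(s):
    """Decode s and also return (character, digit, running total) per step."""
    n = 0
    steps = []
    for c in reversed(s):          # most significant digit comes last in s
        d = ALPHABET.index(c)
        n = 64 * n + d             # shift left by one base-64 digit, add d
        steps.append((c, d, n))
    return n, steps

n, steps = decode_trace("gR1iC9")
for step in steps:
    print(step)
# ('9', 9, 9)
# ('C', 12, 588)
# ('i', 44, 37676)
# ('1', 1, 2411265)
# ('R', 27, 154320987)
# ('g', 42, 9876543210)
```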

Answer 1 (score: 0)

I want to do this without strings.

First, you need to work out what that means. The working encoder you provided uses these strings:

out.append('0')
return ''.join(out)

The accepted solution adds these strings:

digits = ''.join([get_digit(d) for d in range(64)])
if '0' <= c <= '9':
elif 'A' <= c <= 'Z':
elif 'a' <= c <= 'z':
elif c == '-':
elif c == '+':

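Are you saying that single-character strings are acceptable but multi-character strings are not? Or are you saying that you don't want to use str as a data structure and want to minimize string operations?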
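If the second reading is the intended one, one possible interpretation (a sketch only, not something either answer proposes) is to decode from a bytes object and work purely with integer code points, so that no str values or string comparisons appear in the decoding logic. The names digit_from_code and decode_bytes are hypothetical.

```python
def digit_from_code(code):
    """Map an ASCII code point (an int) to its base-64 digit value."""
    if 48 <= code <= 57:       # '0'-'9'
        return code - 48
    if 65 <= code <= 90:       # 'A'-'Z'
        return code - 55
    if 97 <= code <= 122:      # 'a'-'z'
        return code - 61
    if code == 45:             # '-'
        return 62
    if code == 43:             # '+'
        return 63
    raise ValueError('Invalid code point for base 64: %d' % code)

def decode_bytes(data):
    """Decode reversed base-64 digits supplied as a bytes object."""
    number = 0
    for code in reversed(data):    # iterating over bytes yields ints
        number = 64 * number + digit_from_code(code)
    return number

print(decode_bytes(b'gR1iC9'))  # 9876543210
```

The offsets 48, 55 and 61 are the same constants used by get_digit in the question, applied in the opposite direction.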

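I feel that your solution, and the accepted solution that builds on it, do too much work when encoding and decoding. I suggest doing some work up front to build data structures, so that less work is needed when processing the data: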

from string import digits, ascii_lowercase, ascii_uppercase

BASE10_TO_BASE64 = list(digits + ascii_uppercase + ascii_lowercase + '-' + '+')

BASE64_TO_BASE10 = {base64: base10 for base10, base64 in enumerate(BASE10_TO_BASE64)}

ZEROS = ['0'] * 6

def encode(number):
    ''' Convert base 10 int to reversed base 64 str '''

    characters = []

    while number:
        number, remainder = divmod(number, 64)
        characters.append(BASE10_TO_BASE64[remainder])

    return ''.join(characters + ZEROS[:max(len(ZEROS) - len(characters), 0)])

def decode(string):
    ''' Convert reversed base 64 str to base 10 int '''

    number = 0

    for character in string[::-1]:
        digit = BASE64_TO_BASE10[character]
        number = 64 * number + digit

    return number

if __name__ == "__main__":

    NUMBERS = (4096, 9876543210, 68719476735)
    strings = []

    print("Encode:")
    for number in NUMBERS:
        string = encode(number)
        print(number, string)
        strings.append(string)

    print("\nDecode:")
    for string in strings:
        number = decode(string)
        print(string, number)

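The reversed base-64 digit order and the zero padding complicate the program without adding anything. Normally we would expect '100' to represent the square of the base, but that is not the case here.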

Output

> python3 test.py
Encode:
4096 001000
9876543210 gR1iC9
68719476735 ++++++

Decode:
001000 4096
gR1iC9 9876543210
++++++ 68719476735
>
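To make the remark about '100' concrete, the sketch below contrasts the reversed scheme used here with a conventional most-significant-digit-first reading. The function decode_conventional and the names ALPHABET and VALUE are hypothetical and exist only for comparison; decode follows the same logic as the answer's decode.

```python
from string import digits, ascii_uppercase, ascii_lowercase

ALPHABET = digits + ascii_uppercase + ascii_lowercase + '-+'
VALUE = {ch: i for i, ch in enumerate(ALPHABET)}

def decode(string):
    """Reversed scheme from the question: first character is least significant."""
    number = 0
    for ch in reversed(string):
        number = 64 * number + VALUE[ch]
    return number

def decode_conventional(string):
    """Conventional scheme: first character is most significant."""
    number = 0
    for ch in string:
        number = 64 * number + VALUE[ch]
    return number

print(decode('100000'))            # 1
print(decode('001000'))            # 4096, i.e. 64 squared
print(decode_conventional('100'))  # 4096
```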