python: how to extract consecutive slices of varying length from a string that has no spaces

Time: 2016-10-25 16:10:06

Tags: python python-3.x

For example, with the string 789101112:

i = 1 
j = 0 
x = '789101112'  
while i < 3:
    while j < len(x):
        m = int(x[j: j+i])
        n = int(x[j+i: j+i+i])
        if n - m ==1:
            print(m, n)
            j +=i

The output is:

7 8
8 9

The output I want is:

7 8
8 9
9 10
10 11
11 12

Starting from my code, what do I need to change?

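2 answers:

Answer 0: (score: 0)

I had a rough idea of the algorithm, but writing the sample program took me longer than I expected.

Note: I assume that at most 1 consecutive number is missing at any point in the series. You can modify the code and add a loop to allow an unlimited number of missing numbers, or any other limit you like.

It is past midnight, so the code has not been thoroughly tested. Here it is: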

def get_first_num(string):
    first_num=0
    for i in range(int(len(string)/2)):
        if string.startswith(string[0:i+1]+str(int(string[0:i+1])+1)) or string.startswith(string[0:i+1]+str(int(string[0:i+1])+2)):    #max difference between any 2 consecutive numbers must be 1 or 2 i.e only 1 missing number in sequence is allowed
            assumed_first_num, temp, count, flag=int(string[0:i+1]), string, 0, False

            temp=temp.replace(str(assumed_first_num), '', 1)
            count += 1

            for _ in range(100):
                changed = False
                if(temp.startswith(str(assumed_first_num+1)) or temp.startswith(str(assumed_first_num+2))):
                    next_assumed_first_num=assumed_first_num+1 if temp.startswith(str(assumed_first_num+1)) else assumed_first_num+2
                    temp=temp.replace((str(assumed_first_num+1) if temp.startswith(str(assumed_first_num+1)) else str(assumed_first_num+2)), '', 1)
                    assumed_first_num, changed, count=next_assumed_first_num, True, count+1
                if len(temp) == 0:
                    flag=True
                    break
                if not changed:
                    flag=False
                    break
            if(flag):
                first_num=int(string[0:i+1])
                break
            else:
                continue
    return first_num

test_strings=["789101112", "910111213", "91112131415", "1214161820", "891089118912", "890892893894", "123451234712348", "1234567123456812345691234570"]
for string in test_strings:
    print("First number "+str(get_first_num(string))+" determined from string: "+string)

The output of the above program is:

$ python3 script.py 
First number 7 determined from string: 789101112
First number 9 determined from string: 910111213
First number 9 determined from string: 91112131415
First number 12 determined from string: 1214161820
First number 8910 determined from string: 891089118912
First number 890 determined from string: 890892893894
First number 12345 determined from string: 123451234712348
First number 1234567 determined from string: 1234567123456812345691234570

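The main challenge and the tricky part of the problem is determining the first number in the series, so for now I have only written the function that determines the first number. I am going to sleep now, but I will extend the program to determine the missing numbers in the sequence, which should not take much time by comparison.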

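The 100 in the innermost for loop caps the consistency check: at most 100 elements of the given series are sampled to confirm that it is consistent. The count variable in that loop tracks how many elements have been matched so far.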
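The extension mentioned above, finding the missing numbers once the first number is known, could look roughly like the sketch below. The names find_missing and max_gap are made up for illustration and are not part of the original program. It assumes the first number has already been determined (for example with get_first_num) and greedily tries the smallest step first.

```python
def find_missing(string, first_num, max_gap=1):
    """Walk the digit string from first_num and collect the skipped numbers.

    Assumptions (not from the original answer): first_num is already known,
    and at most max_gap numbers are skipped between two neighbours.
    Returns None if the string cannot be parsed under these assumptions.
    """
    prefix = str(first_num)
    if not string.startswith(prefix):
        return None
    pos, current, missing = len(prefix), first_num, []
    while pos < len(string):
        # Try the smallest step first: +1 means nothing was skipped.
        for step in range(1, max_gap + 2):
            candidate = str(current + step)
            if string.startswith(candidate, pos):
                # Every number strictly between the neighbours is missing.
                missing.extend(range(current + 1, current + step))
                current += step
                pos += len(candidate)
                break
        else:
            return None  # no allowed step matches at this position
    return missing


print(find_missing("91112131415", 9))
print(find_missing("1214161820", 12))
```

Raising max_gap is one way to relax the "only 1 missing number" assumption noted earlier, at the cost of more ambiguous parses.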

Answer 1: (score: 0)

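I think the problem with your code is the last line, j += i: i never increases, so it is always 1. Also, partway through the sequence, for example going from 9 to 10, the numbers suddenly become 2 digits long, so you also need to adjust the width used for the next number and check that it still matches.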
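To make those two points concrete, here is a minimal sketch that stays close to the loop in the question. The function name consecutive_pairs and the width parameter are made up for illustration. It assumes the length of the first number is known (1 digit in the example) and that every number is exactly one greater than the previous one.

```python
def consecutive_pairs(x, width=1):
    """Return (m, m + 1) pairs read left to right from the digit string x.

    Assumption: the first number is `width` digits long and each number
    is exactly one greater than the one before it.
    """
    pairs = []
    j = 0
    while j + width < len(x):
        m = int(x[j:j + width])
        nxt = str(m + 1)          # may be one digit longer, e.g. 9 -> 10
        if not x.startswith(nxt, j + width):
            break                 # the run of consecutive numbers ends here
        pairs.append((m, m + 1))
        j += width                # move past m
        width = len(nxt)          # the next number sets the new slice width
    return pairs


for m, n in consecutive_pairs('789101112'):
    print(m, n)
```

The key difference from the original loop is that the slice width is taken from the expected next number rather than kept fixed, and j advances on every iteration, so the loop cannot stall.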

Since the sequence is guaranteed to increase by 1, this alternative code may serve as a lazy solution:

def start_seq(num_str):
    '''determines the number the sequence starts with (minimum 3 numbers
    inside sequence)'''
    for i in range(1, len(num_str)//3 + 1):  # +1 so a first number of length len//3 is tried too
        num1 = int(num_str[0:i])
        num2 = num1 + 1
        num3 = num2 + 1
        seq = str(num1)+str(num2)+str(num3)
        if num_str.startswith(seq):
            return num1
        else:
            continue
    return -1

def output_sequence(num_str):
    num1 = start_seq(num_str)
    if num1 == -1:
        print('not a valid sequence')
        raise ValueError

    processed_seq = str(num1)

    # Print each neighbouring pair until the whole string has been rebuilt.
    while processed_seq != num_str:
        processed_seq += str(num1 + 1)
        if not num_str.startswith(processed_seq):
            # Stop here; otherwise the loop would never terminate.
            print('not a valid sequence')
            raise ValueError
        print(num1, num1 + 1)
        num1 += 1

if __name__ == '__main__':
    output_sequence('789101112')