Find the longest substring in alphabetical order

Time: 2013-10-27 13:24:20

Tags: python recursion

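Edit: I know that questions about similar tasks have already been asked on SO, but I am interested in finding out what is wrong with this particular piece of code. I also know that this problem can be solved without recursion.

The task is to write a program that finds (and prints) the longest substring in which the letters occur in alphabetical order. If more than one equally long sequence is found, the first one should be printed. For example, the output for the string abczabcd will be abcz.

I solved the problem with recursion, and it seemed to pass my manual tests. However, when I ran an automated test set that generates random strings, I noticed that in some cases the output is incorrect. For example:

if s = 'hixwluvyhzzzdgd', the output is hix instead of luvy

if s = 'eseoojlsuai', the output is eoo instead of jlsu

if s = 'drurotsxjehlwfwgygygxz', the output is dru instead of ehlw

After spending some time on it, I could not figure out what is special about these strings that causes the bug.

Here is my code: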

pos = 0
maxLen = 0
startPos = 0
endPos = 0


def last_pos(pos):
    if pos < (len(s) - 1):
        if s[pos + 1] >= s[pos]:
            pos += 1
            if pos == len(s)-1:
                return len(s)
            else:
                return last_pos(pos)
        return pos


for i in range(len(s)):
    if last_pos(i+1) != None:
        diff = last_pos(i) - i
    if diff - 1 > maxLen:
        maxLen = diff
        startPos = i
        endPos = startPos + diff

print s[startPos:endPos+1]

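17 answers:

Answer 0 (score: 3)

There is a lot to improve in your code, but here are the minimum changes needed to make it work. The problem is that in your for loop you should have if last_pos(i) != None: (i rather than i+1), and you should compare diff (not diff - 1) against maxLen. Please read the other answers to see how to do it better.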

for i in range(len(s)):
    if last_pos(i) != None:
        diff = last_pos(i) - i + 1
    if diff > maxLen:
        maxLen = diff
        startPos = i
        endPos = startPos + diff - 1
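
One caveat, as far as I can tell: last_pos returns len(s) (one past the last index) when a run reaches the end of the string, but the index of the run's last character otherwise. With the + 1 above, a run that ends the string is therefore counted one character too long, and for abczabcd that appears to select abcd over abcz. Below is a minimal sketch of a recursive helper that always returns the index of the run's last character. It is written for Python 3, and the names run_end and longest_run are hypothetical, not taken from the question.

```python
def run_end(s, pos):
    """Return the index of the last character of the non-decreasing run starting at pos."""
    if pos + 1 < len(s) and s[pos + 1] >= s[pos]:
        return run_end(s, pos + 1)
    return pos


def longest_run(s):
    """Return the first longest substring whose letters are in alphabetical order."""
    best_start, best_len = 0, 0
    for i in range(len(s)):
        length = run_end(s, i) - i + 1
        if length > best_len:          # strict '>' keeps the first of equal-length runs
            best_start, best_len = i, length
    return s[best_start:best_start + best_len]


print(longest_run('abczabcd'))          # abcz
print(longest_run('hixwluvyhzzzdgd'))   # luvy
```

Because the helper recurses once per character of a run, a very long sorted input could hit Python's recursion limit; the iterative answers avoid that.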

Answer 1 (score: 3)

Here you go. This does what you want. One pass, no recursion needed.

def find_longest_substring_in_alphabetical_order(s):
    groups = []
    cur_longest = ''
    prev_char = ''
    for c in s.lower():
        if prev_char and c < prev_char:
            groups.append(cur_longest)
            cur_longest = c
        else:
            cur_longest += c
        prev_char = c
    groups.append(cur_longest)  # include the final run, which the loop never flushes
    return max(groups, key=len)

Usage:

>>> find_longest_substring_in_alphabetical_order('hixwluvyhzzzdgd')
'luvy'
>>> find_longest_substring_in_alphabetical_order('eseoojlsuai')
'jlsu'
>>> find_longest_substring_in_alphabetical_order('drurotsxjehlwfwgygygxz')
'ehlw'

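Note: it may break on unusual characters; it has only been tested with the inputs you suggested. Since this is a "homework" question, I will leave you with the solution as it is. There are still some optimizations to be made, but I wanted to keep it reasonably understandable.

Answer 2 (score: 2)

You can use nested for loops, slicing, and sorted. If the string is not all lowercase, you can convert the substring to lowercase with str.lower before doing the comparison: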

def solve(strs):
    maxx = ''
    for i in xrange(len(strs)):
        for j in xrange(i, len(strs)):   # start at i so one-character strings are covered
            s = strs[i:j+1]
            if ''.join(sorted(s)) == s:
                maxx = max(maxx, s, key=len)   # on ties max() keeps the earlier maxx
            else:
                break
    return maxx

Output:

>>> solve('hixwluvyhzzzdgd')
'luvy'
>>> solve('eseoojlsuai')
'jlsu'
>>> solve('drurotsxjehlwfwgygygxz')
'ehlw'
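
A brute-force solution like this one is also useful as an oracle for the kind of randomized testing that exposed the bug in the question. The sketch below is for Python 3 and rests on several assumptions: the function names, the six-letter alphabet (chosen to make ties and repeated letters frequent), the trial count, and the seed are all mine, not from the question.

```python
import random
import string


def brute_force(s):
    """Oracle: try every substring, keep the first longest one that is sorted."""
    best = ''
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            sub = s[i:j]
            if list(sub) == sorted(sub) and len(sub) > len(best):
                best = sub
    return best


def single_pass(s):
    """Candidate under test: grow the current run, restart it on a drop."""
    best = cur = ''
    for c in s:
        cur = (cur + c) if (cur and cur[-1] <= c) else c
        if len(cur) > len(best):
            best = cur
    return best


def find_mismatch(trials=1000, max_len=12, seed=0):
    """Return the first random string on which the two disagree, or None."""
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase[:6]   # small alphabet: more ties and repeats
    for _ in range(trials):
        length = rng.randint(0, max_len)
        s = ''.join(rng.choice(alphabet) for _ in range(length))
        if brute_force(s) != single_pass(s):
            return s
    return None


print(find_mismatch())   # None means the two agreed on every trial
```

Swapping single_pass for any other implementation from this thread gives a quick way to find a small failing input.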

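Answer 3 (score: 1)

This is a single-pass solution with a fast loop. It reads each character only once. The operations inside the loop are limited to

  • 1 string comparison (1 char x 1 char)
  • 1 integer increment
  • 2 integer subtractions
  • 1 integer comparison
  • 1 to 3 integer assignments
  • 1 string assignment

No containers are used. No function calls are made inside the loop. The empty string is handled without special-case code. All character codes, including chr(0), are handled correctly. If there is a tie for the longest alphabetical substring, the function returns the first winning substring it encounters. Case is ignored for the purpose of alphabetization, but case is preserved in the output substring.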

def longest_alphabetical_substring(string):
    start, end = 0, 0     # range of current alphabetical string
    START, END = 0, 0     # range of longest alphabetical string yet found
    prev = chr(0)         # previous character

    for char in string.lower():   # scan string ignoring case
        if char < prev:       # is character out of alphabetical order?
            start = end       #     if so, start a new substring 
        end += 1              # either way, increment substring length 

        if end - start > END - START:  # found new longest?  
            START, END = start, end    #     if so, update longest 
        prev = char                    # remember previous character

    return string[START : END]   # return longest alphabetical substring 

Results

>>> longest_alphabetical_substring('drurotsxjehlwfwgygygxz')
'ehlw'
>>> longest_alphabetical_substring('eseoojlsuai')
'jlsu'
>>> longest_alphabetical_substring('hixwluvyhzzzdgd')
'luvy'
>>>
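
The edge cases claimed above can be checked with a few assertions. This is only a sketch: it repeats the function in condensed form so that it runs on its own, and the test strings are my own assumptions rather than part of the answer.

```python
def longest_alphabetical_substring(string):
    start, end = 0, 0          # current run is string[start:end]
    START, END = 0, 0          # longest run found so far
    prev = chr(0)
    for char in string.lower():
        if char < prev:        # order broken: a new run starts here
            start = end
        end += 1
        if end - start > END - START:
            START, END = start, end
        prev = char
    return string[START:END]


assert longest_alphabetical_substring('') == ''                    # empty input
assert longest_alphabetical_substring('abczabcd') == 'abcz'        # tie: first run wins
assert longest_alphabetical_substring('zAbC') == 'AbC'             # case ignored, then preserved
assert longest_alphabetical_substring('\x00\x00a') == '\x00\x00a'  # chr(0) is handled
```

Note that in Python 3, str.lower() can change the length of a few non-ASCII strings (for example 'İ'), so indices computed on the lowered copy may not line up with the original for such input.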

Answer 4 (score: 1)

Simple and easy.

Code:

s = 'hixwluvyhzzzdgd' 
r,p,t = '','',''
for c in s:
    if p <= c:
        t += c
        p = c
    else:
        if len(t) > len(r):
            r = t
        t,p = c,c
if len(t) > len(r):
    r = t
print 'Longest substring in alphabetical order is: ' + r

Output

Longest substring in alphabetical order is: luvy

Answer 5 (score: 0)

s = 'azcbobobegghakl' 

i=1
subs=s[0]
subs2=s[0]
while(i<len(s)):
    j=i
    while(j<len(s)):
        if(s[j]>=s[j-1]):
            subs+=s[j]
            j+=1 
        else:
            subs=subs.replace(subs[:len(subs)],s[i])   
            break

        if(len(subs)>len(subs2)):
            subs2=subs2.replace(subs2[:len(subs2)], subs[:len(subs)])
    subs=subs.replace(subs[:len(subs)],s[i])
    i+=1
print("Longest substring in alphabetical order is:",subs2)

Answer 6 (score: 0)

I agree with @Abhijit about the power of itertools.groupby(), but I took a simpler approach to (ab)using it and avoided the boundary-case problems:

from itertools import groupby

LENGTH, LETTERS = 0, 1

def longest_sorted(string):
    longest_length, longest_letters = 0, []
    key, previous_letter = 0, chr(0)

    def keyfunc(letter):
        nonlocal key, previous_letter
        if letter < previous_letter:
            key += 1

        previous_letter = letter
        return key

    for _, group in groupby(string, keyfunc):
        letters = list(group)
        length = len(letters)

        if length > longest_length:
            longest_length, longest_letters = length, letters

    return ''.join(longest_letters)

print(longest_sorted('hixwluvyhzzzdgd'))
print(longest_sorted('eseoojlsuai'))
print(longest_sorted('drurotsxjehlwfwgygygxz'))
print(longest_sorted('abcdefghijklmnopqrstuvwxyz'))

Output

> python3 test.py
luvy
jlsu
ehlw
abcdefghijklmnopqrstuvwxyz
>

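Answer 7 (score: 0)

Assuming this is from the edX course: up to this problem, we had not been taught anything about strings and their advanced operations in Python, so I had to get by with loops and conditional statements only.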

string =""             #taking a plain string to represent the then generated string
present =""             #the present/current longest string
for i in range(len(s)): #not len(s)-1 because that totally skips last value
    j = i+1            
    if j>= len(s):    
        j=i           #using s[i+1] simply throws an error of not having index
    if s[i] <= s[j]:  #comparing the now and next value
        string += s[i] #concatenating string if above condition is satisfied
    elif len(string) != 0 and s[i] > s[j]: #don't want to lose the last value
        string += s[i] #now since s[i] > s[j] #last one will be printed
        if len(string) > len(present): #1 > 0 so from there we get to store many values
            present = string #swapping to largest string
        string = ""
    if len(string) > len(present): #to swap from if statement
        present = string
if present == s[len(s)-1]: #if no alphabet is in order then first one is to be the output
    present = s[0]
print('Longest substring in alphabetical order is:' + present)

Answer 8 (score: 0)

I came up with this solution

def longest_sorted_string(s):
    max_string = ''
    for i in range(len(s)):
        for j in range(i+1, len(s)+1):
            string = s[i:j]
            arr = list(string)
            if sorted(string) == arr and len(max_string) < len(string):
                max_string = string
    return max_string
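
Sorting every candidate substring does far more work than the problem needs. As a sketch of a cheaper variant (Python 3), the check below compares adjacent characters instead of sorting and stops extending a substring as soon as the order breaks. The helper name is_sorted is hypothetical.

```python
def is_sorted(sub):
    """True if every character is >= the one before it."""
    return all(a <= b for a, b in zip(sub, sub[1:]))


def longest_sorted_string(s):
    max_string = ''
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            sub = s[i:j]
            if not is_sorted(sub):
                break              # extending an unsorted substring cannot fix it
            if len(sub) > len(max_string):
                max_string = sub   # strict '>' keeps the first of equal-length runs
    return max_string


print(longest_sorted_string('abczabcd'))   # abcz
```

This still rechecks each prefix from scratch, so the single-pass answers remain the better choice for long inputs.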

Answer 9 (score: 0)

I guess this is a problem-set question from CS6.00.1x on edX. Here is what I came up with.

s = raw_input("Enter the string: ")
longest_sub = ""
last_longest = ""
for i in range(len(s)):
    if len(last_longest) > 0:
        if last_longest[-1] <= s[i]:
            last_longest += s[i]
        else:
            last_longest = s[i]
    else:
        last_longest = s[i]
    if len(last_longest) > len(longest_sub):
        longest_sub = last_longest
print(longest_sub)

Answer 10 (score: 0)

s = input("insert some string: ")
start = 0
end = 0
temp = ""
while start < len(s):    # also covers a one-character string
    while end+1 <len(s) and s[end+1] >= s[end]:
        end += 1
    if len(s[start:end+1]) > len(temp):
        temp = s[start:end+1]
    end +=1
    start = end
print("longest ordered part is: "+temp)

Answer 11 (score: 0)

Simple and easy to understand:

s = "abcbcd"    #The original string

l = len(s)    #The length of the original string

maxlenstr = s[0]    #maximum length sub-string, taking the first letter of original string as value.

curlenstr = s[0]    #current length sub-string, taking the first letter of original string as value.

for i in range(1,l):    #in range, the l is not counted. 

    if s[i] >= s[i-1]:    #If current letter is greater or equal to previous letter,
        curlenstr += s[i] #add the current letter to current length sub-string
    else:        
        curlenstr = s[i]  #otherwise, take the current letter as current length sub-string

    if len(curlenstr) > len(maxlenstr): #if current sub-string's length is greater than max one,
        maxlenstr = curlenstr           #take current one as max one.

print("Longest substring in alphabetical order is:", maxlenstr)

Answer 12 (score: 0)

def find_longest_order():
    arr = []
    now_long = ''
    prev_char = ''
    for char in s.lower():
        if prev_char and char < prev_char:
            arr.append(now_long)
            now_long = char
        else:
            now_long += char
        prev_char = char
    arr.append(now_long)   # the final run is never flushed inside the loop
    return max(arr, key=len)

def main():
    print 'Longest substring in alphabetical order is: ' + find_longest_order()

main()

Answer 13 (score: 0)

More looping, but it gets the job done

s = raw_input("Enter string")
fin=""
s_pos =0
while s_pos < len(s):
    n=1
    lng=" "
    for c in s[s_pos:]:
        if c >= lng[n-1]:
            lng+=c
            n+=1
        else :
            break
    if len(lng) > len(fin):
        fin = lng
    s_pos+=1    
print "Longest string: " + fin

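Answer 14 (score: 0)

Python has a powerful built-in package, itertools, and a wonderful function within it, groupby.

An intuitive use of the key function can give you immense mileage.

In this particular case, you just have to keep track of order changes and group the sequence accordingly. The only exception is the boundary case, which has to be handled separately.

Code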

from itertools import groupby

def find_long_cons_sub(s):
    class Key(object):
        '''
        The Key function returns
            True:  the character continues a non-decreasing sequence
            False: the character is smaller than the previous one
        '''
        def __init__(self):
            self.last_char = None
        def __call__(self, char):
            resp = True
            if self.last_char is not None:
                resp = self.last_char <= char   # equal letters stay in order
            self.last_char = char
            return resp
    def find_substring(groups):
        '''
        The boundary case is when an increasing sequence
        starts just after a decreasing sequence. This causes
        the first character to be in the previous group.
        If you do not want to handle the boundary case
        separately, you have to make the Key function a bit
        more complicated to flag the start of an increasing sequence'''
        try:
            yield next(groups)
            while True:
                yield next(groups)[-1:] + next(groups)
        except StopIteration:
            pass
    # Keep both kinds of group: find_substring relies on them alternating
    groups = (list(g) for k, g in groupby(s, key = Key()))
    #Just determine the maximum sequence based on length
    return ''.join(max(find_substring(groups), key = len))

Results

>>> find_long_cons_sub('drurotsxjehlwfwgygygxz')
'ehlw'
>>> find_long_cons_sub('eseoojlsuai')
'jlsu'
>>> find_long_cons_sub('hixwluvyhzzzdgd')
'luvy'

Answer 15 (score: -1)

first_seq=s[0]
break_seq=s[0]
current = s[0]
for i in range(0,len(s)-1):
    if s[i]<=s[i+1]:
        first_seq = first_seq + s[i+1]
        if len(first_seq) > len(current):
            current = first_seq
    else:
        first_seq = s[i+1]
        break_seq = first_seq
print("Longest substring in alphabetical order is: ", current)

Answer 16 (score: -1)

s = 'gkuencgybsbezzilbfg'
x = s.lower()
y = ''
z = [] #creating an empty listing which will get filled

for i in range(0,len(x)):
    if i == len(x)-1:
       y = y + str(x[i])
       z.append(y)
       break 
    a = x[i] <= x[i+1]
    if a == True:
        y = y + str(x[i])
    else: 
        y = y + str(x[i])
        z.append(y)  # fill the list 
        y = ''
# search of 1st longest string        
L = len(max(z,key=len))      # key=len takes length in consideration 
for i in range(0,len(z)):
    a = len(z[i])
    if a == L:   
        print 'Longest substring in alphabetical order is:' + str(z[i])
        break