Question

def encode (plainText):
    res=''
    a=''
    for i in plainText:
        if a.count(i)>0:
           a+=i
        else:
            if len(a)>3:
                res+="/" + str(len(a)) + a[0][:1]
            else:
                res+=a
                a=i
    return(res)

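This is my current code. For those familiar with run-length encoding, it can make a file larger, because a single value becomes two. I am trying to set the minimum run length to 3 so that it actually compresses. Any help correcting the code would be appreciated.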

Answer 1

You have made quite a few mistakes here. Here is a short list.

1  def encode (plainText):
2      res=''
3      a=''
4      for i in plainText:
5          if a.count(i)>0:
6             a+=i
7          else:
8              if len(a)>3:
9                  res+="/" + str(len(a)) + a[0][:1]
10             else:
11                 res+=a
12                 a=i
13     return(res)

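Lines [3], [5]: You store the letters in a string and then call count repeatedly. It would be simpler (and faster) to store only the last character and add a new variable as a counter.
Lines [8], [9]: Whenever you reach the end of a long enough run, you add the (correct) encoded string. However, you never update a. So once you reach this line for the first time, every subsequent character adds the same encoded string again. The fix is simple: unindent line [12] by one level so that the new character is assigned in both cases.
Line [13]: You return the output without adding the last character. The encoding done in each iteration applies to the previous characters, which means the final run in the string has to be handled after the loop ends.
Last but not least, since you use / as a special character, it has to be handled somehow when it appears as an ordinary, non-repeated character. For example, the plaintext /12a would be encoded as /12a and then decoded as a sequence of 12 a's.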
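To make the second and third points concrete, the snippet below reruns the question's function unchanged on two small inputs. The name encode_original is only a label chosen here for illustration; the body is the code from the question, bugs included.

```python
def encode_original(plainText):
    # Copy of the question's function, intentionally left buggy.
    res = ''
    a = ''
    for i in plainText:
        if a.count(i) > 0:
            a += i
        else:
            if len(a) > 3:
                # 'a' is never reset on this branch, so every later
                # character that differs from the run re-emits it.
                res += "/" + str(len(a)) + a[0][:1]
            else:
                res += a
                a = i
    # Whatever is still held in 'a' is never flushed into 'res'.
    return res


print(encode_original("abc"))     # ab      (the final 'c' is dropped)
print(encode_original("aaaabc"))  # /4a/4a  (the run is emitted twice, 'b' and 'c' are lost)
```

The first call shows the missing last character (line [13]); the second shows the repeated encoded string caused by never updating a (lines [8], [9]).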

Here is a (hopefully) working example:

def encode (plainText):
    ## Lazy solution for some of the edge cases
    if plainText is None or len(plainText) == 0:
        return plainText

    ## Collect the pieces in a list and
    ## join them once at the end; faster
    ## than repeated str concatenation
    res=[]
    ## We only care about the last
    ## character, no need to save all
    prev_char=''
    ## And count it ourselves, it's
    ## faster than calling count
    count=0
    for i in plainText:
        if i == prev_char:
            ## If it's the same char
            ## increment count
            count += 1
            ## and then continue with next
            ## cycle. Avoid unnecessary indent.
            continue

        ## else
        if count > 2 or prev_char == '/':
            ## If more than 2 occurrences
            ## we build the encoding.
            ## For exactly 3 occurrences the length is the same.
            ## '/' is a special character so we
            ## always encode it
            res.append(
                f"/{count}{prev_char}"
            )
        else:
            ## Otherwise just append the symbols
            res.append(prev_char*count)
        ## We observed at least 1 instance of i 
        count = 1
        ## Store for next comparison
        prev_char = i

    ## Now deal with last character.
    ## Without this your string would miss it.
    if count > 2 or prev_char == '/':
        res.append(
            f"/{count}{prev_char}"
        )
    else:
        res.append(prev_char*count)

    ## And build our string
    return ''.join(res)
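For checking the encoder, a matching decoder helps. The sketch below is not part of the original answer: encode_groupby is a shorter variant built on itertools.groupby that should produce the same output as encode above for str input, and decode is a hypothetical inverse for the /<count><char> format. Both names, and the MIN_RUN constant, are made up here. It assumes the plaintext contains no digits, because a digit repeated three or more times would be ambiguous ("1111" encodes to "/41", which cannot be told apart from the start of a run of 41).

```python
from itertools import groupby

MIN_RUN = 3  # runs of this length or longer are written as /<count><char>


def encode_groupby(plain_text):
    """Encode runs as /<count><char>; shorter runs are copied as-is."""
    parts = []
    for char, group in groupby(plain_text):
        count = sum(1 for _ in group)
        # '/' is the marker character, so it is always written as a run.
        if count >= MIN_RUN or char == "/":
            parts.append(f"/{count}{char}")
        else:
            parts.append(char * count)
    return "".join(parts)


def decode(encoded):
    """Invert the format above. Assumption: the plaintext had no digits."""
    out = []
    pos = 0
    while pos < len(encoded):
        if encoded[pos] != "/":
            # Ordinary character, copy it through.
            out.append(encoded[pos])
            pos += 1
            continue
        # Read the decimal count that follows the marker.
        end = pos + 1
        while end < len(encoded) and encoded[end].isdigit():
            end += 1
        if end == pos + 1 or end >= len(encoded):
            raise ValueError(f"malformed run at index {pos}")
        out.append(encoded[end] * int(encoded[pos + 1:end]))
        pos = end + 1
    return "".join(out)


print(encode_groupby("aaaabc"))  # /4abc
print(decode("/4abc"))           # aaaabc
```

Because a literal / is always written as /1/, a plaintext such as a/b survives the round trip. Lifting the no-digits restriction would need a different format, for example a fixed-width count or a separator after the count.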

Setting a minimum run length for run-length encoding
