def encode (plainText):
    res=''
    a=''
    for i in plainText:
        if a.count(i)>0:
            a+=i
        else:
            if len(a)>3:
                res+="/" + str(len(a)) + a[0][:1]
            else:
                res+=a
                a=i
    return(res)
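This is my current code. For those who know run-length encoding, it can make a file larger, because a single value becomes two. I am trying to set a minimum run length of 3 so that it actually compresses. Any help with correcting the code would be appreciated.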
Answer 0 (score: 0)
You have made quite a few mistakes here; here is a short list.
 1 def encode (plainText):
 2     res=''
 3     a=''
 4     for i in plainText:
 5         if a.count(i)>0:
 6             a+=i
 7         else:
 8             if len(a)>3:
 9                 res+="/" + str(len(a)) + a[0][:1]
10             else:
11                 res+=a
12                 a=i
13     return(res)
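- `a` is never reset after a run has been encoded on line [9]. So once that line is reached for the first time, every following character appends the same encoded string again. The fix is simple: dedent line [12] by one level so that the new character is assigned in both cases.
- `/` is used as a special character, so it needs special handling when it occurs in the plain text itself. For example, the plain text `/12a` would be encoded as `/12a` and then decoded as a run of 12 `a` characters.

Here is a (hopefully) working example: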
def encode (plainText):
    ## Lazy solution for some of the edge cases
    if plainText is None or len(plainText) == 0:
        return plainText
    ## We can join this together
    ## and have faster arbitrary
    ## str length addition
    res=[]
    ## We only care about the last
    ## character, no need to save all
    prev_char=''
    ## And count it ourselves, it's
    ## faster than calling count
    count=0
    for i in plainText:
        if i == prev_char:
            ## If it's the same char
            ## increment count
            count += 1
            ## and then continue with next
            ## cycle. Avoid unnecessary indent.
            continue
        ## else
        if count > 2 or prev_char == '/':
            ## If more than 2 occurrences
            ## we build the encoding.
            ## For 3 occurrences the length is the same.
            ## '/' is a special character so we
            ## always encode it
            res.append(
                f"/{count}{prev_char}"
            )
        else:
            ## Otherwise just append the symbols
            res.append(prev_char*count)
        ## We observed at least 1 instance of i
        count = 1
        ## Store for next comparison
        prev_char = i
    ## Now deal with last character.
    ## Without this your string would miss it.
    if count > 2 or prev_char == '/':
        res.append(
            f"/{count}{prev_char}"
        )
    else:
        res.append(prev_char*count)
    ## And build our string
    return ''.join(res)
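
The encoder above has no matching decoder, so here is a rough sketch of the round trip. The `encode` below is a compact rewrite using `itertools.groupby`, not the code above verbatim, and `decode` is a hypothetical inverse that I am assuming from the `/<count><char>` format. One loud assumption: the plain text must not contain a run of three or more identical digit characters, because a token such as `/31` cannot be told apart from "31 copies of the next character". Treat this as an illustration of the format, not a finished codec.

```python
from itertools import groupby

ASCII_DIGITS = "0123456789"


def encode(plain_text):
    """Compact variant of the encoder above, built on itertools.groupby."""
    if not plain_text:
        return plain_text
    parts = []
    for char, group in groupby(plain_text):
        count = len(list(group))
        # Encode runs of 3 or more, and always encode the '/' marker itself.
        if count > 2 or char == '/':
            parts.append(f"/{count}{char}")
        else:
            parts.append(char * count)
    return ''.join(parts)


def decode(encoded):
    """Hypothetical inverse: expands every /<count><char> token.

    Assumes no encoded run consists of digit characters (see lead-in).
    """
    if not encoded:
        return encoded
    parts = []
    pos = 0
    while pos < len(encoded):
        if encoded[pos] != '/':
            # Literal character, copied through unchanged.
            parts.append(encoded[pos])
            pos += 1
            continue
        # Read the decimal run length that follows the '/' marker.
        end = pos + 1
        while end < len(encoded) and encoded[end] in ASCII_DIGITS:
            end += 1
        if end == pos + 1 or end >= len(encoded):
            raise ValueError(f"malformed run token at index {pos}")
        count = int(encoded[pos + 1:end])
        parts.append(encoded[end] * count)
        pos = end + 1
    return ''.join(parts)


if __name__ == "__main__":
    for text in ("aaaabbc", "/12a", "abc"):
        packed = encode(text)
        print(repr(text), "->", repr(packed), "->", repr(decode(packed)))
```

Because a lone `/` is always written as `/1/`, the plain text `/12a` no longer collides with the token for twelve `a` characters, which is the problem described in the list above.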