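I'm trying to build a modified LZW that finds patterns of words in a string. I have two problems: the first element of the list is '', and the last element is never checked to see whether it is in the list. I saw the pseudocode here: https://www.cs.duke.edu/csed/curious/compression/lzw.html. This is my compression script: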
string = 'this is a test a test this is pokemon'
diction = []
x = ""
count = 0
for c in string.split():
    print(c)
    print(x)
    #x = x + " " + c
    if x in diction:
        x += " " + c
        #print("debug")
    else:
        #print(x)
        diction.append(x)
        x = c
    count += 1
    #print(count)
print(diction)
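I tried to work around the second problem by appending a random word to the end of the string, but I don't think that is the best solution.
For the first problem, I tried defining the variable x as str or None, but then I got <class 'str'> in the list.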
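Answer 0 (score: 0)
The linked page works on characters, whereas splitting the string gives a list of words. To keep the empty string out of the dictionary and to make sure the last element is processed: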
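string = 'this is a test a test this is pokemon'
diction = []
x = ""
count = 0
for c in string.split():
    print(c)
    # Candidate phrase: the current phrase extended by the next word.
    # On the first pass x is empty, so use the word alone (no leading space).
    candidate = x + " " + c if x else c
    if candidate in diction:
        x = candidate
    else:
        diction.append(candidate)
        x = c
    count += 1
print(diction)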
But maybe you would prefer something like this, which works character by character:
string = 'this is a test a test this is pokemon'
diction = []
x = ""
count = 0
for c in string:
    print(c)
    if x + c in diction:
        x += c
    else:
        diction.append(x + c)
        x = c
    count += 1
print(diction)
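Since the linked page works on characters, here is a minimal sketch of textbook character-level LZW that also emits the output codes rather than only collecting dictionary entries. The function name `lzw_compress` is made up for illustration, and the initial dictionary of 256 single-character entries is an assumption; it may not match the linked pseudocode line for line.

```python
def lzw_compress(text):
    """Textbook LZW over characters; returns a list of integer codes."""
    # Assumption: every character in text has a code point below 256.
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    w = ""
    codes = []
    for c in text:
        wc = w + c
        if wc in dictionary:
            # Keep extending the current match.
            w = wc
        else:
            # Emit the code for the longest known match, then learn wc.
            codes.append(dictionary[w])
            dictionary[wc] = next_code
            next_code += 1
            w = c
    # Flush the final match so the last element is not lost.
    if w:
        codes.append(dictionary[w])
    return codes


print(lzw_compress("abababab"))  # [97, 98, 256, 258, 98]
```

The flush after the loop is the part that addresses the "last element" problem from the question: without it, whatever is still held in `w` when the input ends is never written out.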
Answer 1 (score: 0)
I'm not sure what the code is meant to do, but to fix the two problems you mention, I think you can do this:
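string = 'this is a test a test this is pokemon'
diction = []
x = None
count = 0
for c in string.split():
    if x in diction:
        x += " " + c
    else:
        # Skip the initial None so no empty entry ends up in the list.
        if x:
            diction.append(x)
        x = c
    count += 1
# Flush the last phrase, which the loop never gets to append.
if x and x not in diction:
    diction.append(x)
print(diction)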
The output of that code is:
['this', 'is', 'a', 'test', 'a test', 'this is', 'pokemon']
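If the goal is a word-level LZW that produces codes as well as a phrase list, a minimal sketch could look like the following. It assumes the initial dictionary holds every distinct word of the input, numbered in order of first appearance; the function name `lzw_compress_words` and that seeding scheme are illustrative choices, not something taken from the question or the linked page.

```python
def lzw_compress_words(text):
    """Word-level LZW sketch; returns (codes, dictionary)."""
    words = text.split()
    # Assumption: seed the dictionary with each distinct word,
    # numbered in order of first appearance.
    dictionary = {}
    for word in words:
        if word not in dictionary:
            dictionary[word] = len(dictionary)
    phrase = ""
    codes = []
    for word in words:
        candidate = phrase + " " + word if phrase else word
        if candidate in dictionary:
            # Keep extending the current phrase.
            phrase = candidate
        else:
            # Emit the longest known phrase, then learn the new one.
            codes.append(dictionary[phrase])
            dictionary[candidate] = len(dictionary)
            phrase = word
    # Flush the final phrase so the last word is not dropped.
    if phrase:
        codes.append(dictionary[phrase])
    return codes, dictionary


codes, dictionary = lzw_compress_words('this is a test a test this is pokemon')
print(codes)  # [0, 1, 2, 3, 7, 5, 4]
```

Here code 7 stands for 'a test' and code 5 for 'this is', so the repeated phrases are each emitted as a single code the second time they occur.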