Question

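I'm trying to build a modified LZW that finds patterns of words in a string. I have two problems: the first element of the list is '', and the last element is never checked against the list. I saw the pseudocode here: https://www.cs.duke.edu/csed/curious/compression/lzw.html. This is my compression script: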

string = 'this is a test a test this is pokemon'
diction = []
x = ""
count = 0
for c in string.split():
      print (c)
      print (x)
      #x = x + " " + c 
      if x in diction:
            x += " " + c
            #print("debug")
      else:
            #print(x)
            diction.append(x)
            x = c
            count +=1
            #print(count)

print (diction)

I tried to work around the second problem by appending a random word to the end of the string, but I don't think that is the best solution.

For the first problem, I tried defining the variable "x" as str or None, but then I get <class 'str'> in the list.

Answer 1

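The pseudocode at the link works on characters, whereas splitting the string gives you a list of words. To avoid getting an empty string in the dictionary and to make sure the last element is processed: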

string = 'this is a test a test this is pokemon'
diction = []
x = ""
count = 0
for c in string.split():
      print (c)
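      # Join with a space only when x is non-empty, so the first entry has no leading space
      candidate = x + " " + c if x else c
      if candidate in diction:
            x = candidate
      else:
            diction.append(candidate)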
            x = c
            count +=1

print (diction)

But maybe you would prefer something like this, which works on characters instead of words:

string = 'this is a test a test this is pokemon'
diction = []
x = ""
count = 0
for c in string:
      print (c)
      if x+c in diction:
            x += c
      else:
            diction.append(x+c)
            x = c
            count +=1

print (diction)
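
For comparison, here is a minimal sketch of what a word-level LZW pass that also emits output codes might look like. The function name `lzw_compress_words` and the choice to seed the dictionary with the distinct words of the input are assumptions made for illustration; they are not from the question or the linked pseudocode, which starts from a fixed character alphabet.

```python
def lzw_compress_words(text):
    """Word-level LZW sketch: return (codes, dictionary) for the given text."""
    words = text.split()

    # Assumption: seed the dictionary with each distinct word, in order of
    # first appearance. Character-level LZW would seed with the alphabet instead.
    dictionary = {}
    for word in words:
        if word not in dictionary:
            dictionary[word] = len(dictionary)

    codes = []
    current = ""
    for word in words:
        # Join with a space only when there is already a phrase in progress.
        candidate = f"{current} {word}" if current else word
        if candidate in dictionary:
            # Known phrase: keep extending it.
            current = candidate
        else:
            # Emit the code of the longest known phrase, then learn the new one.
            codes.append(dictionary[current])
            dictionary[candidate] = len(dictionary)
            current = word

    # Flush the phrase still pending after the loop, so the last word is not lost.
    if current:
        codes.append(dictionary[current])
    return codes, dictionary


codes, dictionary = lzw_compress_words('this is a test a test this is pokemon')
print(codes)
```

The final flush after the loop is what handles the "last element" problem: whatever phrase is pending when the input ends still gets a code. Note that a decoder would need the same seed word list to reverse this.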

Answer 2

I'm not sure what the code is meant to do, but to fix the problems you mention, I think you could do this:

string = 'this is a test a test this is pokemon'
diction = []
x = None
count = 0
for c in string.split():
    if x in diction:
        x += " " + c
    else:
        if x: diction.append(x)
        x = c
        count += 1

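# Flush the last pending phrase; the None check avoids appending None for an empty string
if x is not None and x not in diction: diction.append(x)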

print (diction)

The output of that code is:

['this', 'is', 'a', 'test', 'a test', 'this is', 'pokemon']
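
If the input gets long, the same idea could be wrapped in a function that keeps a set alongside the list, so each membership test is a hash lookup rather than a list scan. This is only a sketch under that assumption; the name `find_word_patterns` is hypothetical and not part of the original answer.

```python
def find_word_patterns(text):
    """Collect single words and repeated word sequences, in order of discovery."""
    patterns = []    # phrases in the order they were first completed
    seen = set()     # the same phrases, for fast membership tests
    current = None   # phrase being built; None until the first word is read

    for word in text.split():
        if current in seen:
            # The phrase so far is already known: extend it with the next word.
            current = f"{current} {word}"
        else:
            # Record the unseen phrase (if any) and restart from this word.
            if current is not None:
                patterns.append(current)
                seen.add(current)
            current = word

    # Flush the phrase left over after the loop so the last word is not dropped.
    if current is not None and current not in seen:
        patterns.append(current)
    return patterns


print(find_word_patterns('this is a test a test this is pokemon'))
```

Starting `current` at None rather than '' is what keeps the empty string out of the result, and the check after the loop covers the last word.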

Finding patterns in a string in Python
