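Finding patterns in a string in Python

Time: 2014-03-15 23:26:18

Tags: python pattern-matching

I'm trying to create a modified LZW that finds patterns of words in a string. I have two problems: the first element of the resulting list is '', and the last element is never checked against the list. I took the pseudocode from here: https://www.cs.duke.edu/csed/curious/compression/lzw.html. This is my compression script: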

string = 'this is a test a test this is pokemon'
diction = []
x = ""
count = 0
for c in string.split():
    print(c)
    print(x)
    #x = x + " " + c
    if x in diction:
        x += " " + c
        #print("debug")
    else:
        #print(x)
        diction.append(x)
        x = c
        count += 1
        #print(count)

print(diction)

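I tried to work around the second problem by appending a random word to the end of the string, but I don't think that is the best solution.

For the first problem, I tried defining the variable "x" as str or None, but then I got <class 'str'> in the list.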
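The <class 'str'> entry most likely comes from assigning the type `str` itself to the variable rather than an empty string. The snippet below is a minimal sketch under that assumption; the variable `y` and the `if y:` guard are illustrative and not part of the original script.

```python
diction = []

x = str              # the built-in type object, not an empty string
diction.append(x)
print(diction)       # [<class 'str'>]

y = ""               # an actual empty string is falsy
if y:                # so this guard skips it and nothing is appended
    diction.append(y)
print(len(diction))  # 1
```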

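2 Answers:

Answer 0 (score: 0)

The linked page works on characters, whereas splitting the string gives a list of words. To avoid getting the empty string in the dictionary and to make sure the last element is processed: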

string = 'this is a test a test this is pokemon'
diction = []
x = ""
count = 0
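for c in string.split():
    print(c)
    # Join with a space only when x is non-empty, so the first entry has no leading space.
    candidate = x + " " + c if x else c
    if candidate in diction:
        x = candidate
    else:
        diction.append(candidate)
        x = c
        count += 1

print(diction)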

But maybe you would prefer something like this, which iterates over characters instead of words:

string = 'this is a test a test this is pokemon'
diction = []
x = ""
count = 0
for c in string:
    print(c)
    if x + c in diction:
        x += c
    else:
        diction.append(x + c)
        x = c
        count += 1

print(diction)
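For comparison, here is a rough sketch of the same scan written once for both cases, words or characters. It assumes the dictionary is seeded with every single token, as textbook LZW seeds it with single symbols, so only multi-token phrases are collected. The function name `lzw_phrases` and the `sep` parameter are made up for illustration and do not come from the linked page.

```python
def lzw_phrases(tokens, sep=""):
    """Return the phrases an LZW-style scan would add to its dictionary."""
    seen = set(tokens)   # assumption: every single token is known up front
    phrases = []
    current = None
    for token in tokens:
        candidate = token if current is None else current + sep + token
        if current is None or candidate in seen:
            # Known phrase: keep extending it.
            current = candidate
        else:
            # New phrase: record it and restart from the current token.
            seen.add(candidate)
            phrases.append(candidate)
            current = token
    return phrases


words = 'this is a test a test this is pokemon'.split()
print(lzw_phrases(words, sep=" "))
print(lzw_phrases("abab"))
```

With `sep=" "` the tokens are words; with the default empty separator a plain string is scanned character by character.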

Answer 1 (score: 0)

I'm not sure what the code is intended to do, but to solve the problems you mention, I think you could do this:

string = 'this is a test a test this is pokemon'
diction = []
x = None
count = 0
for c in string.split():
    if x in diction:
        x += " " + c
    else:
        if x: diction.append(x)
        x = c
        count += 1

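# The loop never appends the final phrase, so add it here (skipped for empty input).
if x is not None and x not in diction:
    diction.append(x)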

print (diction)

The output of this code is:

['this', 'is', 'a', 'test', 'a test', 'this is', 'pokemon']
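The same loop can be wrapped in a function so that the result is easy to check. This is only a sketch: the name `word_patterns` is chosen for illustration, and the extra `None` check for empty input is an assumption about the desired behaviour.

```python
def word_patterns(text):
    """Collect word phrases with the loop from this answer."""
    diction = []
    x = None
    for c in text.split():
        if x in diction:
            # x is already known: extend it with the next word.
            x += " " + c
        else:
            # x is new: store it (unless it is still None) and restart at c.
            if x:
                diction.append(x)
            x = c
    # The last phrase is never appended inside the loop, so handle it here.
    if x is not None and x not in diction:
        diction.append(x)
    return diction


print(word_patterns('this is a test a test this is pokemon'))
# ['this', 'is', 'a', 'test', 'a test', 'this is', 'pokemon']
```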