Find similar words in a list

Time: 2015-02-15 01:24:50

Tags: python

I'm trying to make a filter bot that can find characters or words in a message that match the entries of a block list:

block = ["damn", "shit"]

If the message contains daaaaammmmnnnn, sssssshit, da-mn or da.mn, the bot should catch it and return True.

How can I do this?

Thanks.

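3 answers:

Answer 0: (score: 0)

Set flag to True when the message contains any item from the block list.

  1. Take the block list and iterate over each item with a for loop.
  2. Set flag to False by default, meaning the message contains no block-list item.
  3. When an item from the block list is found in the message, set flag to True.
  4. Use the break statement to stop the for loop.
  5. For example: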

    >>> block = ["damn", "shit"]
    >>> msg = "test masf to check shit or damn"
    >>> flag = False
    >>> for i in block:
    ...   if i in msg:
    ...      flag = True
    ...      break
    ... 
    >>> flag
    True
    >>> 
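
The same check can also be written with the built-in any(). This is only a minimal sketch: the function name contains_blocked is made up for illustration, and like the loop above it matches exact substrings only, so stretched or punctuated variants are not caught.

```python
block = ["damn", "shit"]

def contains_blocked(msg, blocked=block):
    """Return True if any blocked word occurs as a plain substring of msg."""
    return any(word in msg for word in blocked)

print(contains_blocked("test masf to check shit or damn"))  # True
print(contains_blocked("daaaammmn"))                         # False: repeated letters break the substring
```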
    

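    Update:

    1. Define the block list of words.
    2. Build a list of character lists from the block list.
    3. Remove punctuation from the input string.
    4. Split the string into words.
    5. For each word, build its character sequence, collapsing consecutive repeated characters.
    6. Check whether that character sequence is in the block-list character lists.
    7. Code: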

      import string

      block = ["damn", "shit"]
      # Each blocked word as a list of characters, e.g. ['d', 'a', 'm', 'n'].
      block_char_list = [list(word) for word in block]


      def get_block_list_flag(msg):
          """Return (True, chars) for the first word that matches a blocked word, else (False, [])."""
          # Remove punctuation (Python 3 form of str.translate).
          cleaned = msg.translate(str.maketrans("", "", string.punctuation))
          for word in cleaned.split():
              # Collapse runs of the same character: "daaamnn" -> d, a, m, n.
              chars = []
              for ch in word:
                  if not chars or ch != chars[-1]:
                      chars.append(ch)

              # Is the collapsed word in the block list?
              if chars in block_char_list:
                  return True, chars

          return False, []


      messages = [
          "True test case ? daaaammmmnnnnn.,",
          "True test case ? normal damn.,",
          "False test case ? nothing when sequance aaammdddnnn..",
          "False test case ? nothing..",
          "True test case ? Handle  daaaammaaaammnnnnn.,",
      ]
      for msg in messages:
          status, block_char = get_block_list_flag(msg)
          if status:
              print("The word `%s` from block list is presnt in input '%s'" % (''.join(block_char), msg))
          else:
              print("No word from block list is presnt in input '%s'" % msg)

      Output:

      vivek@vivek:~/Desktop/stackoverflow$ python 16.py 
      The word `damn` from block list is presnt in input 'True test case ? daaaammmmnnnnn.,'
      The word `damn` from block list is presnt in input 'True test case ? normal damn.,'
      No word from block list is presnt in input 'False test case ? nothing when sequance aaammdddnnn..'
      No word from block list is presnt in input 'False test case ? nothing..'
      No word from block list is presnt in input 'True test case ? Handle  daaaammaaaammnnnnn.,'
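
Removing punctuation and then splitting on whitespace is one way to handle da-mn and da.mn. As an alternative, a regular expression can allow repeated letters and punctuation between letters in a single pass. This is a sketch using a different technique from the code above, not a drop-in replacement; the names build_pattern and is_blocked are invented for illustration, and treating only punctuation and underscores (not spaces) as separators is an assumption.

```python
import re

block = ["damn", "shit"]

def build_pattern(word):
    """Each letter may repeat; punctuation or '_' may sit between letters."""
    letters = [re.escape(ch) + "+" for ch in word]
    # (?:[^\w\s]|_)* matches any run of punctuation/underscore, but no whitespace.
    return re.compile(r"(?:[^\w\s]|_)*".join(letters), re.IGNORECASE)

patterns = [build_pattern(word) for word in block]

def is_blocked(msg):
    """Return True if any blocked pattern is found anywhere in msg."""
    return any(p.search(msg) for p in patterns)

for msg in ["daaaaammmmnnnn", "sssssshit", "da-mn", "da.mn.", "nothing here"]:
    print(msg, is_blocked(msg))
```

Because the search is not anchored to word boundaries, a blocked word embedded in a longer word also matches; whether that is desirable depends on the bot.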
      

Answer 1: (score: 0)

from collections import OrderedDict

string = "test this string to check for sshh-iitit and dammm.mn"
# Strip the separators that may be hidden inside a word.
new_string = string.replace('-','')
final_string = new_string.replace('.','')
words = final_string.split()
for word in words:
    # Drop every repeated character, keeping first occurrences in order.
    check = "".join(OrderedDict.fromkeys(word))
    if check == "damn" or check == "shit":
        print("true")
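
Note that OrderedDict.fromkeys drops every repeated character, not just consecutive ones. The sketch below contrasts that with collapsing only consecutive runs via itertools.groupby. The helper names are hypothetical, dict.fromkeys is used on the assumption of Python 3.7+ (where plain dicts keep insertion order), and "dadman" is a made-up word chosen to show a false positive.

```python
from itertools import groupby

def drop_all_repeats(word):
    """Keep the first occurrence of each character (same idea as OrderedDict.fromkeys)."""
    return "".join(dict.fromkeys(word))

def drop_consecutive_repeats(word):
    """Collapse only runs of the same adjacent character."""
    return "".join(ch for ch, _ in groupby(word))

for word in ["dammmmn", "sshhiitit", "dadman"]:
    print(word, drop_all_repeats(word), drop_consecutive_repeats(word))
# dammmmn damn damn
# sshhiitit shit shitit
# dadman damn dadman
```

Dropping all repeats catches more disguised spellings but also flags more innocent words; collapsing only consecutive runs is stricter.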

Answer 2: (score: -1)

You should take a look at Levenshtein distance! PHP has a nice function called levenshtein(). And for Python there is at least one C implementation: https://pypi.python.org/pypi/python-Levenshtein/0.12.0
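
For illustration, here is a minimal pure-Python sketch of the Levenshtein (edit) distance, so the idea can be tried without installing the C package. It is the standard dynamic-programming formulation, not the linked library's code; is_close_to_blocked and the max_distance threshold of 1 are assumptions made for this example.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions or substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute (or keep)
        previous = current
    return previous[-1]

block = ["damn", "shit"]

def is_close_to_blocked(word, max_distance=1):
    """Hypothetical helper: True if word is within max_distance edits of a blocked word."""
    return any(levenshtein(word.lower(), blocked) <= max_distance for blocked in block)

print(levenshtein("da-mn", "damn"))           # 1
print(levenshtein("daaaaammmmnnnn", "damn"))  # 10
print(is_close_to_blocked("da.mn"))           # True
```

The distance grows with every repeated letter, so long stretched spellings fall outside a small threshold, while a harmless word such as damp is within distance 1 of damn. Edit distance alone therefore probably needs to be combined with some normalization of the input.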