Question

Checking against a sorted list of strings when some strings in the list are slightly modified.

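I am checking strings that represent the contents of some files, and I have a list of specific strings to check for. However, sometimes the same string can have an asterisk (*) appended to its end, which results in slightly modified duplicates in this list.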

Current:

# A minimal code example:
for _word in sorted(['Microsoft', 'Microsoft*']):
    print(_word)

Desired:

for _word in sorted(['Microsoft']):
    print(_word)

# But still be able to match 'Microsoft*' without keeping a duplicate entry in the list.

Final solution:

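if __name__ == '__main__':

    # Known strings, stored in lowercase so the comparison is case-insensitive.
    default_strings = sorted([
        'microsoft',
        'linux',
        'unix',
        'android'
    ])

    text = """
        Microsoft* is cool, but Linux is better.
    """

    # split() with no argument splits on any whitespace, including newlines.
    tokens = text.split()
    for token in tokens:
        token = token.lower()
        # Drop a trailing asterisk so 'microsoft*' matches 'microsoft'.
        if token.endswith('*'):
            token = token[:-1]

        if token in default_strings:
            print(token)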

Edit: If there is a better way, please let me know. Many thanks to everyone who participated and responded.
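One possible variation, offered only as a sketch: tokenize with a regular expression instead of splitting on spaces, so that a trailing asterisk or punctuation mark never ends up inside a token. The names `KNOWN` and `find_known` are hypothetical, and the sketch assumes the known strings are single lowercase words.

```python
import re

# Hypothetical set of known strings, stored in lowercase.
KNOWN = frozenset({'microsoft', 'linux', 'unix', 'android'})


def find_known(text, known=KNOWN):
    """Return the known words found in text, in order of appearance."""
    found = []
    # \w+ matches runs of letters, digits and underscores, so characters
    # such as '*', ',' and '.' are never part of a token.
    for token in re.findall(r"\w+", text.lower()):
        if token in known:
            found.append(token)
    return found


print(find_known("Microsoft* is cool, but Linux is better."))
# ['microsoft', 'linux']
```

With this approach a word followed directly by a period, such as "Linux.", is also matched, which a plain split on spaces would miss.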

Answer 1

Based on my comment; if you don't want to use a set:

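my_filters = ['*']
my_list = ['Microsoft', 'Microsoft*']
# Note: `if y in x` keeps only the items that contain a filter character,
# so 'Microsoft' is skipped and 'Microsoft*' becomes 'Microsoft'.
final_list = [x.replace(y, '') for x in my_list for y in my_filters if y in x]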

But I would say: first filter the special characters out of the input list, and then convert it to a set.
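The filter-then-set idea could look roughly like this. It is a minimal sketch; `dedupe` is a hypothetical helper name, and it assumes every character in `filters` should be removed wherever it occurs in a string, not only at the end.

```python
def dedupe(strings, filters=('*',)):
    """Strip each filter character from every string, then de-duplicate."""
    cleaned = set()
    for s in strings:
        for ch in filters:
            s = s.replace(ch, '')
        cleaned.add(s)  # the set discards repeated entries
    # Sort so the result has a stable, predictable order.
    return sorted(cleaned)


my_list = ['Microsoft', 'Microsoft*', 'Linux']
final_list = dedupe(my_list)
print(final_list)  # ['Linux', 'Microsoft']
```

Unlike a comprehension that only keeps items containing a filter character, this keeps entries such as 'Linux' that never appear with an asterisk.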

Is there a more efficient way to check strings in a list, to avoid (slightly) modified duplicates?
