Find and replace strings from a word list

Time: 2019-11-23 09:13:57

Tags: python dictionary

I have a file test.txt, a word list in which I want to find strings and substrings:

<aardwolf>
<Aargau>
<Aaronic>
<aac>
<akac>
<abaca>
<abactinal>
<abacus>  

The file test.py:

import sys  # the sys module
import os
import re
def hasattr(str,list):
    expr = re.compile(str)
    # yield the elements
    return [elem for elem in list if expr.match(elem)]

isword = {}
FH = open(sys.argv[1],'r',encoding="ISO-8859-1")
for strLine in FH.readlines():  isword.setdefault(''.join(sorted(strLine[1:strLine.find('>')].upper())),[]).append(strLine[:-1])
print (isword)
basestring=str()
for ARGV in sys.argv[2:]:
    print ("\n*** %s\n" %ARGV )#print Argv

diffpatletters = re.compile(u'[a-zA-Z]').findall(ARGV.upper())
#print (diffpatletters)
diffpat = '.*' + '(.*)'.join(sorted(diffpatletters)) + '.*'
#print (diffpat)
for KEY in hasattr(diffpat,isword.keys()):
#       print (KEY)
       SUBKEY = KEY
       for X in diffpatletters:
         #print (X)
         SUBKEY1 = SUBKEY.replace(X,'')
          #print (SUBKEY)
       if SUBKEY1 in isword:
           #print (SUBKEY)
           basestring+=  "%s -> %s" %(isword[KEY], isword[SUBKEY1])
print (basestring + "\n")

This is how I run the file from the command line:

python test.py test.txt  aack aadfl

I expect it to find the matching strings and substrings for every argument from the second one onward, but my basestring is not printed.

1 answer:

Answer 0 (score: 7)

Do you have to use regular expressions? If not, is this the kind of result you want?

First, read the word list into a list of lines:

with open('test.txt', 'r') as f:
    s = f.read()
s = s.split('\n')
s

Out[1]:
['<aardwolf>',
 '<Aargau>',
 '<Aaronic>',
 '<aac>',
 '<akac>',
 '<abaca>',
 '<abactinal>',
 '<abacus>  ']

For a list-type result:

ARGVs = ['aard', 'onic', 'abacu']

matches = [x for x in s for arg in ARGVs if arg.lower() in x.lower()]
print(matches)

Out[2]:
['<aardwolf>', '<Aaronic>', '<abacus>  ']
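
If a regular expression is preferred after all, roughly the same lookup can be written with the standard re module. This is only a minimal sketch under a few assumptions: the search strings are treated as literal text (hence re.escape), matching is case-insensitive, and the word list is inlined instead of being read from test.txt. The names words, args, pattern and regex_matches are illustrative and do not come from the question.

```python
import re

# Inlined copy of the word list shown above (normally read from test.txt).
words = ['<aardwolf>', '<Aargau>', '<Aaronic>', '<aac>',
         '<akac>', '<abaca>', '<abactinal>', '<abacus>  ']
args = ['aard', 'onic', 'abacu']

# One alternation of the escaped search strings, matched case-insensitively.
pattern = re.compile('|'.join(re.escape(a) for a in args), re.IGNORECASE)

# Keep every word in which at least one search string occurs.
regex_matches = [w for w in words if pattern.search(w)]
print(regex_matches)
```

Unlike the nested comprehension, this lists a word once even when several search strings occur in it.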

For a dictionary-type result:

ARGVs = ['aard', 'onic', 'abacu', 'aaro', 'ac']

{key:[x for x in s if key in x] for key in ARGVs if len([x for x in s if key in x]) != 0}

Out[3]:

{'aard': ['<aardwolf>'],
 'onic': ['<Aaronic>'],
 'abacu': ['<abacus>  '],
 'ac': ['<aac>', '<akac>', '<abaca>', '<abactinal>', '<abacus>  ']}
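
As for why basestring stays empty: in the script as posted, the lines after the first print are not indented under the for ARGV loop, so only the last argument is processed, and SUBKEY1 is rebuilt from the unchanged SUBKEY on every pass, so only the last letter ends up removed. If the goal is what the script appears to attempt (find words that contain all letters of an argument, then look up the words spelled by the leftover letters), a minimal sketch with collections.Counter could look like this. That reading of the goal is an assumption, and letter_key and find_subwords are hypothetical helper names.

```python
from collections import Counter, defaultdict

def letter_key(text):
    """Sorted upper-case ASCII letters of text, e.g. '<akac>' -> 'AACK'."""
    return ''.join(sorted(c for c in text.upper() if 'A' <= c <= 'Z'))

def find_subwords(words, query):
    """Pair each group of words containing all letters of query with the
    words spelled by the letters that remain after removing the query."""
    by_key = defaultdict(list)
    for word in words:
        by_key[letter_key(word)].append(word.strip())
    need = Counter(letter_key(query))
    pairs = []
    for key, group in by_key.items():
        have = Counter(key)
        if need - have:  # some query letter is missing from this key
            continue
        # Remove each query letter once (multiset difference), then re-sort.
        rest = ''.join(sorted((have - need).elements()))
        if rest in by_key:
            pairs.append((group, by_key[rest]))
    return pairs

# Inlined copy of the word list from the question.
words = ['<aardwolf>', '<Aargau>', '<Aaronic>', '<aac>',
         '<akac>', '<abaca>', '<abactinal>', '<abacus>  ']

for group, sub in find_subwords(words, 'k'):
    print('%s -> %s' % (group, sub))
```

With this word list the argument k pairs <akac> with <aac>. The arguments from the question, aack and aadfl, produce no pairs under this reading, because no word is spelled by the letters left over.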