Finding abbreviations in a string in Python

Time: 2013-12-04 13:52:51

Tags: python

Let's assume we have some possible character combinations:

mystr = 'NRWTD'
my2str = 'RAWBC'

What I know so far is this mapping:

vdCacheType = {'AWB' : 'Always WriteBack', 'WB': 'Write Back',
               'NR': 'No Read Ahead', 'Ra': 'Read Ahead Adaptive',
               'WT': 'Write Through',  'R' : 'Read Ahead Always',
               'D': 'Direct IO', 'C': 'Cached' }

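As you can see, each string is a combination of abbreviations, and each abbreviation is one or more characters long. My question is how to take such a string and check whether its character combinations can be found in the dictionary.

I have already tried: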

for x in vdCacheType:
    if x in mystr:
        print(x)  # here I would save the found abbreviation in a list for later use
        mystr = mystr.strip(x)

The problem is that for NRWTD it finds:

Found Char:  R
New String:  NRWTD
Found Char:  WT
New String:  NRWTD
Found Char:  NR
New String:  WTD
Found Char:  D
New String:  WT

My intention is to return:

  

No Read Ahead, Write Through, Direct IO

instead of NRWTD. If there is a better way to solve this, I would appreciate it. Anyway, thanks!

2 Answers:

Answer 0 (score: 5)

Match the longest abbreviations first:

vdCacheType = {'AWB' : 'Always WriteBack', 'WB': 'Write Back',
               'NR': 'No Read Ahead', 'Ra': 'Read Ahead Adaptive',
               'WT': 'Write Through',  'R' : 'Read Ahead Always',
               'D': 'Direct IO', 'C': 'Cached' }

import re
rx = re.compile('|'.join(sorted(vdCacheType, key=len, reverse=True)))
print(', '.join(vdCacheType[m] for m in rx.findall('NRWTD')))
# No Read Ahead, Write Through, Direct IO

RAWBC comes out as: Read Ahead Always, Always WriteBack, Cached
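
To illustrate why the keys are sorted longest first, here is a minimal sketch. A regex alternation takes the first alternative that matches at a given position, so a short key that is a prefix of a longer one (here 'R' and 'Ra') wins if it is listed earlier. The helper name `find_abbreviations` and the input 'RaC' are hypothetical, chosen only for the demonstration, and `re.escape` is added as a precaution in case a key ever contains a regex metacharacter.

```python
import re

# Same mapping as in the answer (note the mixed-case key 'Ra').
vdCacheType = {'AWB': 'Always WriteBack', 'WB': 'Write Back',
               'NR': 'No Read Ahead', 'Ra': 'Read Ahead Adaptive',
               'WT': 'Write Through', 'R': 'Read Ahead Always',
               'D': 'Direct IO', 'C': 'Cached'}

def find_abbreviations(text, keys):
    """Return the keys matched in text, trying alternatives in the given order."""
    pattern = re.compile('|'.join(re.escape(k) for k in keys))
    return pattern.findall(text)

# Two orderings of the same keys (hypothetical, for comparison only).
shortest_first = sorted(vdCacheType, key=len)
longest_first = sorted(vdCacheType, key=len, reverse=True)

print(find_abbreviations('RaC', shortest_first))  # ['R', 'C']  ('a' is skipped)
print(find_abbreviations('RaC', longest_first))   # ['Ra', 'C']
```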

Adjust this for case sensitivity, and for whether the whole text must consist entirely of known abbreviations.
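
One possible way to make both adjustments is sketched below. It assumes all keys are stored in upper case (so the 'RA' spelling is used here rather than 'Ra'), matches case-insensitively with `re.IGNORECASE`, and rejects any input in which `findall` had to skip a character. The function name `expand_strict` is hypothetical.

```python
import re

# Assumption: every key is upper case, so a match can be looked up via .upper().
vdCacheType = {'AWB': 'Always WriteBack', 'WB': 'Write Back',
               'NR': 'No Read Ahead', 'RA': 'Read Ahead Adaptive',
               'WT': 'Write Through', 'R': 'Read Ahead Always',
               'D': 'Direct IO', 'C': 'Cached'}

# Longest keys first, matched without regard to case.
_token = re.compile(
    '|'.join(re.escape(k) for k in sorted(vdCacheType, key=len, reverse=True)),
    re.IGNORECASE)

def expand_strict(text):
    """Return the expanded names, or None if text contains anything unknown."""
    tokens = _token.findall(text)
    # findall silently skips unmatched characters; if the matched pieces do
    # not add up to the whole input, something was skipped.
    if ''.join(tokens) != text:
        return None
    return [vdCacheType[t.upper()] for t in tokens]

print(expand_strict('nrwtd'))  # ['No Read Ahead', 'Write Through', 'Direct IO']
print(expand_strict('NRXD'))   # None, because 'X' is not a known abbreviation
```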

Answer 1 (score: 1)

Jon Clemens's solution is correct, but here is another one.

I had to list the keys separately to preserve their order. If I list them with vdCacheType.keys(), they come out in this order: ['R', 'C', 'WT', 'WB', 'NR', 'AWB', 'D', 'RA'], which will not work.

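str.strip() does not work here: the abbreviations are not separated by spaces, and strip(x) only trims the characters of x from both ends of the string, never from the middle.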
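
A short demonstration of that behaviour, using the string from the question. The variable names are hypothetical; the last line shows `str.replace` with a count of 1 as one way to remove a single occurrence of a substring instead.

```python
s = 'NRWTD'

# strip(chars) treats its argument as a set of characters to trim from the ends.
stripped_r = s.strip('R')     # 'R' is at neither end, so nothing is removed
stripped_nr = s.strip('NR')   # leading 'N' and 'R' are trimmed
stripped_d = s.strip('D')     # trailing 'D' is trimmed
stripped_wt = s.strip('WT')   # 'WT' sits in the middle, so nothing is removed

print(stripped_r)    # NRWTD
print(stripped_nr)   # WTD
print(stripped_d)    # NRWT
print(stripped_wt)   # NRWTD

# Removing the first occurrence of a substring, wherever it is:
removed_wt = s.replace('WT', '', 1)
print(removed_wt)    # NRD
```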

vdCacheType = {'AWB': 'Always WriteBack', 'WB': 'Write Back',
               'NR': 'No Read Ahead', 'RA': 'Read Ahead Adaptive',
               'WT': 'Write Through', 'R': 'Read Ahead Always',
               'D': 'Direct IO', 'C': 'Cached'}

# Keys listed explicitly so that longer abbreviations are tried first
vdCacheKeys = ['AWB', 'WB', 'NR', 'RA', 'WT', 'R', 'D', 'C']

mystr = 'NRWTD'
my2str = 'RAWBC'

listAbbr = []  # abbreviations found, saved for later use
result = ''

print(vdCacheType.keys())
for x in vdCacheKeys:
    if x in mystr:
        listAbbr.append(x)
        index = mystr.find(x)
        # Blank out the matched part so it cannot be matched again
        mystr = mystr[:index] + ' ' + mystr[index + len(x):]
        print(mystr)
        result += vdCacheType[x] + ', '
print(result)

The last line printed is: No Read Ahead, Write Through, Direct IO,
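
Note that the loop above reports matches in the order of the key list, not in the order they occur in the string; for 'RAWBC' it would report 'AWB' before 'R'. As a hedged alternative, the sketch below scans the string from left to right and takes the longest key that matches at each position, so the result follows the string. The function name `expand_in_order` is hypothetical, and raising ValueError on unknown characters is a design choice for this example, not something the original answer does.

```python
vdCacheType = {'AWB': 'Always WriteBack', 'WB': 'Write Back',
               'NR': 'No Read Ahead', 'RA': 'Read Ahead Adaptive',
               'WT': 'Write Through', 'R': 'Read Ahead Always',
               'D': 'Direct IO', 'C': 'Cached'}

def expand_in_order(text, table):
    """Scan text left to right, always taking the longest key that matches."""
    keys = sorted(table, key=len, reverse=True)
    found = []
    pos = 0
    while pos < len(text):
        for key in keys:
            if text.startswith(key, pos):
                found.append(table[key])
                pos += len(key)
                break
        else:
            # No key matches at this position.
            raise ValueError('unknown abbreviation at position %d in %r' % (pos, text))
    return found

print(', '.join(expand_in_order('NRWTD', vdCacheType)))
# No Read Ahead, Write Through, Direct IO
print(', '.join(expand_in_order('RAWBC', vdCacheType)))
# Read Ahead Adaptive, Write Back, Cached
```

Joining the list with ', ' also avoids the trailing comma that string concatenation leaves behind.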