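Python: finding distinct abbreviations for strings

Time: 2018-04-17 19:41:59

Tags: python

I have a list of strings that I want to abbreviate to their shortest distinct form, that is, the shortest prefix of each string that no other string in the list shares.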

Original strings:

topology
track
translate
trunk
tunnel
ucse
udp
usb
user-group

Abbreviated strings:

to
trac
tran
tru
tu
uc
ud
usb
use

How can I do this in Python 3?

2 answers:

Answer 0 (score: 0)

from collections import Counter

words = """\
topology
track
translate
trunk
tunnel
ucse
udp
usb
user-group""".splitlines()


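def prefixes(word):
    # Yield every prefix of word, shortest first, ending with word itself.
    for i in range(1, len(word) + 1):
        yield word[:i]


def main():
    # Count how many words in the list start with each prefix.
    prefix_counts = Counter()
    for word in words:
        prefix_counts.update(prefixes(word))
    for word in words:
        for prefix in prefixes(word):
            # A count of 1 means only this word starts with the prefix.
            if prefix_counts[prefix] == 1:
                print(prefix)
                break
        else:
            # No unique prefix: word is a prefix of another word
            print(word)


main()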
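
The same counting idea can also be wrapped in a function that returns a mapping instead of printing. This is only a minimal sketch: the name `shortest_unique_prefixes` and the shortened sample list are made up for illustration. The extra word `use` is there to show the fallback case, where a word is itself a prefix of another word.

```python
from collections import Counter


def shortest_unique_prefixes(words):
    """Map each word to its shortest prefix that no other word shares.

    Hypothetical helper name, not part of the code above.
    """
    # Count how many words start with each possible prefix.
    counts = Counter(
        word[:i] for word in words for i in range(1, len(word) + 1)
    )
    result = {}
    for word in words:
        # Take the first prefix that occurs exactly once; fall back to the
        # full word when every prefix is shared with some other word.
        result[word] = next(
            (word[:i] for i in range(1, len(word) + 1) if counts[word[:i]] == 1),
            word,
        )
    return result


sample = ["topology", "track", "translate", "usb", "user-group", "use"]
abbrevs = shortest_unique_prefixes(sample)
print(abbrevs)
```

Here `use` keeps its full spelling because each of its prefixes is also a prefix of `user-group`, which in turn needs `user` to be told apart from it.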

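Answer 1 (score: 0)

Not the most efficient approach, since it checks each prefix of every word against every other word, but a list comprehension can do it: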

myList = [
    'topology',
    'track',
    'translate',
    'trunk',
    'tunnel',
    'ucse',
    'udp',
    'usb',
    'user-group'
]

abbrevs = [
    next(
        word[:k] for k in range(1, len(word) + 1)
        # Fall back to the whole word if no shorter prefix is unique.
        if k == len(word) or not any(other.startswith(word[:k])
                                     for other in myList if word != other)
    )
    for word in myList
]

print(abbrevs)
# ['to', 'trac', 'tran', 'tru', 'tu', 'uc', 'ud', 'usb', 'use']
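
If the quadratic comparison becomes a problem for long lists, one possible alternative (a different technique from the comprehension above, sketched here under my own assumptions) is to sort the words first. In sorted order, the word sharing the longest prefix with a given word is always one of its two neighbours, so only those two need to be compared. The function name `abbreviate` is hypothetical; `os.path.commonprefix` is used purely as a character-wise common-prefix helper.

```python
from os.path import commonprefix


def abbreviate(words):
    """Shortest distinguishing prefix per word, using sorted neighbours.

    Hypothetical helper; returns a dict, so duplicate words collapse.
    """
    ordered = sorted(words)
    result = {}
    for i, word in enumerate(ordered):
        # Only the previous and next word in sorted order can share the
        # longest prefix with this word.
        neighbours = ordered[max(i - 1, 0):i] + ordered[i + 1:i + 2]
        shared = max(
            (len(commonprefix([word, n])) for n in neighbours),
            default=0,
        )
        # One character past the shared part; slicing caps this at the
        # full word when the word is a prefix of a neighbour.
        result[word] = word[:shared + 1]
    return result


words = ['topology', 'track', 'translate', 'trunk', 'tunnel',
         'ucse', 'udp', 'usb', 'user-group']
result = abbreviate(words)
print([result[w] for w in words])
# ['to', 'trac', 'tran', 'tru', 'tu', 'uc', 'ud', 'usb', 'use']
```

The sort dominates the cost, so this should scale roughly as O(n log n) string comparisons rather than comparing every pair of words.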