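The word "marine" is made up of five consecutive, overlapping state postal abbreviations: Massachusetts (MA), Arkansas (AR), Rhode Island (RI), Indiana (IN), and Nebraska (NE). Find a seven-letter word with the same property.
I am using Python and have loaded a list of about 5,000 words. As a first step, I want to find the words that contain 5 state abbreviations.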
def puzzleH(word):
    states = ['al', 'ak', 'az', 'ar', 'ca', 'co', 'ct', 'dc', 'de', 'fl', 'ga',
              'hi', 'id', 'il', 'in', 'ia', 'ks', 'ky', 'la', 'me', 'md',
              'ma', 'mi', 'mn', 'ms', 'mo', 'mt', 'ne', 'nv', 'nh', 'nj',
              'nm', 'ny', 'nc', 'nd', 'oh', 'ok', 'or', 'pa', 'ri', 'sc',
              'sd', 'tn', 'tx', 'ut', 'vt', 'va', 'wa', 'wv', 'wi', 'wy']
    # Count how many distinct state codes occur anywhere in the word
    checker = 0
    for st in states:
        if st in word:
            checker += 1
    if checker == 5:
        # ...still thinking...
        #pos = (i for i,st in enumerate(word) if st in states)
        #for i in pos: print(i)
        return word

# Main program
ListH = []
for word in wordList:
    if puzzleH(word) is not None:
        ListH.append(puzzleH(word))
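Once I have the words that contain 5 state abbreviations, I plan to find the index of each abbreviation and compare the list of indices with [0,1,2,3,4], [1,2,3,4,5], or [2,3,4,5,6]. But I don't know how to do that. Any efficient new algorithm is welcome. Thank you.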
--Tuan--
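The index comparison described in the question can be sketched as follows. This is only one possible reading of the idea: `state_positions` and `has_consecutive_run` are hypothetical helper names, the word is assumed to be lowercase, and the positions are collected by sliding a two-letter window over the word rather than by searching for each abbreviation in turn.

```python
# Two-letter postal codes, lowercase, taken from the list in the question.
STATES = set(
    "al ak az ar ca co ct dc de fl ga hi id il in ia ks ky la me md "
    "ma mi mn ms mo mt ne nv nh nj nm ny nc nd oh ok or pa ri sc "
    "sd tn tx ut vt va wa wv wi wy".split()
)

def state_positions(word):
    """Return the start index of every two-letter window that is a state code."""
    return [i for i in range(len(word) - 1) if word[i:i + 2] in STATES]

def has_consecutive_run(word, run=5):
    """True if `run` state codes start at consecutive indices, e.g. [1, 2, 3, 4, 5]."""
    positions = state_positions(word)  # already sorted and free of duplicates
    for k in range(len(positions) - run + 1):
        window = positions[k:k + run]
        if window == list(range(window[0], window[0] + run)):
            return True
    return False
```

With this sketch, "marine" yields the positions [0, 1, 2, 3, 4] and passes, while "windowpane" contains five codes at [0, 1, 2, 6, 8] and is rejected, which is why counting codes alone is not enough.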
Answer 0 (score: 1)
Why not use word.find( st ) instead of st in word? It returns the index of the match, or -1 if there is no match. Then just store the indices that were found:
def puzzleH( word ):
    states = ['al', 'ak', 'az', 'ar', 'ca', 'co', 'ct', 'dc', 'de', 'fl', 'ga',
              'hi', 'id', 'il', 'in', 'ia', 'ks', 'ky', 'la', 'me', 'md',
              'ma', 'mi', 'mn', 'ms', 'mo', 'mt', 'ne', 'nv', 'nh', 'nj',
              'nm', 'ny', 'nc', 'nd', 'oh', 'ok', 'or', 'pa', 'ri', 'sc',
              'sd', 'tn', 'tx', 'ut', 'vt', 'va', 'wa', 'wv', 'wi', 'wy']

    found_list = []
    for st in states:
        position = word.find( st )
        if ( position != -1 ):
            found_list.append( ( st, position ) )  # <-- Keep the state code + position

    if ( len( found_list ) >= 5 ):
        print("[%s]: " % ( word ) )
        for state, position in found_list:
            print( " \"%s\" at %d" % ( state, position ) )

for word in [ 'marine', 'desert', 'dessert', 'icecream', 'chocolate', 'ohmmeter', 'comically' ]:
    puzzleH( word )
Which gives:
$ python3 ./state_find.py
[marine]:
"ar" at 1
"in" at 3
"ma" at 0
"ne" at 4
"ri" at 2
Edit: tested against the Linux dictionary file:
words = open( '/usr/share/dict/words.pre-dictionaries-common', 'rt' ).read().split('\n')
for word in words:
    if ( word.find( "'" ) == -1 ):
        puzzleH( word )
This gives a lot of results:
# (just the tail ...)
[windowpane]:
"in" at 1
"ne" at 8
"nd" at 2
"pa" at 6
"wi" at 0
[windowpanes]:
"in" at 1
"ne" at 8
"nd" at 2
"pa" at 6
"wi" at 0
[windstorms]:
"in" at 1
"ms" at 8
"nd" at 2
"or" at 6
"wi" at 0
[windward]:
"ar" at 5
"in" at 1
"nd" at 2
"wa" at 4
"wi" at 0
Oh, "philandering" is a good one:
[philandering]:
"de" at 6
"hi" at 1
"il" at 2
"in" at 9
"la" at 3
"nd" at 5
"ri" at 8
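Edit: It seems I did not read the specification as carefully as I should have. The word must be made up entirely of overlapping state codes.
Here is a version that fixes that. It builds pairs of adjacent letters from the input word and checks each pair against the state codes. If a pair matches, it records the state code and its position (same as before); if any pair fails to match, the word is rejected.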
def puzzleH( word ):
    states = ['al', 'ak', 'az', 'ar', 'ca', 'co', 'ct', 'dc', 'de', 'fl', 'ga',
              'hi', 'id', 'il', 'in', 'ia', 'ks', 'ky', 'la', 'me', 'md',
              'ma', 'mi', 'mn', 'ms', 'mo', 'mt', 'ne', 'nv', 'nh', 'nj',
              'nm', 'ny', 'nc', 'nd', 'oh', 'ok', 'or', 'pa', 'ri', 'sc',
              'sd', 'tn', 'tx', 'ut', 'vt', 'va', 'wa', 'wv', 'wi', 'wy']

    found_list = []
    for i in range( len( word ) - 1 ):
        two_letters = word[i] + word[i+1]
        if ( two_letters in states ):
            found_list.append( ( two_letters, i ) )
        else:
            found_list = []
            break  # word needs to be made of all state-codes

    if ( len( found_list ) >= 5 ):
        print("[%s]: " % ( word ) )
        for state, position in found_list:
            print( " \"%s\" at %d" % ( state, position ) )

words = open( '/usr/share/dict/words.pre-dictionaries-common', 'rt' ).read().split('\n')
for word in words:
    if ( word.find( "'" ) == -1 ):
        puzzleH( word )
The longest one found is:
[malarial]:
"ma" at 0
"al" at 1
"la" at 2
"ar" at 3
"ri" at 4
"ia" at 5
"al" at 6
Interestingly, in the whole 73,000-word dictionary there are only 4 such words (>= 5 codes).
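To narrow the search to the seven-letter word the puzzle asks for, the all-pairs check can return its result instead of printing it, so the caller can filter by length. The following is a minimal sketch under that assumption; `state_codes` and `seven_letter_matches` are hypothetical names, not part of the code above, and words are lowercased before checking.

```python
# Two-letter postal codes, lowercase, same entries as the list above.
STATES = set(
    "al ak az ar ca co ct dc de fl ga hi id il in ia ks ky la me md "
    "ma mi mn ms mo mt ne nv nh nj nm ny nc nd oh ok or pa ri sc "
    "sd tn tx ut vt va wa wv wi wy".split()
)

def state_codes(word):
    """Return the overlapping codes if every adjacent pair is a state code, else None."""
    pairs = [word[i:i + 2] for i in range(len(word) - 1)]
    if pairs and all(pair in STATES for pair in pairs):
        return pairs
    return None

def seven_letter_matches(words):
    """Keep only the seven-letter words made up entirely of state codes."""
    return [w for w in words
            if len(w) == 7 and state_codes(w.lower()) is not None]
```

Using a set for the codes makes each membership test constant time, which matters little for one word but adds up over a full dictionary file.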
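Answer 1 (score: 0)
You can build a dict that maps each state abbreviation to its index, then iterate over the adjacent letter pairs of the given word by zipping the word with itself offset by 1. Look up each pair in the dict and, if it is found, append the corresponding index to the output list. If any pair is missing, the loop breaks and the function returns None: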
state_indices = {state: index for index, state in enumerate(states)}

def puzzleH(word):
    indices = []
    for pair in zip(word, word[1:]):
        candidate = ''.join(pair)
        if candidate not in state_indices:
            break
        indices.append(state_indices[candidate])
    else:
        # Runs only if the loop finished without hitting break
        return indices

for word in wordList:
    indices = puzzleH(word)
    if indices is not None:
        print(word, indices)
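When only a yes/no answer is needed, the same zip pairing can be collapsed into a single `all(...)` expression. This is a sketch rather than a replacement for the function above: `is_all_state_pairs` is a hypothetical name, and the minimum-length check is an assumption about the desired behaviour. A word shorter than two letters has no pairs, so the loop-based version returns an empty list for it, whereas this sketch rejects it.

```python
# Two-letter postal codes, lowercase, same entries as the list in the question.
STATES = set(
    "al ak az ar ca co ct dc de fl ga hi id il in ia ks ky la me md "
    "ma mi mn ms mo mt ne nv nh nj nm ny nc nd oh ok or pa ri sc "
    "sd tn tx ut vt va wa wv wi wy".split()
)

def is_all_state_pairs(word):
    """True if the word has at least one pair and every adjacent pair is a state code."""
    # zip(word, word[1:]) yields ('m', 'a'), ('a', 'r'), ... for "marine"
    return len(word) >= 2 and all(a + b in STATES for a, b in zip(word, word[1:]))
```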
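Answer 2 (score: 0)
If the goal is only to identify words that contain five consecutive, overlapping state postal abbreviations, you can try the following: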
states = ['al', 'ak', 'az', 'ar', 'ca', 'co', 'ct', 'dc', 'de', 'fl', 'ga',
          'hi', 'id', 'il', 'in', 'ia', 'ks', 'ky', 'la', 'me', 'md',
          'ma', 'mi', 'mn', 'ms', 'mo', 'mt', 'ne', 'nv', 'nh', 'nj',
          'nm', 'ny', 'nc', 'nd', 'oh', 'ok', 'or', 'pa', 'ri', 'sc',
          'sd', 'tn', 'tx', 'ut', 'vt', 'va', 'wa', 'wv', 'wi', 'wy']

def check_word(word):
    if len(word) != 7:
        return False
    counter = 0
    for i in range(0, len(word) - 1):
        abv = word[i] + word[i + 1]
        if abv in states:
            counter += 1
            if counter == 5:
                return True
        else:
            # A non-matching pair breaks the run of consecutive codes
            counter = 0
    return False

for w in wordList:
    print("{0} : {1}".format(w, check_word(w)))
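Because `check_word` only accepts seven-letter words, it returns False for the six-letter example "marine". A variant that reports the length of the longest run of consecutive codes leaves both the length filter and the threshold to the caller. This is a sketch under that assumption, and `longest_state_run` is a hypothetical name.

```python
# Two-letter postal codes, lowercase, same entries as the list above.
STATES = set(
    "al ak az ar ca co ct dc de fl ga hi id il in ia ks ky la me md "
    "ma mi mn ms mo mt ne nv nh nj nm ny nc nd oh ok or pa ri sc "
    "sd tn tx ut vt va wa wv wi wy".split()
)

def longest_state_run(word):
    """Return the length of the longest run of consecutive, overlapping state codes."""
    best = current = 0
    for i in range(len(word) - 1):
        if word[i:i + 2] in STATES:
            current += 1
            best = max(best, current)
        else:
            current = 0  # a non-matching pair ends the current run
    return best
```

For example, "marine" scores 5, while "windward" scores only 3 (wi, in, nd) even though it contains five codes in total.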