The problem: given a list of strings and a document, find the shortest substring of the document that contains every string in the list.
Thus, for:
document = 'many google employees can program can google employees because google is a technology company that writes program'
searchTerms = ['google', 'program', 'can']
the output would be:
can google employees because google is a technology company that writes program
This is my approach: split the document into its suffixes (a suffix tree), check each suffix for all of the strings, and return the shortest one.
This is my code:
[code block missing from the source]
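This was an online submission, and it failed one test case. I don't know what the test case was. My questions: is there a logic error in the code, and is there a more efficient approach?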
Answer 0 (score: 2)
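You can split this into two parts. First, find the shortest substring that satisfies some property. Let's pretend we already have a function that tests for that property: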
def find_shortest_ss(document, some_property):
    # First level of looping gradually increases substring length
    # (up to and including the whole document)
    for x in range(len(document) + 1):
        # Second level of looping tests current length at valid positions
        for y in range(len(document) - x + 1):
            if some_property(document[y:x+y]):
                return document[y:x+y]
    # How to handle the case of no match is undefined
    raise ValueError('No matching value found')
Now for the property we actually want to test:
def contains_all_terms(terms):
    return (lambda s: all(term in s for term in terms))
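This function takes some terms and returns a lambda expression: a function that, when evaluated on a string, returns True if and only if all of the terms are in that string. It is basically a more concise version of a nested function definition, which you could write like this: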
def contains_all_terms(terms):
    def string_contains_them(s):
        return all(term in s for term in terms)
    return string_contains_them
So we are really just returning a handle to the function we created on the fly inside the contains_all_terms function.
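To make the closure idea concrete, here is a minimal sketch of calling the returned function directly. The name has_both and the sample strings are only illustrative and are not part of the original answer.

```python
def contains_all_terms(terms):
    # The inner function keeps a reference to `terms` (a closure).
    def string_contains_them(s):
        return all(term in s for term in terms)
    return string_contains_them

# has_both is a hypothetical name for the returned predicate.
has_both = contains_all_terms(['google', 'can'])

print(has_both('google employees can program'))  # True
print(has_both('google employees'))              # False
```

The predicate can be built once and then reused for every candidate substring, which is exactly how find_shortest_ss uses it.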
To put the pieces together, we do this:
>>> find_shortest_ss(document, contains_all_terms(searchTerms))
'program can google'
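This code has some efficiency benefits:
The all builtin function short-circuits, which means it returns False as soon as it finds a term that is not contained in the substring.
It checks all of the shortest substrings first, then increases the substring length one character at a time. As soon as it finds a satisfying substring, it exits and returns that value. So you are guaranteed that the returned value is never longer than necessary, and it never even does any work on substrings longer than necessary.
At 8 lines of code, that's not bad, I think.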
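The short-circuit behaviour can be observed with a small sketch like the one below. The helper contains and the list checked are hypothetical names added only to record which terms actually get tested.

```python
checked = []  # records every term that was actually tested

def contains(term, s):
    checked.append(term)
    return term in s

terms = ['google', 'program', 'can']

# 'google' is missing, so all() stops after the first test.
result = all(contains(term, 'many employees') for term in terms)

print(result)   # False
print(checked)  # ['google']
```

Note that this only works because a generator expression is passed to all; a list comprehension would evaluate every test before all ever sees the results.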
Answer 1 (score: 0)
Brute force is O(n³), so why not:
[code block missing from the source]
But you can do this faster. For example, any relevant substring can only end with one of the keywords.
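One possible way to use that observation is sketched below. It is not the answerer's code (which is not available here), and the function names find_occurrences and shortest_window are hypothetical. The idea: only positions where some keyword occurrence ends are considered as window ends, and for each such end the latest occurrence of every keyword that still fits is looked up by binary search.

```python
from bisect import bisect_right

def find_occurrences(document, term):
    """Return the start index of every (possibly overlapping) occurrence."""
    starts = []
    pos = document.find(term)
    while pos != -1:
        starts.append(pos)
        pos = document.find(term, pos + 1)
    return starts

def shortest_window(document, terms):
    """Return the shortest substring containing every term, or None."""
    if not terms:
        return ''
    starts_by_term = {t: find_occurrences(document, t) for t in terms}
    if any(not starts for starts in starts_by_term.values()):
        return None  # some term never occurs
    # A shortest window must end where some occurrence ends.
    ends = sorted({s + len(t) for t, starts in starts_by_term.items()
                   for s in starts})
    best = None
    for end in ends:
        window_start = end
        for t, starts in starts_by_term.items():
            # Latest occurrence of t that finishes at or before `end`.
            i = bisect_right(starts, end - len(t))
            if i == 0:
                window_start = None  # t does not fit before `end`
                break
            window_start = min(window_start, starts[i - 1])
        if window_start is None:
            continue
        if best is None or end - window_start < best[1] - best[0]:
            best = (window_start, end)
    return document[best[0]:best[1]]

document = ('many google employees can program can google employees '
            'because google is a technology company that writes program')
print(shortest_window(document, ['google', 'program', 'can']))
# program can google
```

With E keyword occurrences and T terms, the main loop does roughly E × T binary searches, instead of testing every substring of the document.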
Answer 2 (score: 0)
Instead of brute-forcing all possible substrings, I brute-force all possible positions of the matching words... it should be a bit faster.
import numpy as np
from itertools import product

document = 'many google employees can program can google employees because google is a technology company that writes program'
searchTerms = ['google', 'program']

word_lists = []
for word in searchTerms:
    word_positions = []
    start = 0  # starting index of str.find()
    while True:
        # search up to the end of the document; an end index of -1
        # would skip a match that finishes on the last character
        start = document.find(word, start)
        if start == -1:  # no more instances
            break
        word_positions.append([start, start + len(word)])  # beginning and ending index of search term
        start += 1  # increment starting search position
    word_lists.append(word_positions)  # add all search term positions to list of all search terms

minLen = len(document)
lower = 0
upper = len(document)
for p in product(*word_lists):  # one [start, end] pair per search term
    indexes = np.array(p).flatten()  # take all indices into flat list
    lowerI = np.min(indexes)
    upperI = np.max(indexes)
    indexRange = upperI - lowerI  # determine length of substring
    if indexRange < minLen:
        minLen = indexRange
        lower = lowerI
        upper = upperI

print(document[lower:upper])