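Once again I need Stack Overflow's wise advice. I'm not sure the title describes exactly what I want to know.
Here is the problem.
There are two groups of words, and I need to know whether a string contains one (or more) words from group A and also a word from group B. Like this: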
Group_A = ['nice','car','by','shop']
Group_B = ['no','thing','great']
t_string_A = 'there is a car over there'
t_string_B = 'no one is in a car'
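t_string_A contains 'car' from Group_A but nothing from Group_B, so it should return... I don't know, let's say 0. t_string_B contains 'car' from Group_A and 'no' from Group_B, so it should return 1.
Right now I do this in a rather primitive way, with a pile of code like this:
if 'nice' in t_string_A and 'no' in t_string_A:
    return 1
But as you can see, as group A or group B grows I would have to write far too many of these combinations. That is certainly not efficient.
Thanks for your help and attention :D Thanks in advance!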
Answer 0 (score: 5)
You can use sets:
Group_A = set(('nice','car','by','shop'))
Group_B = set(('no','thing','great'))
t_string_A = 'there is a car over there'
t_string_B = 'no one is in a car'
def test(string):
    # Turn the string into a set of words, then intersect it with each group
    s = set(string.split())
    if Group_A & s and Group_B & s:
        return 1
    else:
        return 0
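What should the result be if the string contains no word from Group_A and none from Group_B?
Depending on your phrases, this version may make the test more efficient, because any() stops at the first matching word: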
def test(string):
    s = string.split()
    if any(word in Group_A for word in s) and any(word in Group_B for word in s):
        return 1
    else:
        return 0
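Building on the set idea above, here is a minimal sketch that accepts any number of groups and ignores case and surrounding punctuation. The name has_word_from_each and the normalization step are assumptions of mine, not part of the answer, and the group words are assumed to be lowercase already.

```python
import string

def has_word_from_each(text, *groups):
    """Return 1 if text has at least one whole word from every group, else 0."""
    # Lowercase and strip surrounding punctuation so 'Car.' still counts as 'car'.
    words = {w.strip(string.punctuation).lower() for w in text.split()}
    # isdisjoint() is False as soon as the two collections share an element.
    return int(all(not words.isdisjoint(group) for group in groups))

Group_A = {'nice', 'car', 'by', 'shop'}
Group_B = {'no', 'thing', 'great'}

print(has_word_from_each('there is a car over there', Group_A, Group_B))  # 0
print(has_word_from_each('no one is in a car', Group_A, Group_B))         # 1
```

Adding a third group is then just one more argument in the call.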
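Answer 1 (score: 1)
You can use itertools.product to generate every possible pair of words from the two groups. Then you loop over the list of strings: if both words of some pair occur in a string, the result for that string is True, otherwise it is False.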
import itertools as it
Group_A = ['저는', '저희는', '우리는']
Group_B = ['입니다','라고 합니다']
strings = [ '저는 학생입니다.', '저희는 회사원들 입니다.' , '이 것이 현실 입니다.', '우리는 배고파요.' , '우리는 밴디스트라고 합니다.']
# Get all possible combinations of words from the groups
z = list(it.product(Group_A, Group_B))
results = []
# Run through the list of strings
for s in strings:
    flag = False
    for item in z:
        # If both words of the pair are present in the string, flag is True
        if item[0] in s and item[1] in s:
            flag = True
            break
    # Append the result to the results list
    results.append(flag)
print(results)
The result will look like
[True, True, False, False, True]
Also, for the input below
Group_A = ['thing']
Group_B = ['car']
strings = ['there is a thing in a car', 'Nothing is in a car','Something happens to my car']
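the result will be [True, True, True], because the in operator tests for substrings, so 'thing' is also found inside 'Nothing' and 'Something'.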
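If whole-word matches are what you want for the English input, one option is a regular expression with word boundaries. This is only a sketch: compile_group and has_pair are hypothetical helper names, and the groups are assumed to be non-empty. It would not suit the Korean example, where endings such as '입니다' are attached directly to the preceding word and the substring test is the behaviour you need.

```python
import re

def compile_group(words):
    # \b anchors each alternative at word boundaries;
    # re.escape protects words that contain regex metacharacters.
    return re.compile(r'\b(?:' + '|'.join(map(re.escape, words)) + r')\b')

def has_pair(text, pattern_a, pattern_b):
    # True only if each group contributes at least one whole-word match.
    return bool(pattern_a.search(text)) and bool(pattern_b.search(text))

pat_a = compile_group(['thing'])
pat_b = compile_group(['car'])
strings = ['there is a thing in a car', 'Nothing is in a car', 'Something happens to my car']
print([has_pair(s, pat_a, pat_b) for s in strings])  # [True, False, False]
```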
Answer 2 (score: 1)
Group_A = ['nice','car','by','shop']
Group_B = ['no','thing','great']
from collections import defaultdict
group_a=defaultdict(int)
group_b=defaultdict(int)
# Mark every group word with 1; any other key defaults to 0
for i in Group_A:
    group_a[i]=1
for i in Group_B:
    group_b[i]=1
t_string_A = 'there is a car over there'
t_string_B = 'no one is in a car'
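def fun2(string):
    # Split the string on spaces by hand, skipping empty pieces
    l=[]
    past=0
    for i in range(len(string)):
        if string[i]==' ':
            if string[past:i]!='':
                l.append(string[past:i])
            past=i+1
    # Also keep the last word, which is not followed by a space
    if string[past:]!='':
        l.append(string[past:])
    return l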
def fun(string,dic):
    # Return 1 if any word of the string is marked in dic, else 0
    for i in fun2(string):
    # for i in string.split():
        if dic.get(i):
            return 1
    return 0
# Print 1 only if the same string has a word from both groups
for s in (t_string_A, t_string_B):
    if fun(s,group_a) and fun(s,group_b):
        print(1)
    else:
        print(0)
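Answer 3 (score: 0)
Aho-Corasick is an efficient dictionary-matching algorithm that locates all patterns in a text simultaneously in O(p + q + r), where p = the total length of the patterns, q = the length of the text, and r = the number of matches returned.
You probably want to run two separate state machines at the same time, and you need to modify them so that they terminate on the first match.
I took a stab at the modification, starting from this Python implementation: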
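class AhoNode(object):
    def __init__(self):
        self.goto = {}          # outgoing transitions: symbol -> AhoNode
        self.is_match = False   # True if some pattern ends at this state
        self.fail = None        # failure link (None only for the root)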
def aho_create_forest(patterns):
    # Build a trie of all patterns
    root = AhoNode()
    for path in patterns:
        node = root
        for symbol in path:
            node = node.goto.setdefault(symbol, AhoNode())
        node.is_match = True
    return root
def aho_create_statemachine(patterns):
    root = aho_create_forest(patterns)
    queue = []
    for node in root.goto.values():
        queue.append(node)
        node.fail = root
    # Breadth-first traversal, so shallower failure targets are final first
    while queue:
        rnode = queue.pop(0)
        for key, unode in rnode.goto.items():
            queue.append(unode)
            fnode = rnode.fail
            while fnode is not None and key not in fnode.goto:
                fnode = fnode.fail
            unode.fail = fnode.goto[key] if fnode else root
            # A state also matches if its failure target matches
            unode.is_match = unode.is_match or unode.fail.is_match
    return root
def aho_any_match(s, root):
    node = root
    for i, c in enumerate(s):
        while node is not None and c not in node.goto:
            node = node.fail
        if node is None:
            node = root
            continue
        node = node.goto[c]
        # Stop at the first match instead of collecting all of them
        if node.is_match:
            return True
    return False
def all_any_matcher(*pattern_lists):
    ''' Returns an efficient matcher function that takes a string
    and returns True if at least one pattern from each pattern list
    is found in it.
    '''
    machines = [aho_create_statemachine(patterns) for patterns in pattern_lists]
    def matcher(text):
        return all(aho_any_match(text, m) for m in machines)
    return matcher
And to use it:
patterns_a = ['nice','car','by','shop']
patterns_b = ['no','thing','great']
matcher = all_any_matcher(patterns_a, patterns_b)
text_1 = 'there is a car over there'
text_2 = 'no one is in a car'
for text in (text_1, text_2):
    print('%r - %s' % (text, matcher(text)))
This prints
'there is a car over there' - False
'no one is in a car' - True
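When testing a matcher like this, a brute-force reference with the same substring semantics can be handy for comparing results. The following is a sketch only: naive_all_any_matcher is a name I made up, all patterns are assumed to be non-empty, and it scans the text once per pattern, so it does not have the automaton's running time.

```python
def naive_all_any_matcher(*pattern_lists):
    """Brute-force stand-in: True if every list has a pattern occurring in the text."""
    def matcher(text):
        # 'p in text' is a plain substring test, like the automaton's matching.
        return all(any(p in text for p in patterns) for patterns in pattern_lists)
    return matcher

reference = naive_all_any_matcher(['nice', 'car', 'by', 'shop'],
                                  ['no', 'thing', 'great'])
for text in ('there is a car over there', 'no one is in a car'):
    print('%r - %s' % (text, reference(text)))
```

For the two sample texts it should print the same two lines as above.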
Answer 4 (score: 0)
You can loop over the words and check whether any of them is in the string:
from typing import List

def has_word(string: str, words: List[str]) -> bool:
    # Substring test: True as soon as one word occurs in the string
    for word in words:
        if word in string:
            return True
    return False
This function can easily be modified to give a has_all_words version as well.
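One possible shape for that variant, as a sketch: has_all_words below is my guess at what is meant, written with all(), next to an any()-based has_word so that the block runs on its own. Both use the same substring test as the loop above.

```python
from typing import List

def has_word(string: str, words: List[str]) -> bool:
    # any() stops at the first word that occurs in the string.
    return any(word in string for word in words)

def has_all_words(string: str, words: List[str]) -> bool:
    # all() stops at the first word that is missing from the string.
    return all(word in string for word in words)

Group_A = ['nice', 'car', 'by', 'shop']
Group_B = ['no', 'thing', 'great']

for s in ('there is a car over there', 'no one is in a car'):
    # The question's check: a word from Group_A and a word from Group_B.
    print(int(has_word(s, Group_A) and has_word(s, Group_B)))
```

This prints 0 for the first string and 1 for the second.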