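Given two sentences and a mapping of synonymous words, I want to write a program that determines whether the sentences are similar. Two sentences are similar if they have the same number of words and every pair of corresponding words are synonyms. The program must handle the symmetric and transitive relationships between synonyms. For example, if the synonym map is:
[("a", "b"), ("a", "c"), ("a", "d"), ("b", "e"), ("f", "e"), ("g", "h")]
then the sentences "a e g" and "f c h" are similar. Example:
Input: S1: "a e g" S2: "f c h"
Map: [("a", "b"), ("a", "c"), ("a", "d"), ("b", "e"), ("f", "e"), ("g", "h")]
Output: True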
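Explanation: "a" and "f" are synonyms because "a" and "b" are synonyms, "f" and "e" are synonyms, and "e" and "b" are synonyms. Similarly, "e" and "c" are synonyms, and "g" and "h" are synonyms.
I have tried writing code for this, but I cannot work out the logic for comparing the two strings.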
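Answer 0 (score: 0)
Here is one possibility. Since synonymy here is symmetric and transitive, you have an equivalence relation, so you can think in terms of the equivalence classes it forms. Basically, you build sets so that all words that are synonyms of each other end up together. Or, if you prefer to think in terms of graphs, you can treat the tuples as edges of an undirected graph and find its connected components.
You can do it like this: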
def make_word_classes(synom):
    # List of classes (sets of mutually synonymous words)
    word_classes = []
    for w1, w2 in synom:
        # Start a new class from the current pair
        merged = {w1, w2}
        # Classes that share no word with the pair are kept unchanged
        remaining = []
        for wcls in word_classes:
            # If one of the words is already in the current class
            if w1 in wcls or w2 in wcls:
                # Absorb it, so a pair that links two existing classes joins them
                merged |= wcls
            else:
                remaining.append(wcls)
        remaining.append(merged)
        word_classes = remaining
    return word_classes
synom = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "e"), ("f", "e"), ("g", "h")]
word_classes = make_word_classes(synom)
print(word_classes)
# [{'a', 'c', 'b', 'd', 'f', 'e'}, {'h', 'g'}]
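For the graph view mentioned above, here is a minimal sketch of finding the connected components with a breadth-first search. The function name `connected_components` is made up for illustration and is not part of the code above; it returns the same kind of list of sets, so the result can be used wherever `word_classes` is expected.

```python
from collections import defaultdict, deque

def connected_components(pairs):
    # Build an undirected adjacency map: each pair is an edge in both directions
    graph = defaultdict(set)
    for w1, w2 in pairs:
        graph[w1].add(w2)
        graph[w2].add(w1)
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        # Breadth-first search collects every word reachable from start
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            word = queue.popleft()
            for neighbour in graph[word]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components
```

Because the traversal follows edges in both directions, the order in which the synonym pairs are listed does not affect which words end up in the same set.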
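With that, it is easy to check whether two sentences are equivalent. You only need to check that each pair of corresponding words is either equal or belongs to the same equivalence class: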
def sentences_are_equivalent(s1, s2, word_classes):
    # Split into words
    l1 = s1.split()
    l2 = s2.split()
    # If they have different sizes they are different
    if len(l1) != len(l2):
        return False
    # Go through each pair of corresponding words
    for w1, w2 in zip(l1, l2):
        # If it is the same word then it is okay
        if w1 == w2:
            continue
        # Go through list of word classes
        for wcls in word_classes:
            # If both words are in the same class it is okay
            if w1 in wcls and w2 in wcls:
                # Continue to next pair of words
                break
        else:  # Note this else goes with the for loop, not the if block
            # If no class contains the pair of words
            return False
    return True
s1 = "a e g"
s2 = "f c h"
print(sentences_are_equivalent(s1, s2, word_classes))
# True
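If the synonym list is large, scanning every class for every word pair can get slow. One alternative, sketched here under the assumption that a plain dictionary is acceptable as storage, is a union-find (disjoint-set) structure, where each word points towards a representative of its class. The names `find`, `build_parent` and `sentences_similar` are hypothetical and only for illustration.

```python
def find(parent, word):
    # Words never seen in a synonym pair become their own representative
    parent.setdefault(word, word)
    # Walk up to the root, shortening the path along the way (path halving)
    while parent[word] != word:
        parent[word] = parent[parent[word]]
        word = parent[word]
    return word

def build_parent(pairs):
    parent = {}
    for w1, w2 in pairs:
        # Union: point the root of one word's class at the root of the other's
        parent[find(parent, w1)] = find(parent, w2)
    return parent

def sentences_similar(s1, s2, parent):
    l1, l2 = s1.split(), s2.split()
    # Sentences of different lengths cannot be similar
    if len(l1) != len(l2):
        return False
    # Corresponding words must share a representative (identical words always do)
    return all(find(parent, w1) == find(parent, w2) for w1, w2 in zip(l1, l2))

synom = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "e"), ("f", "e"), ("g", "h")]
parent = build_parent(synom)
print(sentences_similar("a e g", "f c h", parent))
# True
```

Each union joins two whole classes at once, so a pair that connects two previously separate groups is handled without any extra merging step.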