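Is there a way to simplify this pile of if statements? This parse function definitely works (given the right dictionaries), but it has to test 6 if statements for every word in the input. For a 5-word sentence, that is 30 if statements. It is also hard to read.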
    def parse(text):
        predicate=False
        directObjectAdjective=False
        directObject=False
        preposition=False
        indirectObjectAdjective=False
        indirectObject=False
        text=text.casefold()
        text=text.split()
        for word in text:
            if not predicate:
                if word in predicateDict:
                    predicate=predicateDict[word]
                    continue
            if not directObjectAdjective:
                if word in adjectiveDict:
                    directObjectAdjective=adjectiveDict[word]
                    continue
            if not directObject:
                if word in objectDict:
                    directObject=objectDict[word]
                    continue
            if not preposition:
                if word in prepositionDict:
                    preposition=prepositionDict[word]
                    continue
            if not indirectObjectAdjective:
                if word in adjectiveDict:
                    indirectObjectAdjective=adjectiveDict[word]
                    continue
            if not indirectObject:
                if word in objectDict:
                    indirectObject=objectDict[word]
                    continue
        if not directObject and directObjectAdjective:
            directObject=directObjectAdjective
            directObjectAdjective=False
        if not indirectObject and indirectObjectAdjective:
            indirectObject=indirectObjectAdjective
            indirectObjectAdjective=False
        return [predicate,directObjectAdjective,directObject,preposition,indirectObjectAdjective,indirectObject]
In case it is needed, here is a sample of one of the dictionaries.
    predicateDict={
        "grab":"take",
        "pick":"take",
        "collect":"take",
        "acquire":"take",
        "snag":"take",
        "gather":"take",
        "attain":"take",
        "capture":"take",
        "take":"take"}
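Answer 0 (score: 2)
This is more of a Code Review question than a Stack Overflow one. A major issue is that you keep similar data in separate variables. If you combine the variables, you can iterate over them.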
    missing_parts_of_speech = ["predicate", [...]]
    dict_look_up = {"predicate":predicateDict,
                    [...]
                   }
    found_parts_of_speech = {}
    for word in text:
        for part in missing_parts_of_speech:
            if word in dict_look_up[part]:
                found_parts_of_speech[part] = dict_look_up[part][word]
                missing_parts_of_speech.remove(part)
                # break, not continue: stop at the first match so the word fills
                # only one slot and the list is not iterated after being modified
                break
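For reference, here is a self-contained sketch of the idea above. The contents of adjectiveDict, objectDict and prepositionDict are made up for illustration (only the "grab"/"take" entries come from the question), the names SLOTS and parse_slots are hypothetical, and the question's final adjective-to-object fallback is left out for brevity.

```python
# Hypothetical sample dictionaries; only "grab"/"take" come from the question.
predicateDict = {"grab": "take", "take": "take"}
adjectiveDict = {"red": "red", "shiny": "shiny"}
objectDict = {"ball": "ball", "box": "box"}
prepositionDict = {"from": "from", "in": "in"}

# Slot order matters: earlier slots get first pick of each word.
SLOTS = [
    ("predicate", predicateDict),
    ("directObjectAdjective", adjectiveDict),
    ("directObject", objectDict),
    ("preposition", prepositionDict),
    ("indirectObjectAdjective", adjectiveDict),
    ("indirectObject", objectDict),
]

def parse_slots(text):
    found = {}
    missing = list(SLOTS)  # slots that have not been filled yet
    for word in text.casefold().split():
        for slot in missing:
            name, lookup = slot
            if word in lookup:
                found[name] = lookup[word]
                missing.remove(slot)
                break  # leave the inner loop right after modifying the list
    return found

print(parse_slots("Grab the red ball from the shiny box"))
```

Each word now costs at most one membership test per unfilled slot, and adding a part of speech means adding one entry to SLOTS.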
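Answer 1 (score: 1)
I would suggest simply using the dict.get method. It takes an optional default argument. Passing it avoids a KeyError: if the key is not in the dictionary, the default value is returned instead.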
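If you pass the previously assigned variable as the default, a missing key does not overwrite it with an arbitrary value; the variable simply keeps the value it already holds. For example, if the current word is a predicate, directObject is reassigned the value already stored in it, so it does not change.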
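A minimal sketch of that behaviour, using a two-entry subset of the question's predicateDict and a made-up word list:

```python
predicateDict = {"grab": "take", "take": "take"}  # subset of the question's dictionary

predicate = False
for word in ["the", "grab", "ball"]:  # made-up input
    # Unknown words return the default, i.e. the value already stored.
    predicate = predicateDict.get(word, predicate)

print(predicate)  # take
```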
Code
    def parse(text):
        predicate = False
        directObjectAdjective = False
        directObject = False
        preposition = False
        indirectObjectAdjective = False
        indirectObject = False
        text=text.casefold()
        text=text.split()
        for word in text:
            predicate = predicateDict.get(word, predicate)
            directObjectAdjective = adjectiveDict.get(word, directObjectAdjective)
            directObject = objectDict.get(word, directObject)
            preposition = prepositionDict.get(word, preposition)
            indirectObjectAdjective = adjectiveDict.get(word, indirectObjectAdjective)
            indirectObject = objectDict.get(word, indirectObject)
        if not directObject and directObjectAdjective:
            directObject = directObjectAdjective
            directObjectAdjective = False
        if not indirectObject and indirectObjectAdjective:
            indirectObject = indirectObjectAdjective
            indirectObjectAdjective = False
        return [predicate, directObjectAdjective, directObject, preposition, indirectObjectAdjective, indirectObject]
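PS: Use more whitespace. Your readers will thank you...
PPS: I have not tested this, because I do not have such dictionaries at hand.
PPPS: This will always return the last occurrence of each type in the text, whereas your implementation always returns the first occurrence.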
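To make the first-versus-last difference concrete, here is a small sketch with a hypothetical objectDict. The `or` variant is one possible way to keep the first match; it is not part of the answer, and it still lacks the `continue` of the original loop, so a single word could fill several slots.

```python
objectDict = {"ball": "ball", "box": "box"}  # hypothetical contents
words = ["ball", "box"]

last = False
first = False
for word in words:
    last = objectDict.get(word, last)             # overwritten by every match
    first = first or objectDict.get(word, False)  # kept once it is truthy

print(first, last)  # ball box
```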
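Answer 2 (score: 1)
You can map the different kinds of words (as strings) to the dictionaries in which to look those words up, then check only the kinds that have not been found yet and see whether the word is in their dictionary.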
    needed = {"predicate": predicateDict,
              "directObjectAdjective": adjectiveDict,
              "directObject": objectDict,
              "preposition": prepositionDict,
              "indirectObjectAdjective": adjectiveDict,
              "indirectObject": objectDict}
    for word in text:
        for kind in needed:
            # a dict value means this kind has not been found yet
            if isinstance(needed[kind], dict) and word in needed[kind]:
                needed[kind] = needed[kind][word]
                # break, not continue, so one word fills only one kind
                break
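At the end (and at every step along the way), every item in needed that no longer has a dict as its value has been found and replaced with the value from its respective dict.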
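(In hindsight, it might make more sense to use two dictionaries, or a dictionary and a set: one holding the final value for each kind of word, the other tracking whether that kind has been found yet. That would probably be a bit easier to grasp.)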
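A sketch of that two-structure variant might look like the following. The dictionary contents are hypothetical apart from the "grab"/"take" entries, and the names lookups, result, found and parse_kinds are made up for this example.

```python
# Hypothetical sample dictionaries; only "grab"/"take" come from the question.
predicateDict = {"grab": "take", "take": "take"}
adjectiveDict = {"red": "red", "shiny": "shiny"}
objectDict = {"ball": "ball", "box": "box"}
prepositionDict = {"from": "from"}

# Where to look up each kind of word; insertion order sets the priority.
lookups = {
    "predicate": predicateDict,
    "directObjectAdjective": adjectiveDict,
    "directObject": objectDict,
    "preposition": prepositionDict,
    "indirectObjectAdjective": adjectiveDict,
    "indirectObject": objectDict,
}

def parse_kinds(text):
    result = dict.fromkeys(lookups, False)  # final value for each kind
    found = set()                           # kinds that are already filled
    for word in text.casefold().split():
        for kind, lookup in lookups.items():
            if kind not in found and word in lookup:
                result[kind] = lookup[word]
                found.add(kind)
                break  # one word fills only one kind
    return result

print(parse_kinds("Take the red ball"))
```

Keeping the lookups untouched means the same mapping can be reused for every call, which the in-place version cannot do.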
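Answer 3 (score: 0)
I suggest writing this code with a new pattern instead of the old one. The new pattern is 9 lines long and stays at 9 lines: you just add more dictionaries to D. The old pattern is already 11 lines long and grows by 4 lines for every additional dictionary to test.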
    aDict = { "a1" : "aa1", "a2" : "aa1" }
    bDict = { "b1" : "bb1", "b2" : "bb2" }
    text = ["a1", "b2", "a2", "b1"]
    # old pattern
    a = False
    b = False
    for word in text:
        if not a:
            if word in aDict:
                a = aDict[word]
                continue
        if not b:
            if word in bDict:
                b = bDict[word]
                continue
    print(a, b)  # aa1 bb2
    # new pattern
    D = [ aDict, bDict]
    A = [ False for _ in D]
    for word in text:
        for i, a in enumerate(A):
            if not a:
                if word in D[i]:
                    A[i] = D[i][word]
                    # break (not continue) moves on to the next word,
                    # matching the continue in the old pattern
                    break
    print(A)  # ['aa1', 'bb2']
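If the pattern is needed in more than one place, it could be wrapped in a small function. This is only a sketch: the name fill_slots and the third dictionary cDict are assumptions added for illustration, not part of the answer.

```python
def fill_slots(words, dicts):
    """Return, per dictionary, the value of the first word found in it (or False)."""
    values = [False] * len(dicts)
    for word in words:
        for i, lookup in enumerate(dicts):
            if not values[i] and word in lookup:
                values[i] = lookup[word]
                break  # this word is used up; go on to the next word
    return values

aDict = {"a1": "aa1", "a2": "aa1"}
bDict = {"b1": "bb1", "b2": "bb2"}
cDict = {"c1": "cc1"}  # hypothetical extra dictionary

print(fill_slots(["a1", "b2", "a2", "b1"], [aDict, bDict]))   # ['aa1', 'bb2']
print(fill_slots(["c1", "b1", "a2"], [aDict, bDict, cDict]))  # ['aa1', 'bb1', 'cc1']
```

Adding a dictionary only changes the list passed in; the loop itself stays the same length.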