Question

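Is there a way to simplify this pile of if statements? This parse function certainly works (given the right dictionaries), but it has to test 6 if statements for every word in the input. For a 5-word sentence, that is 30 if statements. It is also hard to read.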

def parse(text):
    predicate=False
    directObjectAdjective=False
    directObject=False
    preposition=False
    indirectObjectAdjective=False
    indirectObject=False
    text=text.casefold()
    text=text.split()
    for word in text:
        if not predicate:
            if word in predicateDict:
                predicate=predicateDict[word]
                continue
        if not directObjectAdjective:
            if word in adjectiveDict:
                directObjectAdjective=adjectiveDict[word]
                continue
        if not directObject:
            if word in objectDict:
                directObject=objectDict[word]
                continue
        if not preposition:
            if word in prepositionDict:
                preposition=prepositionDict[word]
                continue
        if not indirectObjectAdjective:
            if word in adjectiveDict:
                indirectObjectAdjective=adjectiveDict[word]
                continue
        if not indirectObject:
            if word in objectDict:
                indirectObject=objectDict[word]
                continue
    if not directObject and directObjectAdjective:
        directObject=directObjectAdjective
        directObjectAdjective=False
    if not indirectObject and indirectObjectAdjective:
        indirectObject=indirectObjectAdjective
        indirectObjectAdjective=False
    return [predicate,directObjectAdjective,directObject,preposition,indirectObjectAdjective,indirectObject]

Here is a sample of the dictionaries, in case it is needed.

predicateDict={
"grab":"take",
"pick":"take",
"collect":"take",
"acquire":"take",
"snag":"take",
"gather":"take",
"attain":"take",
"capture":"take",
"take":"take"}

Answer 1

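This is more of a Code Review question than a Stack Overflow one. A major issue is that you keep similar data in separate variables. If you combine the variables, you can iterate over them.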

missing_parts_of_speech = ["predicate", [...]]
dict_look_up = {"predicate": predicateDict,
                [...]
                }
found_parts_of_speech = {}
for word in text:
    for part in missing_parts_of_speech:
        if word in dict_look_up[part]:
            found_parts_of_speech[part] = dict_look_up[part][word]
            missing_parts_of_speech.remove(part)
            # break, not continue: stop at the first match for this word, which
            # also makes it safe to remove from the list being iterated
            break
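
For reference, here is a self-contained sketch of this approach with all six slots filled in. The tiny dictionaries and the names `SLOT_DICTS` and `parse_slots` are made up for illustration; the slot names follow the question. The adjective-to-object fallback at the end of the question's function is left out to keep the sketch short.

```python
# Hypothetical sample dictionaries, for illustration only.
predicateDict = {"grab": "take", "take": "take"}
adjectiveDict = {"red": "red", "big": "big"}
objectDict = {"ball": "ball", "box": "box"}
prepositionDict = {"in": "in", "into": "in"}

# Slot name -> dictionary to consult, in the order the slots are tried.
# Dicts preserve insertion order (Python 3.7+).
SLOT_DICTS = {
    "predicate": predicateDict,
    "directObjectAdjective": adjectiveDict,
    "directObject": objectDict,
    "preposition": prepositionDict,
    "indirectObjectAdjective": adjectiveDict,
    "indirectObject": objectDict,
}

def parse_slots(text):
    missing = list(SLOT_DICTS)  # slots not filled yet
    found = {}                  # slot -> value
    for word in text.casefold().split():
        for slot in missing:
            if word in SLOT_DICTS[slot]:
                found[slot] = SLOT_DICTS[slot][word]
                missing.remove(slot)
                break  # each word fills at most one slot
    return found

print(parse_slots("Grab the red ball into the big box"))
```

Because the inner loop exits right after `remove`, the list is never iterated again after being modified, and "red" fills only the direct-object adjective while "big" later fills the indirect one.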

Answer 2

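I suggest simply using the dict.get method. It takes an optional default argument. By passing it, you avoid a KeyError: if the key is not in the dictionary, the default value is returned.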

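If you pass the previously assigned variable as the default, the variable is not overwritten with an arbitrary value but keeps the correct one. For example, if the current word is a "predicate", the "direct object" is replaced with the value that was already stored in the variable.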
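
A minimal illustration of how `dict.get` behaves with a default. The one-entry dictionary and the words are placeholders:

```python
predicateDict = {"grab": "take"}  # placeholder dictionary

predicate = False
predicate = predicateDict.get("the", predicate)   # key absent: stays False
predicate = predicateDict.get("grab", predicate)  # key present: becomes "take"
predicate = predicateDict.get("the", predicate)   # key absent: stays "take"
print(predicate)
```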

Code

def parse(text):
    predicate = False
    directObjectAdjective = False
    directObject = False
    preposition = False
    indirectObjectAdjective = False
    indirectObject = False

    text=text.casefold()
    text=text.split()
    for word in text:
        predicate = predicateDict.get(word, predicate)
        directObjectAdjective = adjectiveDict.get(word, directObjectAdjective)
        directObject = objectDict.get(word, directObject)
        preposition = prepositionDict.get(word, preposition)
        indirectObjectAdjective = adjectiveDict.get(word, indirectObjectAdjective)
        indirectObject = objectDict.get(word, indirectObject)

    if not directObject and directObjectAdjective:
        directObject = directObjectAdjective
        directObjectAdjective = False

    if not indirectObject and indirectObjectAdjective:
        indirectObject = indirectObjectAdjective
        indirectObjectAdjective = False

    return [predicate, directObjectAdjective, directObject, preposition, indirectObjectAdjective, indirectObject]

PS: Use more whitespace. Readers will thank you...

PPS: I have not tested this, because I don't have such dictionaries at hand.

PPPS: This will always return the last occurrence of each type in the text, whereas your implementation always returns the first occurrence.
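
To make the first-versus-last difference concrete, the sketch below runs both strategies on a sentence containing two predicates. The dictionary contents and the helper names `first_match` and `last_match` are hypothetical:

```python
predicateDict = {"grab": "take", "drop": "release"}  # made-up entries

def first_match(words, table):
    """Keep the first hit, as the if-based code in the question does."""
    result = False
    for word in words:
        if not result and word in table:
            result = table[word]
    return result

def last_match(words, table):
    """Overwrite on every hit, as the dict.get version does."""
    result = False
    for word in words:
        result = table.get(word, result)
    return result

words = "grab it then drop it".split()
print(first_match(words, predicateDict))  # take
print(last_match(words, predicateDict))   # release
```

A related difference: the dict.get version consults adjectiveDict and objectDict twice per word, so a single adjective or object ends up in both the direct and the indirect slot.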

Answer 3

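You can map the different kinds of words (as strings) to the dictionaries in which those words are looked up, then check only the kinds that have not been found yet and see whether the word is in their dictionaries.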

needed = {"predicate": predicateDict,
          "directObjectAdjective": adjectiveDict,
          "directObject": objectDict,
          "preposition": prepositionDict,
          "indirectObjectAdjective": adjectiveDict,
          "indirectObject": objectDict}

for word in text:
    for kind in needed:
        if isinstance(needed[kind], dict) and word in needed[kind]:
            needed[kind] = needed[kind][word]
            break  # continue would just try the next kind and let one word fill several

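In the end (and at each step along the way), all the items in needed that do not have a dict as their value have been found and replaced with the value from their respective dict.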

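(In retrospect, it might make more sense to use two dictionaries, or one dictionary and a set: one for the final value of each kind of word, and another for whether it has already been found. That would probably be a bit easier to grasp.)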
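
One way the dictionary-plus-set variant could look, as a sketch rather than a tested drop-in. The sample dictionaries and the names `LOOKUP` and `parse_kinds` are assumptions made for illustration:

```python
# Hypothetical sample dictionaries, for illustration only.
predicateDict = {"grab": "take"}
adjectiveDict = {"red": "red", "big": "big"}
objectDict = {"ball": "ball", "box": "box"}
prepositionDict = {"into": "in"}

# Kind of word -> dictionary to look it up in (never mutated).
LOOKUP = {
    "predicate": predicateDict,
    "directObjectAdjective": adjectiveDict,
    "directObject": objectDict,
    "preposition": prepositionDict,
    "indirectObjectAdjective": adjectiveDict,
    "indirectObject": objectDict,
}

def parse_kinds(text):
    found = {}             # kind -> final value
    pending = set(LOOKUP)  # kinds not found yet
    for word in text.casefold().split():
        # Iterate LOOKUP (ordered), not the set, so kinds are tried in order.
        for kind, table in LOOKUP.items():
            if kind in pending and word in table:
                found[kind] = table[word]
                pending.discard(kind)
                break  # one word fills at most one kind
    return found, pending

found, pending = parse_kinds("grab red ball")
print(found)
print(sorted(pending))
```

Keeping the lookup tables separate from the results means the mapping can be reused across calls, and no isinstance check is needed to tell a found value from a dictionary.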

Answer 4

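I suggest you write this code using a new pattern instead of the old one. The new pattern has 9 lines and stays at 9 lines; just add more dictionaries to D. The old pattern already has 11 lines and grows by 4 lines for every additional dictionary to test.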

aDict = { "a1" : "aa1", "a2" : "aa1" }
bDict = { "b1" : "bb1", "b2" : "bb2" }
text = ["a1", "b2", "a2", "b1"]
# old pattern
a = False
b = False
for word in text:
    if not a:
        if word in aDict:
            a = aDict[word]
            continue
    if not b:
        if word in bDict:
            b = bDict[word]
            continue
print(a, b)
# new pattern
D = [ aDict, bDict]
A = [ False for _ in D]
for word in text:
    for i, a in enumerate(A):
        if not a:
            if word in D[i]:
                A[i] = D[i][word]
                break  # continue would only move on to the next dictionary
print(A)
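
The new pattern can also be wrapped in a function so it is reusable for any list of dictionaries. This is a sketch; the name `first_matches` is made up here, and `break` is used so that one word fills only one slot:

```python
def first_matches(words, dicts):
    """For each dictionary in dicts, return the first value matched by a word.

    Slots that never match stay False. A word is consumed by the first
    unfilled slot whose dictionary contains it.
    """
    results = [False] * len(dicts)
    for word in words:
        for i, table in enumerate(dicts):
            if not results[i] and word in table:
                results[i] = table[word]
                break  # this word is used up; go to the next word
    return results

aDict = {"a1": "aa1", "a2": "aa1"}
bDict = {"b1": "bb1", "b2": "bb2"}

print(first_matches(["a1", "b2", "a2", "b1"], [aDict, bDict]))  # ['aa1', 'bb2']
# The same dictionary may appear twice, as adjectiveDict does in the question.
print(first_matches(["b1", "b2"], [bDict, bDict]))              # ['bb1', 'bb2']
```

In the second call the `break` matters: without it, "b1" would fill both slots at once.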

Simplifying many if statements
