Question

I am working on a program that takes a nested list and returns a new list with the proper nouns taken out.

Here is an example:

L = [['The', 'name', 'is', 'James'], ['Where', 'is', 'the', 'treasure'], ['Bond', 'cackled', 'insanely']]

I want to return:

['the', 'name', 'is', 'is', 'the', 'treasure', 'cackled', 'insanely']

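Note that 'where' is removed. That is fine, because it does not appear anywhere else in the nested lists. Each nested list is a sentence. My approach is to append the first element of every nested list to newList. Then I check whether the elements of newList appear in the nested lists, lowercasing each element of newList before the check. I am halfway done with this program, but I run into an error when I try to remove the element from newList at the end. Once I have the updated list, I want to delete the items in newList from the nested lists. Finally, I would append all the items in the nested lists to newList and lowercase them. That should do it.

If anyone has a more efficient approach, I would be glad to hear it.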

def lowerCaseFirst(L):
    newList = []
    for nestedList in L:
        newList.append(nestedList[0])
    print newList

    for firstWord in newList:
        sum = 0
        firstWord = firstWord.lower()
        for nestedList in L:
            for word in nestedList[1:]:
                if firstWord == word:
                    print "yes"

                    sum = sum + 1
            print newList
        if sum >= 1:
            firstWord = firstWord.upper()
            newList.remove(firstWord)
    return newList

Note that this code is unfinished because of the error in the second-to-last line.

Here is the version with a newer list (updatedNewList):

def lowerCaseFirst(L):
    newList = []
    for nestedList in L:
        newList.append(nestedList[0])
    print newList
    updatedNewList = newList
    for firstWord in newList:
        sum = 0
        firstWord = firstWord.lower()
        for nestedList in L:
            for word in nestedList[1:]:
                if firstWord == word:
                    print "yes"

                    sum = sum + 1
            print newList
        if sum >= 1:
            firstWord = firstWord.upper()
            updatedNewList.remove(firstWord)
    return updatedNewList

Error message:

Traceback (most recent call last):
  File "/Applications/WingIDE.app/Contents/MacOS/src/debug/tserver/_sandbox.py", line 1, in <module>
    # Used internally for debug sandbox under external interpreter
  File "/Applications/WingIDE.app/Contents/MacOS/src/debug/tserver/_sandbox.py", line 80, in lowerCaseFirst
ValueError: list.remove(x): x not in list

Answer 1

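The error in the first function occurs because you try to remove the fully uppercased version of firstWord from newList, which contains no all-uppercase words (as you can see from the printout). Remember that you store the upper/lower version of the word in a new variable, but that does not change the contents of the original list.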
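To make the cause of the ValueError concrete, here is a minimal sketch written for Python 3 (the question's code uses Python 2 print statements). The names new_list, first_word and shouted are made up for illustration, and using capitalize() to get back the original spelling is only an assumption that the stored words have exactly one leading capital letter.

```python
# The first words collected from the three example sentences.
new_list = ['The', 'Where', 'Bond']

first_word = new_list[0].lower()   # 'the'; new_list itself is unchanged
shouted = first_word.upper()       # 'THE', which is not the stored 'The'

error = None
try:
    new_list.remove(shouted)       # 'THE' is not in the list
except ValueError as exc:
    error = str(exc)               # same kind of error as in the traceback

# capitalize() yields 'The', which does match the stored element.
new_list.remove(first_word.capitalize())
```

After this runs, new_list holds only 'Where' and 'Bond', and error holds the message from the failed remove call.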

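I still do not understand your approach, though. From your description of the task, you want to 1) flatten a list of lists into a list of elements (always a fun programming exercise) and 2) remove the proper nouns from that list. This means you have to decide what counts as a proper noun. You can do this rudimentarily (every capitalized word that does not start a sentence, or an exhaustive list), or you can use a POS tagger (see: Finding Proper Nouns using NLTK WordNet). Unless I completely misunderstand your task, you do not have to worry about casing here.

The first task can be solved in many ways. Here is one that nicely illustrates what actually happens in the simple case where L is a list of lists (rather than a list that can be nested arbitrarily deep):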
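As a rough sketch of the rudimentary capitalization rule, the following Python 3 function drops every capitalized word that does not start a sentence, and keeps a sentence-initial word (lowercased) only if its lowercase form also shows up somewhere else. The function name remove_proper_nouns and that tie-breaking rule for first words are assumptions taken from the expected output in the question, not a general definition of a proper noun.

```python
def remove_proper_nouns(sentences):
    """Heuristic filter: capitalization decides what is a proper noun."""
    # Lowercase words seen anywhere except at the start of a sentence.
    known_common = {w for s in sentences for w in s[1:] if w.islower()}

    result = []
    for sentence in sentences:
        for position, word in enumerate(sentence):
            if position == 0:
                # A first word is always capitalized, so keep it only if
                # its lowercase form occurs elsewhere as a common word.
                if word.lower() in known_common:
                    result.append(word.lower())
            elif word.islower():
                # Inside a sentence, capitalized words are treated as names.
                result.append(word)
    return result


L = [['The', 'name', 'is', 'James'],
     ['Where', 'is', 'the', 'treasure'],
     ['Bond', 'cackled', 'insanely']]
cleaned = remove_proper_nouns(L)
```

On the example from the question this drops 'Where' as well as 'James' and 'Bond', because 'where' never appears in a non-initial position.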

def flatten(L):
    newList = []
    for sublist in L:
        for elm in sublist:
            newList.append(elm)
    return newList
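For comparison, the same one-level flattening can be written more compactly. This is a sketch of two common alternatives, a nested list comprehension and itertools.chain.from_iterable from the standard library; the function names flatten_comprehension and flatten_chain are made up here.

```python
from itertools import chain


def flatten_comprehension(nested):
    # Reads in the same order as the two nested for loops.
    return [elm for sublist in nested for elm in sublist]


def flatten_chain(nested):
    # chain.from_iterable lazily concatenates the sublists.
    return list(chain.from_iterable(nested))


example = [['The', 'name'], ['is', 'James']]
flat_a = flatten_comprehension(example)
flat_b = flatten_chain(example)
```

Both handle exactly one level of nesting, just like the explicit loops.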

The flatten function can be turned into flattenAndFilter(L) by checking each element:

PN = ['James', 'Bond']

def flattenAndFilter(L):
    newList = []
    for sublist in L:
        for elm in sublist:
            if elm not in PN:
                newList.append(elm)
    return newList

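You probably will not have such a nice PN list, so you will have to extend the check, for example by parsing the sentences and checking the POS tags.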
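One way to structure that extended check is to pass the tagger in as a function, so the filter itself does not care where the tags come from. The sketch below is in Python 3 and assumes Penn Treebank-style tags, where NNP and NNPS mark proper nouns. The names flatten_and_filter_tagged and toy_tagger are hypothetical, and toy_tagger is only a hand-written stand-in so the example runs without extra libraries. A real tagger that returns (word, tag) pairs, such as nltk.pos_tag with NLTK and its tagger data installed, could be passed in its place.

```python
def flatten_and_filter_tagged(sentences, tag_sentence,
                              proper_tags=("NNP", "NNPS")):
    """Flatten sentences, skipping words whose tag marks a proper noun."""
    kept = []
    for sentence in sentences:
        # tag_sentence must return one (word, tag) pair per word.
        for word, tag in tag_sentence(sentence):
            if tag not in proper_tags:
                kept.append(word)
    return kept


def toy_tagger(sentence):
    # Hypothetical stand-in for a real POS tagger: it only knows two names
    # and tags everything else with a placeholder.
    names = {"James", "Bond"}
    return [(w, "NNP" if w in names else "X") for w in sentence]


L = [['The', 'name', 'is', 'James'],
     ['Where', 'is', 'the', 'treasure'],
     ['Bond', 'cackled', 'insanely']]
filtered = flatten_and_filter_tagged(L, toy_tagger)
```

Tagging whole sentences rather than single words matters with a real tagger, since it uses the surrounding words to decide each tag.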

Getting rid of proper nouns in a nested list python
