Question

I have a large amount of movie dialogue data: the Cornell Movie-Dialogs Corpus. I am trying to convert it to lowercase:

    import re

    def clean_text(text):
        text = text.lower()
        text = re.sub(r"i'm", "i am", text)
        text = re.sub(r"he's", "he is", text)
        text = re.sub(r"she's", "she is", text)
        text = re.sub(r"that's", "that is", text)
        text = re.sub(r"what's", "what is", text)
        text = re.sub(r"where's", "where is", text)
        # Leading space in the replacement keeps "you're" from becoming "youare".
        text = re.sub(r"'re", " are", text)
        text = re.sub(r"'d", " would", text)
        text = re.sub(r"won't", "will not", text)
        text = re.sub(r"can't", "cannot", text)
        # The hyphen stays first in the class so it is literal; "=-|" would be
        # read as a character range that also matches every letter.
        text = re.sub(r"[-()\"#/@;:<>{}+=|.?,]", "", text)
        text = re.sub(r"'ll", " will", text)
        text = re.sub(r"'ve", " have", text)
        return text

#cleaning the questions
clean_questions = []
for question in questions:
    clean_questions.append(clean_text(question))

I get the following traceback:

Traceback (most recent call last):

  File "<ipython-input-17-4733a5502cb3>", line 3, in <module>
    clean_questions.append(clean_text(question))

  File "<ipython-input-16-a9a9890808b2>", line 2, in clean_text
    text = text.lower()

AttributeError: 'list' object has no attribute 'lower'

Any suggestions about what I am doing wrong and how to fix it would be greatly appreciated. Thanks!

Answer 1

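str.lower() does not work on a whole list, only on individual strings. You can fix this by lowercasing each element and storing the result back:

for i in range(len(text)): text[i] = text[i].lower()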
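
One detail worth spelling out: strings are immutable, so lower() returns a new string and the result has to be assigned back for the list to change. A minimal sketch, using a made-up `words` list rather than the actual data from the question:

```python
# Hypothetical sample data, not taken from the question.
words = ["Hello", "WORLD"]

# lower() returns a new string; the list element itself is untouched.
words[0].lower()
print(words)  # ['Hello', 'WORLD']

# Assigning the result back updates the list in place.
for i in range(len(words)):
    words[i] = words[i].lower()
print(words)  # ['hello', 'world']
```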

Answer 2

lower() is a string method and does not work on lists. Loop over the list and call lower() on each element.
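
As a rough sketch of that loop, a list comprehension does the job in one line. The helper name `lower_all` and the sample list are made up for illustration, and the sketch assumes every element is a string:

```python
def lower_all(items):
    """Return a new list with every string lowercased (assumes all items are str)."""
    return [item.lower() for item in items]

# Hypothetical sample data, not taken from the question.
sample = ["Hi There", "WHAT IS UP"]
lowered = lower_all(sample)
print(lowered)  # ['hi there', 'what is up']
```

If each entry in `questions` is itself a list of strings, the same helper could be applied to each entry in turn.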

Answer 3

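I have not seen your questions list, so this is partly a guess:

It depends on the list. If it is a nested list (which I suspect, given the error), for example: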

l=[['ABC','XYZ'],'BlA']

your code will not work, because some of the elements are themselves lists. So flatten it first:

x=[]
for i in l:
    if type(i) is list:
        x.extend(i)
    else:
        x.append(i)

Now:

print(x)

prints:

['ABC', 'XYZ', 'BlA']

Your code will then work on x. On the other hand, a list like:

['A','B','C']

contains only strings, so your code will work on it as expected.
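
The flattening loop above only handles one level of nesting. If the lists might be nested more deeply, a recursive generator is one option. This is only a sketch under that assumption; the `flatten` helper and the sample data are made up for illustration:

```python
def flatten(items):
    """Yield every non-list element from an arbitrarily nested list."""
    for item in items:
        if isinstance(item, list):
            # Recurse into sublists of any depth.
            yield from flatten(item)
        else:
            yield item

# Hypothetical sample data with two levels of nesting.
nested = [['ABC', ['XYZ']], 'BlA']
flat = [s.lower() for s in flatten(nested)]
print(flat)  # ['abc', 'xyz', 'bla']
```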
