Question

I found this Python code, which performs a stemming search on a text file.

import nltk
import string
from collections import Counter


def get_tokens():
    with open('/Users/MYUSERNAME/Desktop/Test_sp500/A_09.txt', 'r') as shakes:
        text = shakes.read()
        lowers = text.lower()
        no_punctuation = lowers.translate(None,string.punctuation)
        tokens = nltk.word_tokenize(no_punctuation)
        return tokens


tokens = get_tokens()
count = Counter(tokens)
print(count.most_common(10))

from nltk.corpus import stopwords

tokens = get_tokens()
filtered = [w for w in tokens if not w in stopwords.words('english')]
count = Counter(filtered)
print(count.most_common(100))

from nltk.stem.porter import *


def stem_tokens(tokens, stemmer):
    stemmed = []
    for item in tokens:
        stemmed.append(stemmer.stem(item))
    return stemmed


stemmer = PorterStemmer()
stemmed = stem_tokens(filtered, stemmer)
count = Counter(stemmed)
print(count.most_common(100))

When I try to run the program, I get the following error:

Traceback (most recent call last):
  File "/Users/MYUSERNAME/Desktop/stemmer.py", line 15, in <module>
    tokens = get_tokens()
  File "/Users/MYUSERNAME/Desktop/stemmer.py", line 10, in get_tokens
    no_punctuation = lowers.translate(None,string.punctuation)
TypeError: translate() takes exactly one argument (2 given)

Now my questions are:

How can I fix this?
Once the program runs, how can I run the script not just on one .txt file but on all .txt files in a given directory?

Note: I don't usually have to program, so I only know the absolute basics of Python.

Answer 1

I assume you are using Python version >= 3.

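In Python 2.7, str.translate accepts two arguments (a table and an optional string of characters to delete); in Python 3 and later it takes only one argument, the table. Essentially, that is why you are getting the error.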

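The None in your call is the Python 2.7 way of passing no translation table: with it, translate leaves every character unmapped and simply deletes the characters listed in the second argument, here string.punctuation. Python 3 dropped that two-argument form.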

Instead, you need to make a translation table and then pass it to the translate function.
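A minimal sketch of that fix for Python 3, assuming the goal is the same as in the question (lowercase the text and delete ASCII punctuation). The helper name strip_punctuation is made up for illustration; str.maketrans and string.punctuation are from the standard library.

```python
import string


def strip_punctuation(text):
    """Lowercase the text and delete every ASCII punctuation character."""
    # With three arguments, str.maketrans maps each character of the third
    # argument to None, which tells translate() to delete it.
    table = str.maketrans('', '', string.punctuation)
    return text.lower().translate(table)


print(strip_punctuation("Hello, World! It's 9 a.m."))
```

In the question's get_tokens function, the failing line would then become no_punctuation = lowers.translate(str.maketrans('', '', string.punctuation)).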

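For the second part of the question (running over every .txt file in a directory), one possible approach is to loop over the files with pathlib. This is only a sketch: the function names are hypothetical, and plain str.split is used as a stand-in for nltk.word_tokenize so that the example runs without NLTK installed. Swapping in the question's tokenizer, stop-word filter and stemmer should be straightforward.

```python
import string
from collections import Counter
from pathlib import Path


def count_words_in_file(path):
    """Return a Counter of lowercase, punctuation-free words in one file."""
    table = str.maketrans('', '', string.punctuation)
    # errors='ignore' skips bytes that are not valid UTF-8 (an assumption
    # about the input files; adjust the encoding if yours differ).
    text = Path(path).read_text(encoding='utf-8', errors='ignore')
    # Assumption: whitespace splitting stands in for nltk.word_tokenize.
    return Counter(text.lower().translate(table).split())


def count_words_in_directory(directory):
    """Map each .txt file name in the directory to its word Counter."""
    return {
        path.name: count_words_in_file(path)
        for path in sorted(Path(directory).glob('*.txt'))
    }
```

Calling count_words_in_directory on the folder that holds the files returns one Counter per file, so something like counts['A_09.txt'].most_common(10) would give the top ten words for that file.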
