Defined function got an unexpected keyword argument

Time: 2017-01-07 21:31:13

Tags: python python-2.7

I have a problem with this line:

processed = process(cleaned, lemmatizer=nltk.stem.wordnet.WordNetLemmatizer());

Why does it raise an "unexpected keyword argument" error?

Error: TypeError: process() got an unexpected keyword argument 'lemmatizer'

Here is my code:

def process(text, filters=nltk.corpus.stopwords.words('english')):
    """ Normalizes case and handles punctuation
    Inputs:
        text: str: raw text
        lemmatizer: an instance of a class implementing the lemmatize() method
                    (the default argument is of type nltk.stem.wordnet.WordNetLemmatizer)
    Outputs:
        list(str): tokenized text
    """
    lemmatizer = nltk.stem.wordnet.WordNetLemmatizer()
    word_list = nltk.word_tokenize(text)

    lemma_list = []
    for i in word_list:
        if i not in filters:
            try:
                lemma = lemmatizer.lemmatize(i)
                lemma_list.append(str(lemma))
            except:
                pass
    return " ".join(lemma_list)


if __name__ == '__main__':
    # construct filter for processor
    file = open("accountant.txt").read().lower()
    filters = set(nltk.word_tokenize(file))
    filters.update(nltk.corpus.stopwords.words('english'))
    filters = list(filters)

    # webcrawling
    dataJobs = pd.read_csv("test.csv")
    webContent = []
    for i in dataJobs["url"]:
        content = webCrawl(i)
        webContent.append(content)

    # clean the crawled text
    cleaned_list = []
    for j in webContent:
        cleaned = extractUseful(j)
        processed = process(cleaned, lemmatizer=nltk.stem.wordnet.WordNetLemmatizer())
        cleaned_list.append(processed)

    # save to csv
    contents = pd.DataFrame({"Content": webContent, "Cleaned": cleaned_list})
    contents.to_csv("testwebcrawled.csv")

    dataJobs[['jd']] = cleaned_list
    dataJobs.to_csv("test_v2_crawled.csv")

1 Answer:

Answer 0 (score: 0)

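The function signature of process (the def process(...) line) defines only one keyword argument, filters, so process() rejects any other keyword such as lemmatizer. If the value you want to pass is the filter, pass it under that name. It has to be a collection of words, for example the filters list built in your main block:

processed = process(cleaned, filters=filters)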
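The error does not depend on nltk at all. As a minimal sketch (the function body and the sample strings below are placeholders, not the asker's code), any function called with a keyword that its signature does not declare fails in the same way:

```python
def process(text, filters=("the", "a")):
    """Toy stand-in for the asker's function: drop words found in `filters`."""
    return " ".join(w for w in text.split() if w not in filters)


def call_with_unknown_keyword():
    """Call process() with an undeclared keyword and return the error text."""
    try:
        process("the quick fox", lemmatizer=object())
    except TypeError as exc:
        return str(exc)
    return None


# The undeclared keyword is rejected before the function body ever runs.
message = call_with_unknown_keyword()
print(message)

# A declared keyword is accepted.
ok = process("the quick fox", filters=["quick"])
print(ok)
```

The first call never reaches the function body, which is why the lemmatizer line inside process() makes no difference to the error.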

If you also want to be able to pass a lemmatizer, change your function signature to:

def process(text, 
            filters=nltk.corpus.stopwords.words('english'),
            lemmatizer=nltk.stem.wordnet.WordNetLemmatizer()):
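Two details are worth keeping in mind with this signature. Default values are evaluated once, when the def statement runs, and the line inside the body that reassigns lemmatizer would silently override whatever the caller passes, so it has to go. One possible way to handle both is sketched below, using None as a sentinel. To keep the sketch runnable without nltk data, DEFAULT_STOPWORDS, SuffixLemmatizer and the whitespace tokenization are hypothetical stand-ins for nltk.corpus.stopwords, WordNetLemmatizer and nltk.word_tokenize; they are assumptions, not part of the original code.

```python
# Hypothetical stand-in for nltk.corpus.stopwords.words('english').
DEFAULT_STOPWORDS = frozenset({"the", "a", "an", "of"})


class SuffixLemmatizer(object):
    """Hypothetical stand-in exposing lemmatize(), like WordNetLemmatizer."""

    def lemmatize(self, word):
        # Naive rule for illustration only: strip one trailing "s".
        return word[:-1] if word.endswith("s") and len(word) > 3 else word


def process(text, filters=None, lemmatizer=None):
    """Tokenize on whitespace, drop filtered words, lemmatize the rest."""
    # Resolve defaults at call time rather than at definition time.
    if filters is None:
        filters = DEFAULT_STOPWORDS
    if lemmatizer is None:
        lemmatizer = SuffixLemmatizer()
    # No `lemmatizer = ...` reassignment here, so the caller's object is used.
    kept = [lemmatizer.lemmatize(w) for w in text.lower().split() if w not in filters]
    return " ".join(kept)


default_result = process("The cats of Rome")
custom_result = process("The cats of Rome",
                        filters={"rome"},
                        lemmatizer=SuffixLemmatizer())
```

With the real nltk objects the same shape should apply: swap the stand-ins for the nltk calls inside the two `is None` branches.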

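Note, however, that you only need the = and the value after it in the function signature if you want those values to be the defaults for these parameters. Otherwise, you can do: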

def process(text, filters, lemmatizer):
    ...

and call it like this:

processed = process(cleaned,
                    filters=nltk.corpus.stopwords.words('english'),
                    lemmatizer=nltk.stem.wordnet.WordNetLemmatizer())
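The trade-off of dropping the defaults is that every call must then supply both arguments. A small sketch of how to check which names a function accepts and which are required, assuming Python 3 (inspect.signature is not available on Python 2.7) and using the parameter name filters from the original function; the function body is a placeholder:

```python
import inspect


def process(text, filters, lemmatizer):
    """Placeholder body; only the signature matters for this sketch."""
    return text


# Names the function accepts, in declaration order.
accepted = list(inspect.signature(process).parameters)

# Parameters with no default must be supplied on every call.
required = [
    name
    for name, param in inspect.signature(process).parameters.items()
    if param.default is inspect.Parameter.empty
]

try:
    process("some text")  # filters and lemmatizer are missing
    missing_error = None
except TypeError as exc:
    missing_error = str(exc)

print(accepted)
print(required)
print(missing_error)
```

Any keyword outside the accepted list produces the "unexpected keyword argument" error from the question, and leaving out a required name produces a different TypeError about missing arguments.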