
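I'm a beginner and was wondering if someone could help me out. I have a large text dataset (over 20 GB) and need to lemmatize one of its columns. My current function, which I use directly with pandas, is: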

from nltk.stem import WordNetLemmatizer

# Needs the WordNet corpus, e.g. via nltk.download('wordnet')
wnl = WordNetLemmatizer()

def lemmatizing(sentence):
    # Lemmatize each whitespace-separated word and rejoin with single spaces
    return " ".join(wnl.lemmatize(word) for word in sentence.split())

which I would normally apply like this:

df['news_content'] = df['news_content'].apply(lemmatizing)

I was looking at dask.delayed, but I'm confused about how to implement it.

Any help would be greatly appreciated.

I want to lemmatize a Dask DataFrame, but I'm stuck

0 answers: