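How do I remove common and rare words from a pandas DataFrame?

Asked: 2018-09-20 05:50:39

Tags: python pandas text-analysis

How do I remove the most common words from the DataFrame df3? My code below doesn't seem to work...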

import numpy as np
import pandas as pd

# c3 is not defined in the question; it is assumed to be a list of text strings.
df3 = pd.DataFrame(np.array(c3), columns=["content"]).drop_duplicates()

def text_processing_cat3(df3):
    # === Removal of common words ===
    # Count every word across all rows and take the 10 most frequent.
    freq = pd.Series(' '.join(df3['content']).split()).value_counts()[:10]
    freq = list(freq.index)
    df3['content'] = df3['content'].apply(
        lambda x: " ".join(w for w in x.split() if w not in freq))

    # === Removal of rare words ===
    # Recount after the common words are gone and take the 10 least frequent.
    freq = pd.Series(' '.join(df3['content']).split()).value_counts()[-10:]
    freq = list(freq.index)
    df3['content'] = df3['content'].apply(
        lambda x: " ".join(w for w in x.split() if w not in freq))

    return df3

print(text_processing_cat3(df3))

Please help me check and improve the code above. Thanks!
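
For reference, here is a minimal sketch of one way the same idea could be written, assuming every value in the "content" column is a string. The function name remove_common_and_rare and its parameters n_common and n_rare are hypothetical, chosen only for this illustration. It counts words once with collections.Counter, gathers the words to drop into a set, and returns a new DataFrame rather than modifying the input.

```python
from collections import Counter

import pandas as pd


def remove_common_and_rare(df, column="content", n_common=10, n_rare=10):
    """Return a copy of df with the most and least frequent words removed.

    Hypothetical helper for illustration; assumes df[column] holds only strings.
    """
    # Split each row into a list of whitespace-separated words.
    tokens = df[column].str.split()

    # Count every word across all rows.
    counts = Counter(word for row in tokens for word in row)

    # Words ordered from most to least frequent.
    ranked = [word for word, _ in counts.most_common()]

    # Guard n_rare == 0, because ranked[-0:] would select the whole list.
    rare = ranked[-n_rare:] if n_rare else []
    to_drop = set(ranked[:n_common]) | set(rare)

    # Work on a copy so the caller's DataFrame is left untouched.
    out = df.copy()
    out[column] = tokens.apply(
        lambda words: " ".join(w for w in words if w not in to_drop)
    )
    return out
```

Note that this sketch picks the common and the rare words from a single count, whereas the code in the question recounts after removing the common words, so the two can drop slightly different rare words. When several words share the same count, which of them land in the top or bottom n is arbitrary in both versions.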

0 answers:

No answers yet