Wordcloud for a non-English corpus

Time: 2017-03-23 17:19:24

Tags: python-3.x utf-8 nlp jupyter-notebook word-cloud

Wordcloud for non-English text

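Dear friends, I am having trouble generating a correct wordcloud for non-English text. The cloud is generated, but the result is unsatisfactory: it shows only individual characters, whereas I need it to show proper words. I used the following code to generate the wordcloud.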

from os import path
import matplotlib.pyplot as plt
import random
import unicodedata
from wordcloud import WordCloud, STOPWORDS

# scorpus: the non-English corpus as a single string (defined elsewhere)
text = scorpus
wordcloud = WordCloud(font_path='MBKhursheed.ttf',
                      relative_scaling=1.0,
                      stopwords=sw  # sw: collection of stop words (defined elsewhere)
                      ).generate(text)
plt.imshow(wordcloud)
plt.axis("off")
plt.show()

1 answer:

Answer 0: (score: 0)

First, you need to import these two (you may have to install them first):

from arabic_reshaper import arabic_reshaper
from bidi.algorithm import get_display

Then use them as follows:

text = get_display(arabic_reshaper.reshape(text))
wordcloud = WordCloud(font_path='MBKhursheed.ttf',
                      relative_scaling = 1.0,
                      stopwords = sw
                      ).generate(text)
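
The same idea can be wrapped in a small helper so that only right-to-left text is transformed. This is a minimal sketch, not part of the original answer: the function names contains_rtl and prepare_for_wordcloud are hypothetical, and it assumes the two packages are installed from PyPI as arabic-reshaper and python-bidi. The check relies on the standard library's unicodedata.bidirectional, which reports class "AL" for Arabic-script letters and "R" for other right-to-left letters such as Hebrew.

```python
import unicodedata


def contains_rtl(text):
    """Return True if any character has a right-to-left bidi class."""
    # "AL" = Arabic-script letters (Arabic, Urdu, Persian, ...),
    # "R"  = other right-to-left letters (for example Hebrew).
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


def prepare_for_wordcloud(text):
    """Reshape and reorder right-to-left text; return other text unchanged.

    Hypothetical helper: it applies the reshape + get_display step from
    the answer above only when the text actually needs it.
    """
    if not contains_rtl(text):
        return text
    # Imported lazily so that left-to-right text works even when the
    # third-party packages are not installed.
    import arabic_reshaper
    from bidi.algorithm import get_display
    # reshape() joins letters into their contextual forms;
    # get_display() puts them into visual (display) order.
    return get_display(arabic_reshaper.reshape(text))
```

With this in place, the call from the answer becomes text = prepare_for_wordcloud(scorpus). Because reshaping changes the underlying characters, the entries in sw will most likely need to go through the same function before they can match words in the transformed text; that is an assumption worth checking against your own stop-word list.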