I am running this program to find the distribution of characters in a particular text.
# this is a paragraph from python documentation :)
mytext = 'When a letter is first k encountered, it is missing from the mapping, so the default_factory function calls int() to supply a default count of zero. The increment operation then builds up the count for each letter.The function int() which always returns zero is just a special case of constant functions. A faster and more flexible way to create constant functions is to use a lambda function which can supply any constant value (not just zero):'
d = dict()
ignorelist = ('(',')',' ', ',', '.', ':', '_')
for n in mytext:
    if n not in ignorelist:
        n = n.lower()
        if n in d:
            d[n] = d[n] + 1
        else:
            d[n] = 1
xx = list(d.keys())
yy = list(d.values())
import matplotlib.pyplot as plt
plt.scatter(xx,yy, marker = '*')
plt.show()
Answer 0 (score: 4)
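Note that this will be fixed as of matplotlib version 2.2.
You seem to have found a bug in the new categorical feature of matplotlib 2.1. For single-letter categories, it apparently limits its functionality to 10 items. If the categories consist of more letters, this does not happen.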
In any case, the solution is to plot numerical values (just as was necessary before matplotlib 2.1 anyway) and then set the ticklabels to the categories.
# this is a paragraph from python documentation :)
mytext = 'When a letter is first k encountered, it is missing from the mapping, so the default_factory function calls int() to supply a default count of zero. The increment operation then builds up the count for each letter.The function int() which always returns zero is just a special case of constant functions. A faster and more flexible way to create constant functions is to use a lambda function which can supply any constant value (not just zero):'
d = dict()
ignorelist = ('(',')',' ', ',', '.', ':', '_')
for n in mytext:
    if n not in ignorelist:
        n = n.lower()
        if n in d:
            d[n] = d[n] + 1
        else:
            d[n] = 1
xx,yy = zip(*d.items())
import numpy as np
import matplotlib.pyplot as plt
xx_sorted, order = np.unique(xx, return_inverse=True)
plt.scatter(order,yy, marker="o")
plt.xticks(range(len(xx)), xx_sorted)
plt.show()
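
The same workaround can probably be written a bit more compactly. The following is only a sketch, not part of the original answer: it assumes `collections.Counter` is acceptable for the counting step, and the names `IGNORED`, `letter_counts` and `plot_counts` are made up for illustration. It still plots integer positions and sets the letters as tick labels, so it should not rely on the categorical axis at all.

```python
from collections import Counter

# Hypothetical constant: the same characters as `ignorelist` above.
IGNORED = set("() ,.:_")


def letter_counts(text, ignored=IGNORED):
    """Count case-insensitive occurrences of every character not in `ignored`."""
    return Counter(ch.lower() for ch in text if ch not in ignored)


def plot_counts(counts):
    """Scatter the counts at integer positions and label the ticks with the letters."""
    # Imported here so the counting helper works even without matplotlib installed.
    import matplotlib.pyplot as plt

    letters = sorted(counts)            # categories in alphabetical order
    positions = range(len(letters))     # numerical x values, one per category
    plt.scatter(positions, [counts[c] for c in letters], marker="o")
    plt.xticks(positions, letters)      # show the letters as tick labels
    plt.show()
```

With the `mytext` string from the question, the plot would be produced by calling `plot_counts(letter_counts(mytext))`.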