Generating a text-based histogram

Time: 2018-08-01 07:49:00

Tags: python dictionary histogram

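I currently have some code that prints the frequency of each word in a file. How can I modify it to produce a histogram that shows each word's share of the total as a percentage?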

use Flow

def run([stream]) do
  task = Task.async(fn ->

    specs = [{{ProdCon,[]},[]}]
    consumer = [{{Consumer,[]},[]}]

    stream
    |> Flow.from_enumerable()
    |> Flow.through_specs(specs)
    |> Flow.into_specs(consumer)

  end)

  case Task.yield(task, 3_600_000) do # wait 1 hour (the timeout is given in milliseconds)
    {:ok, result} -> result
    nil -> IO.puts("Failed to get a result :(")
  end
end

2 answers:

Answer 0: (score: 0)

A naive approach using a dict comprehension and simple division:

conda uninstall pandas

conda install pandas
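The approach named in this answer, a dict comprehension with simple division, could be sketched roughly as follows. The sample `text` is a made-up stand-in for the file contents, and the variable names are assumptions, not taken from the original answer.

```python
from collections import Counter

# Hypothetical sample text standing in for the contents of the file.
text = "the cat and the dog and the bird"

counts = Counter(text.split())   # word -> number of occurrences
total = sum(counts.values())     # total number of words

# Dict comprehension: divide each count by the total to get a percentage.
percentages = {word: 100 * n / total for word, n in counts.items()}

print(percentages)
```

Because every count is divided by the same total, the resulting values should add up to 100.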

Answer 1: (score: 0)

This script builds a dictionary like the one you created, but with percentages as the values instead of word counts. Hope this helps :)

from collections import Counter

# Read the whole file; the with-block closes it afterwards
with open('test.txt') as f:
    data = f.read()

# Uppercase the letters and replace every non-letter (punctuation, digits) with a space
data = ''.join(i.upper() if i.isalpha() else ' ' for i in data)
c = Counter(data.split())   # count the words
print(c)

word_sum = sum(c.values())  # total number of words in the file

# Map each word to its share of all words, as a percentage
percent_dict = {}
for k, v in c.items():
    percent_dict[k] = (100 * v) / word_sum

print(percent_dict)
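The script above stops at the percentage dictionary, while the question asks for a text-based histogram. One possible way to render such a dictionary as text is sketched below. The helper `text_histogram`, its `width` parameter, and the sample words are hypothetical and not part of the original answer.

```python
from collections import Counter


def text_histogram(percent_dict, width=50):
    """Return one line per word: the word, a bar of '#' characters, and its percentage."""
    if not percent_dict:
        return []
    # Pad every label to the length of the longest word so the bars line up.
    label_width = max(len(word) for word in percent_dict)
    lines = []
    # Largest percentage first.
    for word, pct in sorted(percent_dict.items(), key=lambda kv: kv[1], reverse=True):
        # A bar of `width` characters would correspond to 100%.
        bar = "#" * round(pct * width / 100)
        lines.append(f"{word:<{label_width}} | {bar} {pct:.1f}%")
    return lines


# Hypothetical sample data standing in for the word counts from the file.
sample = Counter("A B A C A B".split())
total = sum(sample.values())
percent_dict = {w: 100 * n / total for w, n in sample.items()}

hist_lines = text_histogram(percent_dict, width=20)
print("\n".join(hist_lines))
```

Scaling the bars against a fixed `width` keeps the output within a known number of columns regardless of how many words the file contains.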