python人气云无法正常工作

时间:2010-10-23 13:29:21

标签: python tag-cloud

我构建了一个受欢迎的云,但它无法正常运行。 txt文件是;

1 Top Gear
3 Scrubs
3 The Office (US)
5 Heroes
5 How I Met Your Mother
5 Legend of the Seeker
5 Scrubs
.....

在我的受欢迎云中,名称会写入频率时间。例如,搜索者的传奇被写了5次并且他们的大小增加了。每个单词应该写一次,大小必须根据受欢迎程度(5)。但每一个字应该写一次,其大小必须根据其受欢迎程度。我该如何解决?

我的程序也应该提供这样的条件:

具有相同频率的术语通常以相同的颜色显示,例如高尔夫和空手道。不同的频率通常以不同的颜色显示,例如篮球,板球和曲棍球。在每个云输出的底部,用于显示云中值的颜色的频率/计数。

我的代码如下。

#!/usr/bin/python
import string

def main():
    # get the list of tags and their frequency from input file
    taglist = getTagListSortedByFrequency('tv.txt')
    # find max and min frequency
    ranges = getRanges(taglist)
    # write out results to output, tags are written out alphabetically
    # with size indicating the relative frequency of their occurence
    writeCloud(taglist, ranges, 'tv.html')

def getTagListSortedByFrequency(inputfile):
    inputf = open(inputfile, 'r')
    taglist = []
    while (True):
        line = inputf.readline()[:-1]
        if (line == ''):
            break
        (count, tag) = line.split(None, 1)
        taglist.append((tag, int(count)))
    inputf.close()
    # sort tagdict by count
    taglist.sort(lambda x, y: cmp(x[1], y[1]))
    return taglist

def getRanges(taglist):
    mincount = taglist[0][1]
    maxcount = taglist[len(taglist) - 1][1]
    distrib = (maxcount - mincount) / 4;
    index = mincount
    ranges = []
    while (index <= maxcount):
        range = (index, index + distrib-1)
        index = index + distrib
        ranges.append(range)
    return ranges

def writeCloud(taglist, ranges, outputfile):
    outputf = open(outputfile, 'w')
    outputf.write("<style type=\"text/css\">\n")
    outputf.write(".smallestTag {font-size: xx-small;}\n")
    outputf.write(".smallTag {font-size: small;}\n")
    outputf.write(".mediumTag {font-size: medium;}\n")
    outputf.write(".largeTag {font-size: large;}\n")
    outputf.write(".largestTag {font-size: xx-large;}\n")
    outputf.write("</style>\n")
    rangeStyle = ["smallestTag", "smallTag", "mediumTag", "largeTag", "largestTag"]
    # resort the tags alphabetically
    taglist.sort(lambda x, y: cmp(x[0], y[0]))
    for tag in taglist:
        rangeIndex = 0
        for range in ranges:
            url = "http://www.google.com/search?q=" + tag[0].replace(' ', '+') + "+site%3Asujitpal.blogspot.com"
            if (tag[1] >= range[0] and tag[1] <= range[1]):
                outputf.write("<span class=\"" + rangeStyle[rangeIndex] + "\"><a href=\"" + url + "\">" + tag[0] + "</a></span> ")
                break
            rangeIndex = rangeIndex + 1
    outputf.close()

if __name__ == "__main__":
    main()

1 个答案:

答案 0 :(得分:0)

我不确定这可以按颜色分类,但只需要4行代码就可以在运行代码的目录中生成标签云。 https://github.com/atizo/PyTagCloud