Iterating through a .txt file in an odd way

Time: 2016-04-25 00:15:10

Tags: list python-3.x dictionary iteration

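What I am trying to do is write a program that opens a .txt file of movie reviews, where each review starts with a rating from 0 to 4 followed by a short review of the movie. The program then prompts the user to open a second text file containing words that will be matched against the reviews and given a numeric value based on those reviews.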

For example, here are two sample reviews as they would appear in the .txt file:

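4 This is a comedy-drama of nearly epic proportions, rooted in a title character undergoing a midlife crisis.
2 Massoud's story is an epic, but also a tragedy, the record of a tenacious, humane fighter who was also the prisoner -LRB- and ultimately the victim -RRB- of history.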

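So if I were looking for the word "epic", it would increase the count for that word by 2 (which I have already figured out), since it appears twice, and then append the values 4 and 2 to a list of ratings for that word.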

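How do I append these to a list or dictionary associated with that word? Keep in mind that I need to create a new list or dictionary key for every word in the word list.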

Please and thank you. Sorry if this is poorly worded; programming is not my forte.

All of my code:

def menu_validate(prompt, min_val, max_val):
    """ produces a prompt, gets input, validates the input and returns a value. """
    while True:
        raw = input(prompt).strip()
        # check for quit before converting, because int() rejects "q" and "quit"
        if raw.lower() in ("quit", "q"):
            quit()
        try:
            menu = int(raw)
            if min_val <= menu <= max_val:
                return menu
            print("You must enter a number value from {} to {}.".format(min_val, max_val))
        except ValueError:
            print("You must enter a number value from {} to {}.".format(min_val, max_val))

def open_file(prompt):
    """ opens a file """
    while True:
        try:
            file_name = str(input(prompt))
            # add the extension only when the name does not already end with it
            if not file_name.endswith(".txt"):
                file_name += ".txt"
            return open(file_name, 'r')
        except FileNotFoundError:
            print("You must enter a valid file name. Make sure the file you would like to open is in this program's root folder.")

def make_list(file):
    lst = []
    for line in file:
        lst2 = line.split(' ')
        del lst2[-1]
        lst.append(lst2)
    return lst

def rating_list(lst):
    '''iterates through a list of lists and appends the first value in each list to a second list'''
    # local names chosen so they do not shadow the built-in list or this function
    ratings = []
    for review in lst:
        ratings.append(review[0])
    return ratings

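def word_cnt(lst, word : str):
    '''counts how many times word appears across all reviews in lst'''
    cnt = 0
    for review in lst:
        for token in review:
            # only count tokens that match the word being searched for
            if token == word:
                cnt += 1
    return cnt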

def words_list(file):
    lst = []
    for word in file:
        # strip the trailing newline so the word can match review tokens
        lst.append(word.strip())
    return lst

##def sort(words, occurrences, avg_scores, std_dev):
##    '''sorts and prints the output'''
##    menu = menu_validate("You must choose one of the valid choices of 1, 2, 3, 4 \n        Sort Options\n    1. Sort by Avg Ascending\n    2. Sort by Avg Descending\n    3. Sort by Std Deviation Ascending\n    4. Sort by Std Deviation Descending", 1, 4)
##    print ("{}{}{}{}\n{}".format("Word", "Occurence", "Avg. Score", "Std. Dev.", "="*51))
##    if menu == 1:
##        for i in range (len(word_list)):
##            print ("{}{}{}{}".format(cnt_list.sorted[i],)

from collections import OrderedDict

def make_odict(lst1, lst2):
    '''makes an ordered dictionary of keys/values from 2 lists of equal length'''

    dic = OrderedDict()

    # keys come from lst1, values from lst2
    for key, value in zip(lst1, lst2):
        dic[key] = value

    return dic


cnt_list = []
while True:
    menu = menu_validate("1. Get sentiment for all words in a file? \nQ. Quit \n", 1, 1)
    if menu == 1:
        ratings_file = open("sample.txt")
        ratings_list = make_list(ratings_file)


        word_file = open_file("Enter the name of the file with words to score \n")
        word_list = words_list(word_file)
        for word in word_list:
            # count once and reuse the result
            cnt = word_cnt(ratings_list, word)
            cnt_list.append(cnt)

Sorry, I know it's messy and very incomplete.

1 Answer:

Answer 0: (score: 1)

I think you mean something like this:

import collections

counts = collections.defaultdict(int)

word = 'epic'

counts[word] += 1

Obviously, you can do more with word than I did, but you haven't shown us any code, so...
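Since the question also asks for a list of ratings per word, the same idea can probably be extended with a defaultdict(list). This is only a minimal sketch: the reviews and words data are made-up samples, and the names reviews, words and ratings_by_word are hypothetical, not taken from your program.

```python
import collections

# Hypothetical sample data: (rating, review text) pairs, not read from a real file.
reviews = [
    (4, "a comedy-drama of nearly epic proportions"),
    (2, "the story is an epic , but also a tragedy"),
]
words = ["epic", "tragedy"]

counts = collections.defaultdict(int)            # word -> total occurrences
ratings_by_word = collections.defaultdict(list)  # word -> ratings of matching reviews

for rating, text in reviews:
    tokens = text.split()
    for word in words:
        n = tokens.count(word)
        if n:
            counts[word] += n
            # one rating per matching review, however often the word appears in it
            ratings_by_word[word].append(rating)

print(dict(counts))
print(dict(ratings_by_word))
```

A defaultdict creates the empty list (or zero) the first time a key is used, so there is no need to set up a key for every word in advance.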

Edit

OK, looking at your code, I would suggest separating the rating from the text. Take this:

def make_list(file):
    lst = []
    for line in file:
        lst2 = line.split(' ')
        del lst2[-1]
        lst.append(lst2)
    return lst

and turn it into this:

def parse_ratings(file):
    """
    Given a file of lines, each with a numeric rating at the start,
    parse the lines into score/text tuples, one per line. Return the
    list of parsed tuples.
    """
    ratings = []
    for line in file:
        text = line.strip().split()
        if text:
            score = text[0]
            ratings.append((score,text[1:]))
    return ratings
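
Note that parse_ratings keeps each score as a string. If you later want averages, a variant that converts the score to an int may be handier. The sketch below is an assumption about what you need, not part of the original suggestion: parse_ratings_int is a hypothetical name, the sample lines are made up, and io.StringIO stands in for an opened review file.

```python
import io

def parse_ratings_int(file):
    """Variant of parse_ratings that stores each score as an int."""
    ratings = []
    for line in file:
        tokens = line.strip().split()
        if not tokens:
            continue  # skip blank lines
        try:
            score = int(tokens[0])
        except ValueError:
            continue  # skip lines that do not start with a number
        ratings.append((score, tokens[1:]))
    return ratings

# io.StringIO behaves like an open text file; these lines are made-up samples.
sample = io.StringIO(
    "4 a comedy-drama of nearly epic proportions .\n"
    "\n"
    "2 the story is an epic , but also a tragedy .\n"
)
parsed = parse_ratings_int(sample)
for score, text in parsed:
    print(score, text)
```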

Then you can compute both values together:

def match_reviews(word, ratings):
    cnt = 0
    scores = []

    for score,text in ratings:
        n = text.count(word)
        if n:
            cnt += n
            scores.append(score)

    return (cnt, scores)
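
To get one entry per word in your word list, the same loop can be run for every word and the results stored in a dictionary keyed by word. The following is a rough sketch under assumptions: summarize_words is a hypothetical helper, the ratings list is made-up data in the (score, tokens) shape that parse_ratings returns, and the average is included only because your commented-out sort function mentions it.

```python
import statistics

def summarize_words(words, ratings):
    """Map each word to its count, matched scores, and average score."""
    summary = {}
    for word in words:
        cnt = 0
        scores = []
        for score, text in ratings:
            n = text.count(word)
            if n:
                cnt += n
                scores.append(int(score))  # scores arrive as strings
        summary[word] = {
            "count": cnt,
            "scores": scores,
            # no average when the word never appears
            "average": statistics.mean(scores) if scores else None,
        }
    return summary

# Made-up data in the shape produced by parse_ratings: (score, list of tokens).
ratings = [
    ("4", ["a", "comedy-drama", "of", "nearly", "epic", "proportions"]),
    ("2", ["the", "story", "is", "an", "epic", ",", "but", "also", "a", "tragedy"]),
]
summary = summarize_words(["epic", "tragedy", "boring"], ratings)
for word, info in summary.items():
    print(word, info)
```

A plain dict is enough here because every word gets its entry inside the loop, including words that never match.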