Question

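I have a list that I use to find the most frequently used words in a .txt file. How can I send these to a CSV (or other Excel file), with the words in one column and the frequencies in another?

For example, here is the beginning of what Counter(my_list) returns: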

Counter({'the': 3317, 'to': 1845, 'and': 1812, 'a': 1580, ..., 'Harry': 1213, ..., 'he': 1034, 'in': 933, 'his': 895, ...

I want each word in one column, say A, and its count in B:

the | 3317
to  | 1845
and | 1812
a   | 1580

And so on. (Note that it is fine if the CSV ends up sorted alphabetically. I just want the data in there for analysis.)

Here is what I have so far:

def create_csv(my_list):
    with open(r'myCSV.csv', 'w', newline='') as my_CSV:
        fieldnames = ['word','count']
        writer = csv.writer(my_CSV)
        writer.writerow(fieldnames)
        for key, value in my_list.items():
            writer.writerow(list(key) + [value])

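This almost works, but each letter ends up in its own column, followed by the count.

What do I need to change so that the word stays together?

Edit: OK, here is the function I use to create the list. (my_file is the .txt file.)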

def unique_words():
    with open(my_file, encoding="utf8") as infile:
        for line in infile:
            words = line.split()
            for word in words:
                edited_word = clean_word(word)
                lst.append(edited_word)
                if edited_word not in lst:
                    lst.append(edited_word)     
    lst.sort()  
    return lst, cnt

And it is called with:

create_csv(Counter(lst))

Answer 1

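Don't call list(key): list() on a string splits it into separate characters, which is why each letter gets its own column. Putting the key in directly should work. Now, assuming the words on each line are separated by spaces: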

import csv

def Counter(my_file):
    # Note: this name shadows collections.Counter if both live in the same module.
    count = {}
    with open(my_file, encoding="utf-8") as infile:
        for line in infile:
            words = line.strip().split()
            for word in words:
                # Assuming clean_word is a function that strips full stops, commas, etc.
                edited_word = clean_word(word)
                count[edited_word] = count.get(edited_word, 0) + 1
    return count

def create_csv(my_list):
    with open(r'myCSV.csv', 'w', newline='') as my_CSV:
        fieldnames = ['word','count']
        writer = csv.writer(my_CSV)
        writer.writerow(fieldnames)
        # Iterate over the mapping that was passed in; writing [key, value]
        # keeps the whole word in a single cell.
        for key, value in my_list.items():
            writer.writerow([key, value])
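
If you would rather keep the collections.Counter from the question, here is a minimal sketch of the same fix. The function name write_counts_csv, its alphabetical parameter, and the output path are my own placeholders and not part of the original code. It also shows why list(key) scattered the letters across columns.

```python
import csv
from collections import Counter


def write_counts_csv(counts, path, alphabetical=False):
    """Write (word, count) pairs from a Counter to a two-column CSV file."""
    # Counter.most_common() yields (word, count) tuples, highest count first;
    # sorted(counts.items()) orders the same pairs alphabetically by word.
    rows = sorted(counts.items()) if alphabetical else counts.most_common()
    with open(path, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(["word", "count"])
        # Each row is a 2-tuple, so the word stays in a single cell.
        writer.writerows(rows)


# list() on a string splits it into characters, which is what put
# every letter in its own column:
print(list("the") + [3317])  # ['t', 'h', 'e', 3317]
print(["the", 3317])         # ['the', 3317]
```

With the question's data this would be called as write_counts_csv(Counter(lst), 'myCSV.csv'). Pass alphabetical=True if you prefer the rows sorted by word instead of by frequency.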
