Counting the words from one file in another file in Python

Time: 2016-11-13 05:09:07

Tags: python

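I am trying to count how many times the words in one file appear in another file. I was pointed to the link below, which helped, but I still can't get the code to do what I want. Can anyone help me?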

https://codereview.stackexchange.com/questions/144074/program-to-count-vowels

def count_happyW(file):
    hap_count = 0
    for Hwords in file.readlines():
        line = file.readline()
        while line != "":
            for item in Hwords:
                if item in file:
                    count_happyW[item] += 1
                    return hap_count

I also tried:

line = file.readline()
total = 1 * [len(h_words) for line in file.readline()]
for token in file.readlines():   
    while line != "":     
        line = file.readline()     
        for item in h_words:         
            if item in file:              
                total = [1] * len(item)

2 answers:

Answer 0 (score: 0)

yourwords.txt contains the words you want to search for, separated by spaces. Mine contains:

  

apple orange bananna

yourfile.txt is the file you search in:

  

apple orange bananna

an apple on the orange tree

with open('yourwords.txt', 'r') as f1, open('yourfile.txt', 'r') as f2:
    # Start every search word at a count of zero.
    words = f1.read().split()
    wordcount = {i: 0 for i in words}
    for line in f2:
        # Count only whitespace-separated tokens that are search words.
        for word in line.split():
            if word in wordcount:
                wordcount[word] += 1

print(wordcount) 

Output:

  

{'bananna': 1, 'apple': 2, 'orange': 2}
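
The same tally can also be sketched with `collections.Counter`. The function name `count_listed_words` and the use of in-memory strings instead of files are assumptions made for illustration. Like the code above, it only matches whole whitespace-separated tokens, exactly as written.

```python
from collections import Counter


def count_listed_words(words_text, body_text):
    """Count how often each whitespace-separated word of words_text occurs in body_text."""
    wanted = words_text.split()
    # Tally every token in the body once, then look up only the wanted words.
    tally = Counter(body_text.split())
    # A Counter returns 0 for tokens it has never seen.
    return {word: tally[word] for word in wanted}


# Sample data for illustration only.
words_text = "apple orange bananna"
body_text = "apple orange bananna\nan apple on the orange tree\n"
print(count_listed_words(words_text, body_text))
```

To use it with real files, read each file with `f.read()` and pass the two resulting strings in.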

Answer 1 (score: 0)

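Based on your question, I assume you have two files. The first file contains the words you want to search for, one per line. The second file contains some text.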

File 1: (words.txt)

dog
cat
went

File 2: (story.txt)

Today my cat and dog ran out of my backyard.
This is not the first time my dog has ran away. 
Last time he went to the dog park and then went to my neighbors house.

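First, you need to create a dictionary with one key for each word in the words.txt file. You can think of each value as the number of times that word has been seen in the second file.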

wordDB = { 'dog': 0, 'cat': 0, 'went' : 0}

To do this dynamically, first create an empty dictionary, then loop over the lines of the words.txt file.

wordDB = {}
with open('words.txt', 'r') as wordFile:
    for line in wordFile:
        word = line.strip()  # Removes the trailing newline character
        if word and word not in wordDB:  # Skips blank lines and words already added
            wordDB[word] = 0  # Adds the word to the DB
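
A more compact variant of this step, as a sketch: `dict.fromkeys` builds the zeroed dictionary in one call and collapses duplicate words on its own. The helper name `build_word_db` is hypothetical, and it accepts any iterable of lines so it can be tried without a file on disk.

```python
def build_word_db(lines):
    """Return a dict mapping each non-empty, stripped line to a count of 0."""
    # strip() removes the trailing newline and any surrounding whitespace.
    words = (line.strip() for line in lines)
    # Blank lines are dropped; repeated words end up as a single key.
    return dict.fromkeys((w for w in words if w), 0)


# Sample lines standing in for the contents of words.txt.
print(build_word_db(["dog\n", "cat\n", "went\n", "dog\n"]))
```

An open file object is itself an iterable of lines, so it can be passed to `build_word_db` directly.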

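Now we need to open the second file and loop over each line in it. For each line, we check every key in wordDB and, if it appears in that line, increase its count by the number of occurrences.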

checkWordList = list(wordDB.keys())
with open('story.txt', 'r') as storyFile:
    for line in storyFile:
        wordList = line.replace('\n', '').split(' ')
        for eachWord in checkWordList:
            if eachWord in wordList:
                wordDB[eachWord] += wordList.count(eachWord)
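
Splitting on a single space leaves punctuation attached, so a token such as `dog.` would not match `dog`. Below is a minimal sketch of one way around that, assuming case-insensitive matching is wanted, that the keys are already lowercase, and that words consist of letters, digits, and apostrophes. The function name `count_in_lines` and the regular expression are illustrative choices, not part of the answer above.

```python
import re


def count_in_lines(word_db, lines):
    """Return a new dict with the count of each key of word_db across lines."""
    counts = dict.fromkeys(word_db, 0)
    for line in lines:
        # Lowercase the line and pull out runs of letters, digits, and apostrophes.
        for token in re.findall(r"[a-z0-9']+", line.lower()):
            if token in counts:
                counts[token] += 1
    return counts


# The story.txt sample, held in memory for illustration.
story = [
    "Today my cat and dog ran out of my backyard.\n",
    "This is not the first time my dog has ran away.\n",
    "Last time he went to the dog park and then went to my neighbors house.\n",
]
print(count_in_lines({"dog": 0, "cat": 0, "went": 0}, story))
```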

Now you just need to loop over checkWordList again and print the values stored in wordDB.

for eachWord in checkWordList:
    # Parenthesized so the line works in both Python 2 and Python 3
    print("%s : %s" % (eachWord, wordDB[eachWord]))

You will get the output:

went : 2
dog : 3
cat : 1