How do I count a specific word in a file with Python?

Time: 2016-07-15 16:34:27

Tags: python

I want to count a specific word in a file.

For example, how many times does 'apple' appear in the file? I tried this:

#!/usr/bin/env python
import re 

logfile = open("log_file", "r") 

wordcount={}
for word in logfile.read().split():
    if word not in wordcount:
        wordcount[word] = 1
    else:
        wordcount[word] += 1
for k,v in wordcount.items():
    print(k, v)

I replaced 'word' with 'apple', but it still counts every word in my file.

Any advice would be greatly appreciated. :)

6 answers:

Answer 0: (score: 13)

You can use str.count(), since you only care about occurrences of a single word:

with open("log_file") as f:
    contents = f.read()
    count = contents.count("apple")

However, str.count() matches substrings, so it would also count words such as "applejack". To avoid edge cases like that, I suggest using a regex:

import re

with open("log_file") as f:
    contents = f.read()
    count = sum(1 for match in re.finditer(r"\bapple\b", contents))
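
The \b in the regular expression ensures that the pattern starts and ends on a word boundary (as opposed to matching a substring inside a longer string).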
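
To make the difference between the two approaches concrete, here is a minimal sketch that runs on an in-memory string instead of a file. The sample text and the helper name count_whole_word are made up for illustration, and re.escape and re.IGNORECASE are optional extras that the answer above does not use.

```python
import re

def count_whole_word(text, word, ignore_case=False):
    """Count occurrences of `word` in `text` that stand alone as whole words."""
    flags = re.IGNORECASE if ignore_case else 0
    # re.escape guards against regex metacharacters in the word being searched
    pattern = r"\b" + re.escape(word) + r"\b"
    return sum(1 for _ in re.finditer(pattern, text, flags))

sample = "apple applejack pineapple Apple apple."  # hypothetical sample text

substring_count = sample.count("apple")               # also matches inside longer words
whole_word_count = count_whole_word(sample, "apple")  # whole words only, case-sensitive
any_case_count = count_whole_word(sample, "apple", ignore_case=True)

print(substring_count, whole_word_count, any_case_count)
```

Whether "Apple" should count as "apple" depends on your log file, so the case-insensitive flag is left off by default.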

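Answer 1: (score: 5)

If you only care about one word, you don't need to build a dictionary that tracks the count of every word. You can iterate over the file line by line and count the occurrences of the word you are interested in.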

#!/usr/bin/env python

wordcount = 0
my_word = "apple"
with open("log_file", "r") as logfile:
    for line in logfile:
        # count every occurrence on the line, not just whether the line contains the word
        wordcount += line.split().count(my_word)

print(my_word, wordcount)
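
The line-by-line idea can also be wrapped in a small function that accepts any iterable of lines, which makes it easy to try without a real file. This is only a sketch: count_word_in_lines is a hypothetical name, io.StringIO stands in for the open log file, and words are assumed to be separated by whitespace with no attached punctuation.

```python
import io

def count_word_in_lines(lines, word):
    """Return how many times `word` appears as a whitespace-separated token."""
    # line.split().count(word) tallies every occurrence on a line; a membership
    # test such as `word in line.split()` would add at most one per line
    return sum(line.split().count(word) for line in lines)

# io.StringIO behaves like an open text file; the contents are made up
fake_log = io.StringIO("apple pie\napple apple tart\nno fruit here\n")
total = count_word_in_lines(fake_log, "apple")
print("apple", total)
```

Because the generator reads one line at a time, the whole file never has to sit in memory at once.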

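However, if you still want to count all the words and print only the count for the word you are interested in, these small changes to your code should work: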

#!/usr/bin/env python

wordcount = {}
with open("log_file", "r") as logfile:
    for word in logfile.read().split():
        if word not in wordcount:
            wordcount[word] = 1
        else:
            wordcount[word] += 1
# print only the count for my_word instead of iterating over entire dictionary
my_word = "apple"
# .get avoids a KeyError when my_word never appears in the file
print(my_word, wordcount.get(my_word, 0))

Answer 2: (score: 1)

You can use a Counter dictionary for this:

from collections import Counter

with open("log_file", "r") as logfile:
    word_counts = Counter(logfile.read().split())

# indexing a Counter returns 0 for a missing word, whereas .get() would return None
print(word_counts['apple'])
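
Splitting on whitespace treats "apple," and "Apple" as different words from "apple". If that matters for your file, one possible variation is to normalize the text before counting. The sketch below assumes you want case-insensitive counts with punctuation ignored; count_words is a hypothetical helper and the sample text is made up.

```python
import re
from collections import Counter

def count_words(text):
    """Build a Counter of lowercased words, ignoring surrounding punctuation."""
    # \w+ grabs runs of letters, digits and underscores, so "Apple," becomes "apple"
    return Counter(re.findall(r"\w+", text.lower()))

sample = "Apple, apple! An apple a day."  # hypothetical sample text
word_counts = count_words(sample)

print(word_counts["apple"])
print(word_counts["banana"])  # a missing word gives 0 rather than raising KeyError
```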

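Answer 3: (score: 0)

Here is an example of counting a word in an array of words. I assume doing the same with the contents of a file would be very similar.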

def count(word, array):
    n = 0
    for x in array:
        if x == word:
            n += 1
    return n

text = 'apple orange kiwi apple orange grape kiwi apple apple'
ar = text.split()

print(count('apple', ar))
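
For what it's worth, Python lists already provide a count method that does the same tally as the loop above, so the helper function is optional. A short sketch using the same sample text (the variable names words and apple_count are just for illustration):

```python
text = 'apple orange kiwi apple orange grape kiwi apple apple'
words = text.split()

# list.count walks the list and tallies the elements equal to its argument,
# which is the same work the hand-written loop performs
apple_count = words.count('apple')
print(apple_count)
```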

Answer 4: (score: 0)

import { Component, OnInit } from '@angular/core';
import { BrxDelta } from '../model/BrxDelta';
import { ChangeDetectorRef } from '@angular/core';


@Component({
  selector: 'app-all-deltas',
  // templateUrl: 'all-deltas.component.html',
  template:`<kendo-splitter orientation="horizontal" style="height: 340px;">

              <ng-template ngFor let-brxDelta [ngForOf]="brxDeltas" let-i="index">

                <kendo-splitter-pane [collapsible]="true" [resizable]="true">
                  <h3>{{brxDelta.deltaName}}</h3>
                  <ng-container *ngTemplateOutlet="deltaPane; context:brxDelta"></ng-container>
                </kendo-splitter-pane>

              </ng-template>

            </kendo-splitter>

            <ng-template #deltaPane>

                <div class="pane-content">
                  <h3>{{deltaName}}</h3>
                </div>

            </ng-template>`,
  styleUrls: ['all-deltas.component.scss']
})
export class AllDeltasComponent implements OnInit {

  brxDeltas: BrxDelta[];

  constructor(private cdRef:ChangeDetectorRef) {
    this.brxDeltas = [
      new BrxDelta("DELTA1"),
      new BrxDelta("DELTA2"),
      new BrxDelta("DELTA3")
    ];
  }

  ngOnInit() {
    console.log("BrxTableListComponent init");
    console.log( this.brxDeltas );
  }


  ngAfterViewChecked()
  {
    console.log( "! Detect changes !" );
    this.cdRef.detectChanges();
  }
}

Answer 5: (score: -2)

cash = 0
visa = 0
amex = 0
with open("text.txt", "r") as fi:
    for line in fi:
        k = line.split()
        print(k)  # debug output: the tokens found on this line
        if 'Cash' in k:
            cash += 1
        elif 'Visa' in k:
            visa += 1
        elif 'Amex' in k:
            amex += 1

print("# persons paid by cash are:", cash)
print("# persons paid by Visa card are:", visa)
print("# persons paid by Amex card are:", amex)