Unique count of selected words in a file with Python

Time: 2017-07-05 14:20:39

Tags: python python-2.7

Using Python 2.7, I open an external file with the following content:

Text another word1
Lorem ipsem word1 something
hello first word2 post 

I only want a unique count of the word entries (word1, word2), not of the other words, such as those on line 3.

Desired output:

$ script.py
2x word1
1x word2 

This is what I have so far, but it fails:

import os
import sys
from collections import Counter

with open('./file.txt', 'r') as file:
        for item in file:
            if '.sh' in item:
                    all = item.split()[2]
                    print Counter(all.split())

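3 answers:

Answer 0 (score: 0)

If I understand your question correctly, you need to count the occurrences of a specified word in a plain-text file.

To do that, I suggest converting the text file into a list of words: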

f = open("yourfile.txt", "r")  # open the file
txt = f.read()                 # read it
f.close()                      # always close it
a1 = txt.split("\n")           # split the text into lines
a2 = []                        # create an empty list
for i in a1:                   # for each line
    a2 += i.split()            # append every word on that line

Then simply use

a2.count("yourword")

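As a minimal sketch of the approach in this answer, assuming the file content has already been read into a string: the helper name count_word and the sample text are illustrative only and are not part of the answer. Note that this counts exact matches of a single word, so each word of interest (word1, word2) needs its own call.

```python
def count_word(text, target):
    """Return how many whitespace-separated words in text equal target."""
    words = []
    for line in text.split("\n"):   # split the text into lines
        words += line.split()       # append every word of the line
    return words.count(target)


# Sample content taken from the question (assumed to be what the file holds)
sample = "Text another word1\nLorem ipsem word1 something\nhello first word2 post\n"

print(count_word(sample, "word1"))
print(count_word(sample, "word2"))
```

Calling line.split() with no argument splits on any run of whitespace, so trailing spaces and blank lines do not add empty strings to the list.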

Answer 1 (score: 0)

Hey, if you want to print all the words that start with word, here is the code:

occurenceDict = {}
with open('./file.txt', 'r') as file:
    for line in file:  # read the file line by line
        for word in line.split():  # split the line into words
            if 'word' in word:  # keep only words containing 'word'
                if word in occurenceDict:  # word already seen: increment
                    occurenceDict[word] += 1
                else:  # first occurrence of this word
                    occurenceDict[word] = 1

for word in occurenceDict:
    print('%dx %s' % (occurenceDict[word], word))

Output:

2x word1
1x word2 
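Two details of the code in this answer are worth noting. The test on each word matches any word that contains word, not only words that start with it, and a plain dict in Python 2.7 has no guaranteed iteration order, so the output lines may come out in a different order. A possible variant is sketched below, assuming prefix matching and alphabetical output are what is wanted; count_prefixed is a hypothetical helper name, and the sample lines are copied from the question.

```python
def count_prefixed(lines, prefix):
    """Count each word that starts with prefix across an iterable of lines."""
    counts = {}
    for line in lines:
        for word in line.split():
            if word.startswith(prefix):
                counts[word] = counts.get(word, 0) + 1
    return counts


# Lines from the question; an open file object could be passed instead
lines = ["Text another word1", "Lorem ipsem word1 something", "hello first word2 post"]
counts = count_prefixed(lines, "word")

# Sort the keys so the output order does not depend on dict ordering
for word in sorted(counts):
    print("%dx %s" % (counts[word], word))
```

With str.startswith, a word such as swordfish would be skipped, whereas a substring test would count it.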

Answer 2 (score: 0)

You can try this:

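from itertools import chain

# Read all lines; the with block closes the file afterwards
with open('datafile.txt') as fh:
    lines = fh.readlines()

# Split each line into words, then flatten everything into one list
words = list(chain.from_iterable(line.split() for line in lines))

# Map each word containing "word" to its number of occurrences
new = {w: words.count(w) for w in words if "word" in w}

for a, b in new.items():
    print("%dx %s" % (b, a))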
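The list .count() call inside the dict comprehension rescans the whole word list for every matching word, which is fine for a small file but repeats work on a large one. As an alternative sketch (not the answerer's method), collections.Counter, which the question already imports, can tally the matching words in a single pass, and most_common() then lists them from most to least frequent. The sample lines below stand in for the file content and are an assumption.

```python
from collections import Counter

# Lines from the question, standing in for the contents of the data file
lines = ["Text another word1", "Lorem ipsem word1 something", "hello first word2 post"]

# Tally every word containing "word" in one pass over the data
counts = Counter(w for line in lines for w in line.split() if "word" in w)

# most_common() yields (word, count) pairs, highest count first
for word, n in counts.most_common():
    print("%dx %s" % (n, word))
```

To read from a real file, the list could be replaced by an open file object, since iterating over it also yields one line at a time.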