Question

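This code prints the total number of lines, words, and characters in a text file. It works fine and gives the expected output. But I want to count the number of characters in each line and print it like this: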

Line No. 1 has 58 Characters
Line No. 2 has 24 Characters

Code:

import string
def fileCount(fname):
    #counting variables
    lineCount = 0
    wordCount = 0
    charCount = 0
    words = []

    #file is opened and assigned a variable
    infile = open(fname, 'r')

    #loop that finds the number of lines in the file
    for line in infile:
        lineCount = lineCount + 1
        word = line.split()
        words = words + word

    #loop that finds the number of words in the file
    for word in words:
        wordCount = wordCount + 1
        #loop that finds the number of characters in the file
        for char in word:
            charCount = charCount + 1
    #returns the variables so they can be called to the main function        
    return(lineCount, wordCount, charCount)

def main():
    fname = input('Enter the name of the file to be used: ')
    lineCount, wordCount, charCount = fileCount(fname)
    print ("There are", lineCount, "lines in the file.")
    print ("There are", charCount, "characters in the file.")
    print ("There are", wordCount, "words in the file.")
main()

The loop

for line in infile:
    lineCount = lineCount + 1

counts the lines of the whole file, but how do I do this for each individual line? I am using Python 3.x.

Answer 1

Store all the information in a dict, then access it by key.
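A minimal sketch of the dict idea, not necessarily the answerer's exact code. The function name `count_per_line` and the keys `"words"` and `"chars"` are hypothetical, and characters are assumed to be counted the way the question's code does it (whitespace excluded):

```python
def count_per_line(lines):
    """Map each 1-based line number to its word and character counts."""
    counts = {}
    for number, line in enumerate(lines, start=1):
        words = line.split()  # splits on any run of whitespace
        counts[number] = {
            "words": len(words),
            # Assumption: only non-whitespace characters are counted.
            "chars": sum(len(word) for word in words),
        }
    return counts


sample = ["hello world\n", "abc\n", "\n"]
per_line = count_per_line(sample)
for number, info in per_line.items():
    print("Line No.", number, "has", info["chars"], "Characters")
```

An open file object can be passed in place of `sample`, since iterating over a file also yields one line at a time.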

This code only works for words separated by whitespace, so keep that in mind.

Answer 2

Define a set of the allowed characters that you want to count; you can then use len to get most of the data.
Below, I have chosen the character set:

['!', '"', '#', '$', '%', '&', "'", '(', ')', '*', '+', ',', '-', '.', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '<', '=', '>', '?', '@', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '[', '\\', ']', '^', '_', '`', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '{', '|', '}', '~']

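One way the allowed-character idea could look, as a sketch only. It assumes the character list above is printable ASCII from '!' to '~', which can be built from constants in the standard `string` module; the names `ALLOWED` and `count_allowed` are hypothetical:

```python
import string

# Assumption: the allowed characters are '!' through '~' (no whitespace).
ALLOWED = set(string.ascii_letters + string.digits + string.punctuation)


def count_allowed(line, allowed=ALLOWED):
    """Count the characters of `line` that belong to the allowed set."""
    return len([ch for ch in line if ch in allowed])


for number, line in enumerate(["a b\tc!\n", "xyz\n"], start=1):
    print("Line No.", number, "has", count_allowed(line), "Characters")
```

Membership tests against a set take constant time on average, so this stays cheap even for long lines.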

Answer 3

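I was given the task of writing a program that prints the number of characters in each line.

As a programming newbie, I found this very difficult :(.

Here is what I came up with, along with his response:

This is the core part of your program: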

with open ('data_vis_tips.txt', 'r') as inFile:
    with open ('count_chars_per_line.txt', 'w') as outFile:
        chars = 0
        for line in inFile:
            line = line.strip('\n')
            chars = len(line)
            outFile.write(str(len(line))+'\n')

It can be simplified to:

with open ('data_vis_tips.txt', 'r') as inFile:
    for line in inFile:
        line = line.strip()
        num_chars = len(line)
        print(num_chars)

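Note that the argument to strip() is not required; by default it strips whitespace, and '\n' is whitespace. Keep in mind that strip() with no argument also removes leading and trailing spaces and tabs, so the count can be lower than with strip('\n').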
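A small sketch of how the two stripping choices affect the count. The helper names `length_stripped` and `length_without_newline` are hypothetical and exist only for this comparison:

```python
def length_stripped(line):
    """Length after removing all leading and trailing whitespace."""
    return len(line.strip())


def length_without_newline(line):
    """Length after removing only the trailing newline."""
    return len(line.rstrip('\n'))


line = "  two leading spaces \n"
print(length_stripped(line))
print(length_without_newline(line))
```

Which one is right depends on whether surrounding spaces should count as characters of the line.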

Answer 4

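Here is a simpler version using the built-in collections.Counter, which is a specialized dict for counting its inputs. We can use the Counter.update() method to slurp in all the words on each line (unique or not):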

from collections import Counter

def file_count_2(fname):

    line_count = 0
    word_counter = Counter()

    # The with-block closes the file even if an error occurs.
    with open(fname, 'r') as infile:
        for line in infile:
            line_count += 1
            word_counter.update(line.split())

    word_count = 0
    char_count = 0

    for word, cnt in word_counter.items():
        word_count += cnt
        char_count += cnt * len(word)

    print(word_counter)

    return line_count, word_count, char_count

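Notes:

I tested this, and it gives the same counts as your code.
It will be faster, because you are not repeatedly appending to the list words (it is better to just hash the unique words and store their counts, which is what Counter does), and there is no need to iterate and increment charCount every time a word occurs.
If you only want word_count and not char_count, you can get it directly with word_count = sum(word_counter.values()) without iterating over word_counter.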
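The totals described in these notes can be checked on a small in-memory sample. This is only an illustrative sketch; the sample lines are made up:

```python
from collections import Counter

word_counter = Counter()
for line in ["the cat sat\n", "the end\n"]:
    # update() adds one count per word, including repeated words.
    word_counter.update(line.split())

# Total words: sum of all counts, with no explicit loop over the items.
word_count = sum(word_counter.values())

# Total characters: each unique word's length times how often it occurs.
char_count = sum(cnt * len(word) for word, cnt in word_counter.items())

print(word_count, char_count)
```

Because each unique word is stored once, the character total needs one multiplication per unique word rather than one step per character.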

Count the number of characters in each word on every line of a file

4 answers: