How do I find the line number of a specific word within fixed-size groups of lines?

Time: 2017-12-25 23:22:21

Tags: python full-text-search

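With the code below I can only count how many lines contain FBC. What I want instead is to find the line number of the line that contains a specific word, counted within a fixed number of lines.

    def lcount(keyword, fname):
        # count the lines in the file that contain the keyword
        with open(fname, 'r') as fin:
            return sum(1 for line in fin if keyword in line)

    F = lcount('FBC', 'BLK100-199C1-J-1000-K-10.txt')
    print(F)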

Here is the data I want to read from the text file:

-----------------------------------------------------------------------------
`PagesPerBlock= 64                      
Block = 100                     
Read time= 698, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=0,

Read time= 697, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=2,

Read time= 697, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=3,

Read time= 698, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=4,

Read time= 691, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=11,

Read time= 698, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=20,

Read time= 697, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=32,

Read time= 691, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=45,

Read time= 698, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=54,

Read time= 691, Cycle= 1,55555555,55555555,55555555,    Page=6400   FBC=71,

PagesPerBlock= 64                       
Block = 101                     
Read time= 690, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=0,

Read time= 697, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=0,

Read time= 698, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=3,

Read time= 691, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=7,

Read time= 697, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=11,

Read time= 691, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=15,

Read time= 698, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=24

Read time= 698, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=34,

Read time= 698, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=42,

Read time= 697, Cycle= 1,55555555,55555555,55555555,    Page=6464   FBC=50,`

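First, I want to read the first 10 lines that contain the word FBC. Within those, I want to find the line number of the first line with a non-zero FBC. Then I want to repeat the process for the next 10 lines that contain FBC.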

For the data above, the answer should be 2, 3: in the first group of 10 FBC lines the second line holds the first non-zero FBC, and in the next group of 10 FBC lines the third line holds the first non-zero FBC.

Unfortunately, I don't know how to do this in Python. Please help.

1 answer:

Answer 0 (score: 0)

It is hard to understand what the question is. Simplify your input, for example:

header1
header2
some_data_n   FBC=0,
some_data_b   FBC=1,
some_data_v   FBC=2,
some_data_c   FBC=0,
some_data_x   FBC=3,

and write down what output you expect to get.

[Edit:] So you should read all the lines, keep only the lines that contain FBC, and then find the index of the first line that contains FBC but not FBC=0.

def line_index(keyword, fname):
    with open(fname, 'r') as fin:
        # get all lines from file
        lines = fin.readlines()
        # get only lines with keyword
        lines = [ln for ln in lines if keyword in ln]
        # check where the keyword has value 0
        zero_value_str = "%s=%d" % (keyword, 0)
        presence = [zero_value_str in ln for ln in lines]

        # The first element where 0-valued FBC is not present
        index1 = presence.index(False)
        # Now we don't need this element so we switch the value for this index
        presence[index1] = True
        # now we search for the second
        index2 = presence.index(False)

        # We want to number lines starting from 1, not 0, so increment the indexes
        return index1 + 1, index2 + 1

F = line_index('FBC', 'BLK100-199C1-J-1000-K-10.txt')
print(F)

You can easily operate on the boolean list to find further indexes.
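To make the boolean-list idea concrete, here is a minimal sketch that collects every position where the zero-valued keyword is absent, instead of only the first two. The helper name `nonzero_positions` and the sample list are made up for illustration; they are not part of the code above.

```python
def nonzero_positions(presence):
    """Return the 1-based positions of every False entry in a boolean list.

    `presence` is assumed to look like the list built in line_index():
    True where "FBC=0" occurs on the line, False otherwise.
    """
    return [i for i, is_zero in enumerate(presence, start=1) if not is_zero]


# Hypothetical presence list for the simplified input: FBC=0, 1, 2, 0, 3
sample_presence = [True, False, False, True, False]
print(nonzero_positions(sample_presence))  # prints [2, 3, 5]
```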

Note that list indexes are zero-based, so the second element has index 1; that is why the function adds 1 before returning.
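The function above reports the first two non-zero lines of the whole file, which is not quite the same as restarting the count every 10 FBC lines. If one result per group is what you need, something along these lines might work. It is only a sketch: it assumes every data line carries `FBC=<integer>`, that groups are consecutive runs of `group_size` FBC lines, and it parses the number with a regular expression rather than substring matching. The names `first_nonzero_per_group`, `FBC_RE` and `group_size` are invented here.

```python
import re

# Hypothetical pattern; assumes values are written as "FBC=<integer>"
FBC_RE = re.compile(r"FBC=(\d+)")


def first_nonzero_per_group(lines, group_size=10):
    """For each consecutive group of FBC lines, return the 1-based position
    of the first line whose FBC value is non-zero (None if all are zero)."""
    # Keep only the numeric FBC values, in file order; header lines are skipped.
    values = [int(m.group(1)) for m in map(FBC_RE.search, lines) if m]
    result = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        position = next((i for i, v in enumerate(group, start=1) if v != 0), None)
        result.append(position)
    return result


# Shortened sample with groups of 3 instead of 10
sample = [
    "Block = 100",
    "Page=6400   FBC=0,",
    "Page=6400   FBC=2,",
    "Page=6400   FBC=3,",
    "Block = 101",
    "Page=6464   FBC=0,",
    "Page=6464   FBC=0,",
    "Page=6464   FBC=3,",
]
print(first_nonzero_per_group(sample, group_size=3))  # prints [2, 3]
```

For a real file, the open file object (or the result of `fin.readlines()`) can be passed as `lines`, leaving `group_size` at 10.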