Read a file with Python and check whether it contains a specific string

Time: 2010-08-09 05:27:38

Tags: python file compare

I have a file in the following format:

Summary;None;Description;Emails\nDarlene\nGregory Murphy\nDr. Ingram\n;DateStart;20100615T111500;DateEnd;20100615T121500;Time;20100805T084547Z
Summary;Presence tech in smart energy management;Description;;DateStart;20100628T130000;DateEnd;20100628T133000;Time;20100628T055408Z
Summary;meeting;Description;None;DateStart;20100629T110000;DateEnd;20100629T120000;Time;20100805T084547Z
Summary;meeting;Description;None;DateStart;20100630T090000;DateEnd;20100630T100000;Time;20100805T084547Z
Summary;Balaji Viswanath: Meeting;Description;None;DateStart;20100712T140000;DateEnd;20100712T143000;Time;20100805T084547Z
Summary;Government Industry Training:  How Smart is Your City - The Smarter City Assessment Tool\nUS Call-In Information:  1-866-803-2143\,     International Number:  1-210-795-1098\,     International Toll-free Numbers:  See below\,     Passcode:  6785765\nPresentation Link - Copy and paste URL into web browser:  http://w3.tap.ibm.com/medialibrary/media_view?id=87408;Description;International Toll-free Numbers link - Copy and paste this URL into your web browser:\n\nhttps://w3-03.sso.ibm.com/sales/support/ShowDoc.wss?docid=NS010BBUN-7P4TZU&infotype=SK&infosubtype=N0&node=clientset\,IA%7Cindustries\,Y&ftext=&sort=date&showDetails=false&hitsize=25&offset=0&campaign=#International_Call-in_Numbers;DateStart;20100811T203000;DateEnd;20100811T213000;Time;20100805T084547Z

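Now I need to create a function that does the following:

The function's argument specifies which line to read; assume I have already done line.split(';').

  1. Check whether "meeting" or "call in number" appears anywhere in line[1], and whether "meeting" or "call in number" appears anywhere in line[2]. If either of these is true, the function should return "call-in meeting". Otherwise it should return "None Inferred".

Thanks in advance.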

3 answers:

Answer 0 (score: 1)

Use the in operator to check whether there is a match:

for line in open("file"):
    if "string" in line :
        ....
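
A minimal runnable sketch of the same idea, assuming the records are already in memory as strings. The `contains_any` helper and the two sample lines are illustrative and not part of the question; the lookup is lowercased so that "Meeting" and "meeting" both match.

```python
def contains_any(text, needles):
    """Return True if any needle occurs in text, ignoring case."""
    lowered = text.lower()
    return any(needle in lowered for needle in needles)

# Two shortened sample records in the question's format (illustrative only).
lines = [
    "Summary;meeting;Description;None",
    "Summary;Presence tech in smart energy management;Description;",
]
matches = [contains_any(line, ("meeting", "call in number")) for line in lines]
print(matches)
```

The same test works unchanged on lines read from an open file object.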

答案 1 :(得分:1)

Building on ghostdog74's answer:

def finder(line):
    '''Takes line number as argument. First line is number 0.'''
    with open('/home/vlad/Desktop/file.txt') as f:
        lines = f.read().split('Summary')[1:]
        searchLine = lines[line]
        if 'meeting' in searchLine.lower() or 'call in number' in searchLine.lower():
            return 'call-in meeting'
        else:
            return 'None Inferred'

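I don't quite understand what line[1] and line[2] mean, so this is the best I can do.

EDIT: Fixed the problem with \n. I figured that since you're searching for "meeting" and "call in number", I don't need "Summary", so I used it to split the records.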
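
One possible reading of line[1] and line[2], offered as an assumption rather than what the asker confirmed: after line.split(';'), index 1 holds the Summary value and index 2 holds the literal label "Description" (the Description value itself sits at index 3). Under that reading, a sketch of the requested check could look like this; the `infer` name is hypothetical.

```python
def infer(fields):
    """Classify one record that has already been split on ';'."""
    keywords = ("meeting", "call in number")
    # Assumption: check indexes 1 and 2 exactly as the question words it.
    targets = " ".join(fields[1:3]).lower()
    if any(keyword in targets for keyword in keywords):
        return "call-in meeting"
    return "None Inferred"

record = "Summary;meeting;Description;None;DateStart;20100629T110000"
fields = record.split(";")
print(fields[1], fields[2])
print(infer(fields))
```

If the Description value is what should be searched, the slice would need to cover index 3 instead of index 2.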

答案 2 :(得分:1)

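vlad003 is right: if your records contain newline characters, they will show up as new lines in the file! In that case, I would split on "Summary" instead: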

from itertools import islice

def chunks( filePath ):
    """Since you have newline characters in each section,
    you can't read each line in turn. This function reads
    lines of the file and splits them into chunks, restarting
    each time 'Summary' starts a line."""
    with open( filePath ) as theFile:
        chunk = [ ]
        for line in theFile:
            if line.startswith( "Summary" ):
                if chunk: yield chunk
                chunk = [ line ]
            else:
                chunk.append( line )
        yield chunk

def nth(iterable, n, default=None):
    "Gets the nth element of an iterator."
    return next(islice(iterable, n, None), default)

def getStatus( filePath, chunkNum ):
    "Get the nth chunk of the file, split it by ';', and return the result."
    # chunks() yields lists of lines, so join them before splitting on ';'.
    chunk = "".join( nth( chunks( filePath ), chunkNum, [ ] ) ).split( ";" )
    if len( chunk ) < 2:
        raise IndexError( "could not get the right chunk" )
    summary = chunk[ 1 ].lower()
    if "meeting" in summary or "call in number" in summary:
        return "call-in meeting"
    else:
        return "None Inferred"

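Note that if you plan to read every chunk of the file, this is wasteful, because it opens the file and reads through it again on every query. If you intend to do this often, it is worth parsing the file into a better data format (for example, an array of statuses). That takes a single pass through the file and gives you much faster lookups.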
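
The single-pass idea could be sketched roughly as follows. This is an illustration under assumptions, not a drop-in replacement: `all_statuses` and `KEYWORDS` are made-up names, the function accepts any iterable of lines (an open file works too), and the in-memory sample stands in for the real file.

```python
import io

KEYWORDS = ("meeting", "call in number")

def all_statuses(lines):
    """Group lines into records starting at 'Summary' and classify each once."""
    records = []
    for line in lines:
        if line.startswith("Summary") or not records:
            records.append(line)
        else:
            records[-1] += line  # continuation of the previous record
    statuses = []
    for record in records:
        fields = record.split(";")
        summary = fields[1].lower() if len(fields) > 1 else ""
        hit = any(keyword in summary for keyword in KEYWORDS)
        statuses.append("call-in meeting" if hit else "None Inferred")
    return statuses

# Hypothetical sample: the first record spans three physical lines.
sample = io.StringIO(
    "Summary;None;Description;Emails\n"
    "Darlene\n"
    ";DateStart;20100615T111500\n"
    "Summary;meeting;Description;None;DateStart;20100629T110000\n"
)
statuses = all_statuses(sample)
print(statuses)
```

After this one pass, statuses[n] answers each query without reopening the file.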