How do I delete the first few characters of a file using Python?

Time: 2015-08-20 20:04:59

Tags: python python-2.7 slice

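I have several log files, most of them over a million lines long. I want to delete the first three lines of each file, plus the first 9 characters of the fourth line.

I can delete the first 3 lines, but I haven't been able to figure out how to delete the first 9 characters of the 4th line and keep the rest of the document.

Sample data: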

#Software: Microsoft Internet Information Services 7.5
#Version: 1.0
#Date: 2015-06-02 00:00:00
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken

Desired output:

date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken

My code so far:

for filename in os.listdir(path):
    basename, ext = os.path.splitext(filename)
    fullname = os.path.join(path, filename)
    newname = os.path.join(path, basename + '-out' + ext)
    with open(fullname) as read:
        #skip first 3 lines
        for n in xrange(3):
            read.readline()
        # hand the rest to shutil.copyfileobj
        with open(newname, 'w') as write:
            shutil.copyfileobj(read, write)

3 answers:

Answer 0 (score: 1)

You're very close:

for filename in os.listdir(path):
    basename, ext = os.path.splitext(filename)
    fullname = os.path.join(path, filename)
    newname = os.path.join(path, basename + '-out' + ext)
    with open(fullname) as read:
        #skip first 3 lines
        for n in xrange(3):
            read.readline()
        # consume 9 bytes    <<<<<< ADDED THIS <<<<<
        read.read(9)  #      <<<<<< ADDED THIS <<<<<
        # hand the rest to shutil.copyfileobj
        with open(newname, 'w') as write:
            shutil.copyfileobj(read, write)
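
For reference, here is the same idea as a self-contained function. This is only a minimal sketch: the name `strip_header` and its parameters are hypothetical, and it assumes the files are opened in binary mode so that `read(9)` counts bytes and the line endings pass through untouched. `range` is used instead of `xrange` so the snippet runs on both Python 2.7 and Python 3.

```python
import shutil


def strip_header(src_path, dst_path, skip_lines=3, skip_bytes=9):
    """Copy src_path to dst_path, dropping the first skip_lines lines and
    the first skip_bytes bytes of the line that follows them.

    Hypothetical helper for illustration; not part of the original answer.
    """
    with open(src_path, 'rb') as src:
        # Discard the leading lines one at a time.
        for _ in range(skip_lines):
            src.readline()
        # Discard the prefix of the next line (b'#Fields: ' in the sample).
        src.read(skip_bytes)
        # Stream whatever is left; the file is never loaded into memory.
        with open(dst_path, 'wb') as dst:
            shutil.copyfileobj(src, dst)
```

Because `shutil.copyfileobj` copies in chunks, this should stay cheap even for logs with millions of lines.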

Answer 1 (score: 0)

You're 99% of the way there. All that's left is to advance the read pointer 9 characters before copying.

    #skip first 3 lines
    for n in xrange(3):
        read.readline()
    # Skip 9 characters
    read.read(9)
    # hand the rest to shutil.copyfileobj
    with open(newname, 'w') as write:
        shutil.copyfileobj(read, write)
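
Skipping a fixed 9 characters assumes the fourth line always starts with `#Fields: `. A slightly more defensive variant, sketched here with hypothetical names (`copy_without_header`, `FIELDS_PREFIX`), reads the fourth line and strips the prefix only when it is actually there. It assumes both file objects are opened in binary mode.

```python
import shutil

# Assumed prefix, taken from the sample data in the question (9 bytes).
FIELDS_PREFIX = b'#Fields: '


def copy_without_header(src, dst, skip_lines=3, prefix=FIELDS_PREFIX):
    """Copy binary file object src to dst, skipping skip_lines lines and
    removing prefix from the following line if that line starts with it."""
    for _ in range(skip_lines):
        src.readline()
    header = src.readline()
    if header.startswith(prefix):
        # Slice off only the prefix; the field names and newline stay.
        header = header[len(prefix):]
    dst.write(header)
    # Copy the remaining lines in chunks.
    shutil.copyfileobj(src, dst)
```

If a file happens to lack the `#Fields: ` prefix, the fourth line is copied unchanged instead of losing its first 9 bytes.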

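Answer 2 (score: 0)

Thanks for the info... I couldn't get the read.read() option to work, but the comment about moving the read pointer forward pointed me in the right direction.

I chose to advance the pointer position by 108 and then read the file.

The final code that worked: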

for filename in os.listdir(path):
    basename, ext = os.path.splitext(filename)
    fullname = os.path.join(path, filename)
    newname = os.path.join(path, basename + '-out' + ext)
    with open(fullname) as read:
        # skip the three comment lines plus '#Fields: ' in one jump:
        # 108 bytes for the sample header, assuming CRLF line endings
        read.seek(108)
        # hand the rest to shutil.copyfileobj
        with open(newname, 'w') as write:
            shutil.copyfileobj(read, write)
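
The value 108 is consistent with the sample data when lines end in CRLF: 56 + 15 + 28 bytes for the three comment lines, plus 9 for `#Fields: `. A hard-coded offset stops working as soon as any header line changes length, so one option is to compute it per file. The sketch below uses a hypothetical helper name, `header_offset`, and assumes the file object is opened in binary mode so that `tell()` returns a plain byte offset.

```python
def header_offset(f, skip_lines=3, skip_bytes=9):
    """Return the byte offset just past skip_lines lines plus skip_bytes
    bytes, for a file object opened in binary mode.

    Hypothetical helper for illustration only.
    """
    f.seek(0)
    for _ in range(skip_lines):
        f.readline()
    # tell() is the position after the skipped lines; add the prefix length.
    return f.tell() + skip_bytes
```

The result can then be passed to `read.seek(...)` in place of the literal 108.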