Question

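I have several log files, most of them over 1 million lines long. I want to delete the first three lines of each file, plus the first 9 characters of the fourth line.

I can delete the first 3 lines, but I haven't been able to figure out how to delete the first 9 characters of line 4 while keeping the rest of the document.

Sample data: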

#Software: Microsoft Internet Information Services 7.5
#Version: 1.0
#Date: 2015-06-02 00:00:00
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken

Desired output:

date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken

My code so far:

for filename in os.listdir(path):
    basename, ext = os.path.splitext(filename)
    fullname = os.path.join(path, filename)
    newname = os.path.join(path, basename + '-out' + ext)
    with open(fullname) as read:
        #skip first 3 lines
        for n in xrange(3):
            read.readline()
        # hand the rest to shutil.copyfileobj
        with open(newname, 'w') as write:
            shutil.copyfileobj(read, write)

Answer 1

You were very close:

for filename in os.listdir(path):
    basename, ext = os.path.splitext(filename)
    fullname = os.path.join(path, filename)
    newname = os.path.join(path, basename + '-out' + ext)
    with open(fullname) as read:
        #skip first 3 lines
        for n in xrange(3):
            read.readline()
        # consume 9 bytes    <<<<<< ADDED THIS <<<<<
        read.read(9)  #      <<<<<< ADDED THIS <<<<<
        # hand the rest to shutil.copyfileobj
        with open(newname, 'w') as write:
            shutil.copyfileobj(read, write)
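
For readers on Python 3, where `xrange` no longer exists, the same approach can be sketched as a small function. The name `strip_header` and its parameters are illustrative assumptions, not part of the answer above. Note that on a file opened in text mode, `read(9)` consumes 9 characters rather than 9 bytes; for the ASCII header in the sample data the two are the same.

```python
import shutil


def strip_header(src, dst, skip_lines=3, skip_chars=9):
    """Copy src to dst, dropping the first skip_lines lines and the
    first skip_chars characters of the line that follows them."""
    # newline='' leaves the original line endings untouched on both sides
    with open(src, newline='') as reader, open(dst, 'w', newline='') as writer:
        # Discard the leading header lines
        for _ in range(skip_lines):
            reader.readline()
        # Discard the '#Fields: ' prefix (9 characters in the sample data)
        reader.read(skip_chars)
        # Stream the remainder without loading the whole file into memory
        shutil.copyfileobj(reader, writer)
```

Because `shutil.copyfileobj` copies in chunks, this should stay memory-friendly even for files with millions of lines.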

Answer 2

You're 99% of the way there. All that remains is to advance the read pointer by 9 characters before copying.

    #skip first 3 lines
    for n in xrange(3):
        read.readline()
    # Skip 9 characters
    read.read(9)
    # hand the rest to shutil.copyfileobj
    with open(newname, 'w') as write:
        shutil.copyfileobj(read, write)
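
If hard-coding the number 9 feels fragile, one alternative is to read the fourth line and strip the `#Fields: ` prefix only when it is actually present. This is a minimal Python 3 sketch; the function name `copy_without_header` and the `PREFIX` constant are hypothetical, and the prefix value is taken from the sample data in the question.

```python
import shutil

PREFIX = '#Fields: '  # 9 characters, as in the question's sample data


def copy_without_header(src, dst, skip_lines=3, prefix=PREFIX):
    """Copy src to dst without the first skip_lines lines and without
    the prefix at the start of the following line, if it is there."""
    with open(src, newline='') as reader, open(dst, 'w', newline='') as writer:
        # Discard the leading header lines
        for _ in range(skip_lines):
            reader.readline()
        # Read the field-name line and drop the prefix only if it matches
        fields_line = reader.readline()
        if fields_line.startswith(prefix):
            fields_line = fields_line[len(prefix):]
        writer.write(fields_line)
        # Stream everything else unchanged
        shutil.copyfileobj(reader, writer)
```

The trade-off is one extra line held in memory, in exchange for not silently chopping 9 characters off a line that does not start with the expected prefix.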

Answer 3

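Thanks for the information... Although I couldn't get the read.read() option to work, the comment about moving the read pointer forward pointed me in the right direction.

I chose to move the pointer to position 108 and then read the file from there.

The final code that worked: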

for filename in os.listdir(path):
    basename, ext = os.path.splitext(filename)
    fullname = os.path.join(path, filename)
    newname = os.path.join(path, basename + '-out' + ext)
    with open(fullname) as read:
        # jump past the 3 header lines and the '#Fields: ' prefix
        # (byte offset 108 in these files)
        read.seek(108)
        # hand the rest to shutil.copyfileobj
        with open(newname, 'w') as write:
            shutil.copyfileobj(read, write)
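
The value 108 is consistent with the sample headers if each line ends in CRLF: 54 + 2, 13 + 2 and 26 + 2 bytes for the first three lines, plus 9 for `#Fields: `. A fixed offset like this stops working as soon as any header line changes length, so it may be safer to compute it per file. The following is a Python 3 sketch under that assumption; `header_offset` and `copy_from_offset` are hypothetical helper names.

```python
import shutil


def header_offset(path, skip_lines=3, prefix=b'#Fields: '):
    """Return the byte offset just past the skipped lines and the prefix."""
    with open(path, 'rb') as reader:
        for _ in range(skip_lines):
            reader.readline()
        offset = reader.tell()
        # Only count the prefix if the next line really starts with it
        if reader.read(len(prefix)) == prefix:
            offset += len(prefix)
    return offset


def copy_from_offset(src, dst):
    """Copy src to dst starting at the computed header offset."""
    offset = header_offset(src)
    # Binary mode keeps seek() offsets exact and line endings untouched
    with open(src, 'rb') as reader, open(dst, 'wb') as writer:
        reader.seek(offset)
        shutil.copyfileobj(reader, writer)
```

For a file that begins with the four sample header lines and uses CRLF line endings, `header_offset` returns 108, matching the hard-coded value above.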

How do I delete the first few characters of a file using Python?
