Python code to remove the header from txt files

Time: 2019-04-17 23:40:59

Tags: python batch-file

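I have a folder containing thousands of .txt files. I am using a Windows batch script to remove the header (lines 1 to 82) from every .txt file in that folder. The script works well on smaller files, but now I need to run it on large files, and it stops responding entirely.

Could someone help me write the equivalent of this Windows batch script in Python? Thanks in advance.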

@echo off
for %%f in (*.txt) do (
    more +82 "%%f" > "%TEMP%\%%f"
    move /y "%TEMP%\%%f" "%%f" > nul
)
echo Done.

2 answers:

Answer 0: (score: 0)

This may be overkill, but it might work:

import os
import tempfile

file_path = r'C:\Users\...\...'

# Set the number of lines you'd like to exclude
header_end = 82


for i in os.listdir(file_path):
    if i.endswith('.txt'):
        target = os.path.join(file_path, i)

        ### Read the whole file into memory (simple, but heavy for very large files)
        with open(target, 'r') as f:
            data = f.read()

        ### Split lines by \n (newline)
        tokens = data.split('\n')

        ### Rejoin with a newline, starting at index header_end so exactly header_end lines are dropped
        output_str = '\n'.join(tokens[header_end:])

        ### Create a tempfile with '.txt' suffix in the same folder as the original
        fd, path = tempfile.mkstemp(suffix='.txt', dir=file_path)
        try:
            with os.fdopen(fd, 'w') as tmp:
                tmp.write(output_str)
            ### Replace the original file with the header-free copy
            os.replace(path, target)
        except OSError:
            print('Error writing temp file.')
            ### Clean up the temp file if it was left behind
            if os.path.isfile(path):
                os.remove(path)
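
Since the question is specifically about large files, here is a minimal sketch of a streaming variant that never holds a whole file in memory. The function names `strip_header` and `strip_headers_in_folder` and the parameter `header_lines` are my own, not from the question. It assumes the header is exactly the first 82 newline-terminated lines, and that it is acceptable to create a short-lived `.tmp` file next to each original.

```python
import os
import shutil
import tempfile


def strip_header(path, header_lines=82):
    """Drop the first `header_lines` lines of `path`, rewriting it in place."""
    folder = os.path.dirname(os.path.abspath(path))
    # The temp file sits next to the original so os.replace stays on one volume.
    fd, tmp_path = tempfile.mkstemp(suffix='.tmp', dir=folder)
    try:
        # Binary mode copies the remaining bytes unchanged (no encoding guesses).
        with os.fdopen(fd, 'wb') as dst, open(path, 'rb') as src:
            for _ in range(header_lines):
                if not src.readline():
                    break  # file is shorter than the header
            # Copy the rest in fixed-size chunks instead of loading it all.
            shutil.copyfileobj(src, dst)
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the original untouched and discard the partial copy.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def strip_headers_in_folder(folder, header_lines=82):
    """Apply strip_header to every .txt file directly inside `folder`."""
    for name in os.listdir(folder):
        if name.endswith('.txt'):
            strip_header(os.path.join(folder, name), header_lines)
```

Calling `strip_headers_in_folder(file_path)` would process the folder in one go. Try it on a copy of the data first, because the originals are overwritten.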

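Answer 1: (score: 0)

This PowerShell script does not write to a temporary file. Instead, it moves the original file to a .bak file and then writes it back, skipping the first 82 lines.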

foreach ($File in (Get-ChildItem *.txt)){
  $BakFile = $File.FullName -replace 'txt$','bak.txt'
  Move-Item $File $BakFile -Force
  Get-Content $BakFile | Select-Object -Skip 82 | Set-Content $File
}

To stay on topic, here is the same thing wrapped so it can be run from a batch command/file:

powershell -NoP -C "foreach ($File in (Get-ChildItem *.txt)){$BakFile = $File.FullName -replace 'txt$','bak.txt';Move-Item $File $BakFile -Force;Get-Content $BakFile | Select-Object -Skip 82 | Set-Content $File}"
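
Because the question asks for Python, the same move-to-backup idea could be sketched roughly as follows. The function name `strip_header_keep_backup` and its parameters are hypothetical. The check that skips existing `.bak.txt` files is an added safeguard that the PowerShell version does not have.

```python
import os
import shutil


def strip_header_keep_backup(folder, header_lines=82):
    """Rename each .txt file to .bak.txt, then rewrite it without the header."""
    for name in os.listdir(folder):
        # Skip non-txt files and backups left by an earlier run.
        if not name.endswith('.txt') or name.endswith('.bak.txt'):
            continue
        original = os.path.join(folder, name)
        backup = original[:-len('txt')] + 'bak.txt'
        # os.replace overwrites an existing backup, similar to Move-Item -Force.
        os.replace(original, backup)
        with open(backup, 'rb') as src, open(original, 'wb') as dst:
            for _ in range(header_lines):
                if not src.readline():
                    break  # file is shorter than the header
            # Stream the remainder so large files are not loaded into memory.
            shutil.copyfileobj(src, dst)
```

The backups stay in the folder afterwards, so the disk needs room for both copies until they are deleted.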