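I have a folder containing thousands of .txt files. I am using a Windows batch script to remove the header (lines 1 through 82) from every .txt file in that folder. The catch is that the script works well on smaller files, but now I need to run it on large files and it simply stops responding.
Could someone help me write the equivalent of this Windows batch script in Python? Thanks in advance.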
@echo off
for %%f in (*.txt) do (
    more +82 "%%f" > "%TEMP%\%%f"
    move /y "%TEMP%\%%f" "%%f" > nul
)
echo Done.
Answer 0 (score: 0)
This may be overkill, but it might work:
import os
import tempfile

file_path = r'C:\Users\...\...'
# Set the number of lines you'd like to exclude
header_end = 82

### Process each .txt file in the folder on its own (os.listdir returns the full list up front)
for name in os.listdir(file_path):
    if not name.endswith('.txt'):
        continue
    src_path = os.path.join(file_path, name)
    ### Create the temp file in the same folder so os.replace stays on one drive
    fd, tmp_path = tempfile.mkstemp(suffix='.tmp', dir=file_path)
    try:
        ### Binary mode keeps the original encoding and line endings untouched
        with os.fdopen(fd, 'wb') as tmp, open(src_path, 'rb') as src:
            ### Stream line by line; indexes 0..81 are the 82 header lines
            for line_no, line in enumerate(src):
                if line_no >= header_end:
                    tmp.write(line)
        ### Overwrite the original file with the header-free copy
        os.replace(tmp_path, src_path)
    except OSError:
        print('Error processing', name)
        ### Clean up the temp file if it was left behind
        if os.path.isfile(tmp_path):
            os.remove(tmp_path)
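Before pointing a script like this at thousands of real files, it may be worth checking the line count on a throwaway file. The sketch below is not part of the answer above: `strip_header`, `demo`, and the sample file name are hypothetical names made up for illustration. It assumes the header is exactly the first 82 lines and uses `itertools.islice` so the file is streamed rather than read into memory in one piece.

```python
import itertools
import os
import tempfile


def strip_header(path, header_lines=82):
    """Rewrite `path` without its first `header_lines` lines (hypothetical helper)."""
    folder = os.path.dirname(os.path.abspath(path))
    # Temp file lives in the same folder, so os.replace never crosses drives
    fd, tmp_path = tempfile.mkstemp(suffix='.tmp', dir=folder)
    try:
        with os.fdopen(fd, 'wb') as tmp, open(path, 'rb') as src:
            # islice lazily skips the first `header_lines` lines
            tmp.writelines(itertools.islice(src, header_lines, None))
        os.replace(tmp_path, path)
    except OSError:
        # Remove the partial temp file, then let the caller see the error
        if os.path.isfile(tmp_path):
            os.remove(tmp_path)
        raise


def demo():
    """Build a 100-line sample file, strip it, and return the remaining lines."""
    with tempfile.TemporaryDirectory() as folder:
        sample = os.path.join(folder, 'sample.txt')
        with open(sample, 'w') as f:
            f.writelines('line %d\n' % n for n in range(1, 101))
        strip_header(sample)
        with open(sample) as f:
            return f.read().splitlines()


if __name__ == '__main__':
    remaining = demo()
    print(remaining[0], len(remaining))
```

With a 100-line sample, the first surviving line should be `line 83` and 18 lines should remain, which matches what `more +82` keeps in the batch version.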
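Answer 1 (score: 0)
This PowerShell script does not write to a temp file. Instead, it moves the original file to a .bak.txt file and then writes everything after the first 82 lines back under the original name.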
foreach ($File in (Get-ChildItem *.txt)){
    $BakFile = $File.FullName -replace 'txt$','bak.txt'
    Move-Item $File $BakFile -Force
    Get-Content $BakFile | Select-Object -Skip 82 | Set-Content $File
}
To stay on topic, here is the same thing wrapped so it can be run from a batch command/file:
powershell -NoP -C "foreach ($File in (Get-ChildItem *.txt)){$BakFile = $File.FullName -replace 'txt$','bak.txt';Move-Item $File $BakFile -Force;Get-Content $BakFile | Select-Object -Skip 82 | Set-Content $File}"
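Since the question asked for Python, the same backup-then-skip idea could be sketched roughly as follows. This is a loose port under stated assumptions, not the answerer's code: `strip_headers_with_backup` is a hypothetical name, the `.bak.txt` naming simply mirrors the PowerShell script, and skipping existing backups on a re-run is an addition the PowerShell version does not have.

```python
import glob
import itertools
import os
import tempfile


def strip_headers_with_backup(folder, header_lines=82):
    """Hypothetical helper: keep each original as *.bak.txt, rewrite it without the header."""
    processed = []
    # glob.glob returns the full list up front, so new backups are not picked up
    for path in glob.glob(os.path.join(folder, '*.txt')):
        if path.endswith('.bak.txt'):
            continue  # assumption: ignore backups left by an earlier run
        bak_path = path[:-len('txt')] + 'bak.txt'
        os.replace(path, bak_path)  # overwrites an old backup, like Move-Item -Force
        # Binary mode leaves encoding and line endings untouched
        with open(bak_path, 'rb') as src, open(path, 'wb') as dst:
            dst.writelines(itertools.islice(src, header_lines, None))
        processed.append(path)
    return processed


if __name__ == '__main__':
    # Try it on a throwaway folder before touching real data
    with tempfile.TemporaryDirectory() as folder:
        with open(os.path.join(folder, 'a.txt'), 'w') as f:
            f.writelines('line %d\n' % n for n in range(1, 101))
        strip_headers_with_backup(folder)
        print(sorted(os.listdir(folder)))
```

Keeping the backups doubles the disk space used, so for very large files it may be preferable to delete each `.bak.txt` once the result has been checked.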