Question

Suppose I have a file with the following contents:

Assume that <tab> is actually a tab character and <space> is actually a space. (Ignore the quotes.)

"

    <tab><tab>

    <space>
    <tab>
    The clothes at
    the superstore are
    at a discount today.
"

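Assume this is in a text file. How can I remove all the leading whitespace so that the resulting text file contains the following (ignore the quotes):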

"
    The clothes at
    the superstore are
    at a discount today.
"

Answer 1

Maybe something like this (I don't know whether you need a Python solution or whether command-line tools are OK):

$ cat -t INPUT
   ^I^I
^I^I
"^I
^I^I^I
^I  ghi
"

$ sed '/^[[:space:]]*$/d' INPUT
"   
      ghi
"

That is, delete the lines that contain only spaces and/or tabs, as well as empty lines.
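For readers who would rather stay in Python, the same filter can be sketched roughly as below. This is only an illustration written for Python 3; the function name `drop_blank_lines` and the sample string are made up here and are not part of the answer above.

```python
def drop_blank_lines(text):
    """Return text without lines that are empty or contain only whitespace."""
    # keepends=True keeps each line's own newline so the kept lines
    # can be joined back together unchanged.
    kept = [line for line in text.splitlines(keepends=True) if line.strip()]
    return "".join(kept)


# Same layout as the INPUT file shown above (\t is a tab).
sample = "  \t\t\n\t\t\n\"\t\n\t\t\t\n\t  ghi\n\"\n"
print(drop_blank_lines(sample), end="")
```

Like the sed command, this drops every whitespace-only line, not just the ones at the start of the file.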

Answer 2

Try this, assuming you don't want to overwrite the old file. If you do, it is easy to adapt:

oldfile = open("EXISTINGFILENAME", "r")
data = oldfile.read()
oldfile.close()
stripped_data = data.lstrip()
newfile = open("NEWFILENAME", "w")
newfile.write(stripped_data)
newfile.close()

Note that this removes only the leading whitespace. To remove any trailing whitespace as well, use strip instead of lstrip.
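A small sketch of the difference, using an in-memory string in place of the file contents (the sample text is only an assumption modelled on the question):

```python
# Stand-in for the result of oldfile.read(); \t is a tab.
data = "\n    \t\t\n \n    \t\n    The clothes at\n    the superstore are\n"

leading_only = data.lstrip()  # removes whitespace before the first visible character
both_ends = data.strip()      # additionally removes the trailing newline

print(repr(leading_only))
print(repr(both_ends))
```

Both calls also remove the indentation in front of the first text line, because that indentation is part of the leading whitespace of the whole string.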

Answer 3

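If you want to keep the indentation and trailing whitespace on the lines of the output file, test the stripped line but write the original line.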

This also uses context managers, and works in Python 2.7:

with open('EXISTINGFILE', 'r') as fin, open('NEWFILE', 'w') as fout:
    for line in fin:
        if line.strip():
            fout.write(line)

If you want to do other processing, I suggest defining it in its own function and calling that function:

def process_line(line):
    # for example
    return ''.join(('Payload:\t', line.strip().upper(), '\tEnd Payload\n'))

with open('EXISTINGFILE', 'r') as fin, open('NEWFILE', 'w') as fout:
    for line in fin:
        if line.strip():
            fout.write(process_line(line))

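Rereading your question, I see that you only asked about removing whitespace at the beginning of the file. If you want to keep every line of the source file once some condition has been met, you can set a flag for that condition and switch the output based on that flag.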

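For example, if you want to remove the initial whitespace-only lines, process the non-blank lines, and neither remove nor process any blank lines once you have at least one line of data, you can do the following: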

def process_line(line):
    # for example
    return ''.join(('Payload:\t', line.strip().upper(), '\tEnd Payload\n'))

with open('EXISTINGFILE', 'r') as fin, open('NEWFILE', 'w') as fout:
    have_paydata = False
    for line in fin:
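        if line.strip():
            # Remember that at least one line of real data has been seen
            have_paydata = True
            fout.write(process_line(line))
        elif have_paydata:
            # Blank lines after the first data line are copied unchanged
            fout.write(line)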
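If no per-line processing is needed, the flag can be swapped for `itertools.dropwhile` from the standard library, which skips items until a condition first fails. This is a different technique from the flag shown above, sketched here on a list of lines; the helper name `skip_leading_blank_lines` is hypothetical.

```python
from itertools import dropwhile


def skip_leading_blank_lines(lines):
    """Return an iterator over lines without the leading whitespace-only ones."""
    # dropwhile discards items while the predicate is true, then passes
    # through everything else, including later blank lines.
    return dropwhile(lambda line: not line.strip(), lines)


lines = ["\n", "  \t\n", "    first\n", "\n", "    second\n"]
result = list(skip_leading_blank_lines(lines))
print(result)
```

Because a file object iterates over its lines, the same helper could be used as `fout.writelines(skip_leading_blank_lines(fin))` inside the `with` block.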

Answer 4

strip() removes all leading and trailing whitespace; we then test whether any characters are left in the line:

with open("file.txt", "r") as f:
    for line in f:
        if len(line.strip()):
            # line still ends with its own newline, so drop it before printing
            print(line.rstrip("\n"))

Answer 5

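lstrip removes all whitespace from the start of a string. If you need to keep the leading whitespace on the first line of text, use a regular expression instead: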

import re

data = '''\

    \t\t


    \t
    The clothes at
    the superstore are
    at a discount today.
'''

# Remove ALL whitespace from the start of string
print(data.lstrip())
# Remove all whitespace from start of string up to and including a newline
print(re.sub(r'^\s*\n',r'',data))

Output:

The clothes at
    the superstore are
    at a discount today.

    The clothes at
    the superstore are
    at a discount today.

To modify a file this way:

import re

# A with statement closes the file on exit from the block
with open('data.txt') as f:
    data = f.read()
# Remove the leading whitespace-only lines, keeping the first text line's indentation
data = re.sub(r'^\s*\n', r'', data)
with open('data.txt', 'w') as f:
    f.write(data)
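To see what the pattern does and does not touch, here is a small sketch that wraps the substitution in a helper and runs it on a few edge cases. The helper name `strip_leading_blank_lines` and the sample strings are assumptions made for this illustration.

```python
import re


def strip_leading_blank_lines(text):
    """Remove leading whitespace-only lines, keeping the first text line's indent."""
    # Without re.MULTILINE, ^ matches only at the very start of the string,
    # so blank lines further down are never affected.
    return re.sub(r'^\s*\n', '', text)


leading = strip_leading_blank_lines("\n  \t\n    The clothes at\n")
untouched = strip_leading_blank_lines("    a\n\n    b\n")
only_blank = strip_leading_blank_lines("  \n\t\n")

print(repr(leading))
print(repr(untouched))
print(repr(only_blank))
```

Text that already starts with a non-blank line comes back unchanged, and a string made only of blank lines becomes empty.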
