Question

How can I convert tabs to spaces in every file of a directory (possibly recursively)?

Also, is there a way of setting the number of spaces per tab?

Answer 1

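Simple replacement with sed is okay, but it is not the best possible solution. If there are "extra" spaces between the tabs, they will still be there after substitution, so the margins will be ragged. Tabs expanded in the middle of lines will also not work correctly. In bash, we can say instead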

find . -name '*.java' ! -type d -exec bash -c 'expand -t 4 "$0" > /tmp/e && mv /tmp/e "$0"' {} \;

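to apply expand to every Java file in the current directory tree. Remove or replace the -name argument if you are targeting some other file types. As one of the comments mentions, be very careful when removing -name or using a weak wildcard. You can easily clobber repository and other hidden files without meaning to. This is why the original answer included this:

You should always make a backup copy of the tree before trying something like this, in case something goes wrong.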
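
To see why plain substitution leaves ragged margins while expand does not, here is a minimal Python sketch. The function names and the sample line are made up for illustration; `str.expandtabs` is used as a stand-in for what `expand -t 4` does, on the assumption that both advance to the next multiple of the tab width.

```python
# Compare naive tab substitution with tab-stop-aware expansion.
TAB_WIDTH = 4  # assumed width, matching `expand -t 4`


def naive_replace(line: str, width: int = TAB_WIDTH) -> str:
    """Replace every tab with a fixed run of spaces (the sed-style approach)."""
    return line.replace("\t", " " * width)


def tabstop_expand(line: str, width: int = TAB_WIDTH) -> str:
    """Pad each tab up to the next multiple of `width`."""
    return line.expandtabs(width)


sample = "ab\tcd"  # two characters, then a tab
print(repr(naive_replace(sample)))   # 'ab    cd'  (text shifted to column 6)
print(repr(tabstop_expand(sample)))  # 'ab  cd'    (text stays at column 4)
```

With the naive version, anything that follows a tab moves by a different amount depending on what precedes it, which is the ragged margin described above.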

Answer 2

Try the command line tool expand.

expand -i -t 4 input | sponge output

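where

-i is used to expand only the leading tabs on each line;
-t 4 means that each tab will be converted to 4 whitespace characters (8 by default).
sponge is from the moreutils package, and avoids clearing the input file.

Finally, you can use gexpand on OSX, after installing coreutils with Homebrew (brew install coreutils).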
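
As a rough illustration of what the -i flag means, the following Python sketch expands tabs only inside the leading whitespace of a line and leaves later tabs alone. The function name is hypothetical, and this is an approximation of the behaviour rather than a reimplementation of expand.

```python
def expand_leading_tabs(line: str, width: int = 4) -> str:
    """Expand tabs in the leading whitespace only; later tabs are untouched."""
    stripped = line.lstrip(" \t")
    # Everything before the first non-blank character is the indentation.
    indent = line[: len(line) - len(stripped)]
    return indent.expandtabs(width) + stripped


# The tab inside the string literal survives, the indentation does not.
print(repr(expand_leading_tabs("\t\tx = 'a\tb'")))
```

This matters for source files that carry literal tabs inside strings, which a whole-line expansion would silently rewrite.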

Answer 3

WARNING: This will break your repo.

This will corrupt binary files, including those under svn and .git! Read the comments before using!

find . -type f -exec sed -i.orig 's/\t/ /g' {} +

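The original file is saved as [filename].orig.

Downsides:

Will replace tabs everywhere in a file.
Will take a long time if you happen to have a 5GB SQL dump in this directory.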

Answer 4

Collecting the best comments from Gene's answer, the best solution by far is to use sponge from moreutils.

sudo apt-get install moreutils
# The complete one-liner:
find ./ -iname '*.java' -type f -exec bash -c 'expand -t 4 "$0" | sponge "$0"' {} \;

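Explanation:

./ searches recursively from the current directory
-iname is a case-insensitive match (for both *.java and *.JAVA)
-type f finds only regular files (no directories or symlinks; note that it does not filter out binary files)
-exec bash -c executes the following commands in a subshell for each file name, {}
expand -t 4 expands all TABs to 4 spaces
sponge soaks up standard input (from expand) and writes to a file (the same one)*.

NOTE: * A simple file redirection (> "$0") won't work here because it would overwrite the file too soon.

Advantage: All original file permissions are retained and no intermediate tmp files are used.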
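
The same "read everything first, then write back" idea can be sketched in Python. This is only an analogy to sponge under stated assumptions: the function name is invented, files are assumed to be UTF-8 text, and the whole file is assumed to fit in memory.

```python
from pathlib import Path


def expand_file_in_place(path: Path, width: int = 4) -> None:
    """Soak up the whole file, expand its tabs, then write it back."""
    original = path.read_bytes()         # all input is read before any write
    text = original.decode("utf-8")      # assumption: the file is UTF-8 text
    expanded = text.expandtabs(width)    # column resets at each newline
    # Writing to the existing path truncates the file instead of replacing
    # it, so its permission bits stay as they were.
    path.write_bytes(expanded.encode("utf-8"))
```

Because the read completes before the file is opened for writing, the "overwritten too soon" problem of a plain redirection does not arise.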

Answer 5

Use backslash-escaped sed.

On Linux:

Replace all tabs with 1 hyphen in place, in all *.txt files:
```
sed -i $'s/\t/-/g' *.txt
```
Replace all tabs with 1 space in place, in all *.txt files:
```
sed -i $'s/\t/ /g' *.txt
```
Replace all tabs with 4 spaces in place, in all *.txt files:
```
sed -i $'s/\t/    /g' *.txt
```

On a Mac:

Replace all tabs with 4 spaces in place, in all *.txt files:
```
sed -i '' $'s/\t/    /g' *.txt
```

Answer 6

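How do I convert tabs to spaces in every file of a directory (possibly recursively)?

This is usually not what you want.

Do you want to do this for png images? PDF files? The .git directory? Your Makefile (which requires tabs)? A 5GB SQL dump?

You could, in theory, pass a whole lot of exclude options to find or whatever else you are using; but this is fragile, and will break as soon as you add other binary files.

What you want, at the very least, is to:

Skip files over a certain size.
Detect whether a file is binary by checking for the presence of a NULL byte.
Only replace tabs at the start of a line (expand -i does this, sed doesn't).

As far as I know, there is no "standard" Unix utility that can do this, and it is not very easy to do with a shell one-liner, so a script is needed.

A while ago I created a little script called sanitize_files which does exactly that. It also fixes some other common problems, such as replacing \r\n with \n, adding a trailing \n, etc.

You can find a simplified script without the extra features and command-line arguments below, but I recommend you use the script linked above, as it is more likely to receive bugfixes and other updates than this post.

I would also like to point out, in response to some of the other answers here, that using shell globbing is not a robust way of doing this, because sooner or later you will end up with more files than fit in ARG_MAX (on modern Linux systems it is 128k, which may seem like a lot, but sooner or later it is not enough).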

#!/usr/bin/env python3
#
# http://code.arp242.net/sanitize_files
#

import os, re, sys


def is_binary(data):
    return data.find(b'\000') >= 0


def should_ignore(path):
    keep = [
        # VCS systems
        '.git/', '.hg/', '.svn/', 'CVS/',

        # These files have significant whitespace/tabs, and cannot be edited
        # safely
        # TODO: there are probably more of these files..
        'Makefile', 'BSDmakefile', 'GNUmakefile', 'Gemfile.lock'
    ]

    for k in keep:
        if '/%s' % k in path:
            return True
    return False


def run(files):
    indent_find = b'\t'
    indent_width = 4  # number of spaces that replace each leading tab
    indent_replace = b' ' * indent_width

    for f in files:
        if should_ignore(f):
            print('Ignoring %s' % f)
            continue

        try:
            size = os.stat(f).st_size
        # Unresolvable symlink, just ignore those
        except FileNotFoundError as exc:
            print('%s is unresolvable, skipping (%s)' % (f, exc))
            continue

        if size == 0: continue
        if size > 1024 ** 2:
            print("Skipping `%s' because it's over 1MiB" % f)
            continue

        try:
            data = open(f, 'rb').read()
        except (OSError, PermissionError) as exc:
            print("Error: Unable to read `%s': %s" % (f, exc))
            continue

        if is_binary(data):
            print("Skipping `%s' because it looks binary" % f)
            continue

        data = data.split(b'\n')

        fixed_indent = False
        for i, line in enumerate(data):
            # Fix indentation: swap each leading tab for indent_replace
            repl_count = 0
            while line.startswith(indent_find):
                fixed_indent = True
                repl_count += 1
                line = line.replace(indent_find, b'', 1)

            if repl_count > 0:
                # Store the result; rebinding `line` alone changes nothing
                data[i] = indent_replace * repl_count + line

        # Nothing to rewrite if no line started with a tab
        if not fixed_indent:
            continue

        try:
            open(f, 'wb').write(b'\n'.join(data))
        except (OSError, PermissionError) as exc:
            print("Error: Unable to write to `%s': %s" % (f, exc))


if __name__ == '__main__':
    allfiles = []
    for root, dirs, files in os.walk(os.getcwd()):
        for f in files:
            p = '%s/%s' % (root, f)
            # Filtering happens later, in should_ignore() inside run()
            allfiles.append(p)

    run(allfiles)

Answer 7

You can use the generally available pr command (man page here). For example, to convert tabs to four spaces, do this:

pr -t -e=4 file > file.expanded

-t suppresses headers
-e=num expands tabs to num spaces

To convert all files in a directory tree recursively, while skipping binary files:

#!/bin/bash
num=4
shopt -s globstar nullglob
for f in **/*; do
  [[ -f "$f" ]]   || continue # skip if not a regular file
  grep -qI . "$f" || continue # skip binary (and empty) files
  pr -t -e=$num "$f" > "$f.expanded.$$" && mv "$f.expanded.$$" "$f"
done

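The logic for skipping binary files is from this post.

NOTE:

Doing this could be dangerous in a git or svn repo

This is not the right solution if you have code files that have tabs embedded in string literals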

Answer 8

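I like the "find" example above for the recursive application. To adapt it to be non-recursive, changing only the files in the current directory that match a wildcard, the shell glob expansion can be sufficient for small numbers of files: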

ls *.java | awk '{print "expand -t 4 ", $0, " > /tmp/e; mv /tmp/e ", $0}' | sh -v

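If you want it silent after you trust that it works, just drop the -v on the sh command at the end.

Of course you can pick any set of files in the first command. For example, list only a particular subdirectory (or directories) in a controlled manner like this: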

ls mod/*/*.php | awk '{print "expand -t 4 ", $0, " > /tmp/e; mv /tmp/e ", $0}' | sh

Or, in turn, run find(1) with some combination of depth parameters etc:

find mod/ -name '*.php' -mindepth 1 -maxdepth 2 | awk '{print "expand -t 4 ", $0, " > /tmp/e; mv /tmp/e ", $0}' | sh

Answer 9

To convert all Java files recursively in a directory to use 4 spaces instead of a tab:

find . -type f -name '*.java' -exec bash -c 'expand -t 4 "$0" > /tmp/stuff && mv /tmp/stuff "$0"' {} \;

Answer 10

My recommendation is to use:

find . -name '*.lua' -exec ex '+%s/\t/ /g' -cwq {} \;

Comments:

Use in-place editing. Keep backups in a VCS. No need to produce *.orig files. It is good practice to diff the result against your last commit anyway, to make sure this worked as expected.
sed is a stream editor. Use ex for in-place editing. This avoids creating extra temp files and spawning shells for each replacement, as in the top answer.
WARNING: This messes with all tabs, not only those used for indentation. It also does not do context-aware replacement of tabs. This was sufficient for my use case, but it might not be acceptable for you.
EDIT: An earlier version of this answer used find|xargs instead of find -exec. As pointed out by @gniourf-gniourf, this leads to problems with spaces, quotes and control characters in file names (cf. Wheeler).

Answer 11

You can use find with the tabs-to-spaces package for this.

First, install tabs-to-spaces

npm install -g tabs-to-spaces

Then, run this command from the root directory of your project:

find . -name '*' -exec t2s --spaces 2 {} \;

This will replace every tab character with 2 spaces in every file.

Answer 12

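Download and run the following script to recursively convert hard tabs to soft tabs in plain text files.

Execute the script from inside the folder which contains the plain text files.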

#!/bin/bash

find . -type f -and -not -path './.git/*' -exec grep -Iq . {} \; -and -print | while IFS= read -r file; do
    echo "Converting... $file"
    data=$(expand --initial -t 4 "$file")
    # Overwrite in place (rather than rm + recreate) so permissions survive
    printf '%s\n' "$data" > "$file"
done

Answer 13

After finding mixed tabs and spaces, I used astyle to re-indent all my C/C++ code. It also has options to force a particular brace style if you'd like.

Answer 14

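The use of expand, as suggested in other answers, seems the most logical approach for this task alone.

That said, it can also be done with Bash and Awk, in case you want to make some other modifications along with it.

If using Bash 4.0 or greater, the shopt builtin globstar can be used to search recursively with **.

With GNU Awk version 4.1 or greater, sed-like "inplace" file modifications can be made: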

shopt -s globstar
gawk -i inplace '{gsub("\t","    ")}1' **/*.ext

In case you want to set the number of spaces per tab:

gawk -i inplace -v n=4 'BEGIN{for(i=1;i<=n;i++) c=c" "}{gsub("\t",c)}1' **/*.ext
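
For comparison, the recursive glob plus a configurable width can be sketched in Python as well. The function name, the `*.ext` pattern and the default of 4 are placeholders chosen for illustration; like the gawk command, this replaces every tab with a fixed number of spaces and does not honour tab stops.

```python
from pathlib import Path


def convert_tree(root: Path, pattern: str = "*.ext", n: int = 4) -> list:
    """Replace each tab with n spaces in files matching pattern under root."""
    changed = []
    for path in sorted(root.rglob(pattern)):  # rglob recurses like **/
        if not path.is_file():
            continue
        data = path.read_bytes()
        if b"\t" not in data:
            continue  # leave files without tabs untouched
        path.write_bytes(data.replace(b"\t", b" " * n))
        changed.append(path)
    return changed
```

Calling `convert_tree(Path("."), "*.ext", n=2)` would return the list of files that were rewritten, which makes it easy to review the result before committing.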

Answer 15

One can use vim for that:

find -type f \( -name '*.css' -o -name '*.html' -o -name '*.js' -o -name '*.php' \) -execdir vim -c retab -c wq {} \;

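As Carpetsmoker stated, it will retab according to your vim settings, and according to modelines in the files, if any. Also, it will replace tabs everywhere, not only at the beginning of lines, which is not what you generally want. For example, you might have literals that contain tabs.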

Answer 16

Git repository friendly method

git-tab-to-space() (
  d="$(mktemp -d)"
  git grep --cached -Il '' | grep -E "${1:-.}" | \
    xargs -I'{}' bash -c '\
    f="${1}/f" \
    && expand -t 4 "$0" > "$f" && \
    chmod --reference="$0" "$f" && \
    mv "$f" "$0"' \
    '{}' "$d" \
  ;
  rmdir "$d"
)

Act on all files under the current directory:

git-tab-to-space

Act only on C or C++ files:

git-tab-to-space '\.(c|h)(|pp)$'

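You likely want this, notably because of those annoying Makefiles which require tabs.

The command git grep --cached -Il '':

lists only the tracked files, so nothing inside .git
excludes directories, binary files (which would be corrupted), and symlinks (which would be converted to regular files)

as explained at: How to list all text (non-binary) files in a git repository?

chmod --reference keeps the file permissions unchanged: https://unix.stackexchange.com/questions/20645/clone-ownership-and-permissions-from-another-file Unfortunately I can't find a succinct POSIX alternative.

If your codebase had the crazy idea of allowing functional raw tabs in strings, use: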

expand -i

and then have fun going over all the non-start-of-line tabs one by one, which you can list with: Is it possible to git grep for tabs?

Tested on Ubuntu 18.04.

Answer 17

Nobody mentioned rpl? Using rpl you can replace any string. To convert tabs to spaces,

rpl -R -e "\t" "    "  .

Very simple.

Answer 18

Use the vim way:

$ ex +'bufdo retab' -cxa **/*.*

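Make a backup first, as this can corrupt your binary files.
To use globstar (**) for recursion, activate it with shopt -s globstar.
To specify a particular file type, use for example: **/*.c.

To modify tabstop, add +'set ts=2'.

However, the downside is that it can replace tabs inside strings.

So for a slightly better solution (by using substitution), try: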

$ ex -s +'bufdo %s/^\t\+/  /ge' -cxa **/*.*
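
One thing to watch with the substitution above: as far as I can tell, the pattern matches the whole run of leading tabs and substitutes it once, so a line indented with two tabs ends up with the same two spaces as a line indented with one. A per-tab variant of the same idea can be sketched in Python; the function name and the default of 2 spaces per tab are assumptions for illustration.

```python
import re


def replace_leading_tabs(text: str, spaces_per_tab: int = 2) -> str:
    """Turn each leading tab into spaces_per_tab spaces; other tabs are kept."""
    return re.sub(
        r"^\t+",                                           # run of tabs at line start
        lambda m: " " * (spaces_per_tab * len(m.group())), # scale by run length
        text,
        flags=re.MULTILINE,                                # ^ matches at every line
    )
```

Tabs that appear after the first non-tab character, such as those inside string literals, are left as they are.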

Or use the ex editor together with the expand utility:

$ ex -s +'bufdo!%!expand -t2' -cxa **/*.*

For trailing spaces, see: How to remove trailing whitespaces for multiple files?

You may add the following function to your .bash_profile:

# Convert tabs to spaces.
# Usage: retab *.*
# See: https://stackoverflow.com/q/11094383/55075
retab() {
  ex +'set ts=2' +'bufdo retab' -cxa "$@"
}

Answer 19

Converting tabs to spaces in just ".lua" files [tabs -> 2 spaces]

find . -iname "*.lua" -exec sed -i "s#\t#  #g" '{}' \;
