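I am looking for a way to remove the indentation from piped text. Below is a solution using cut -c 9-, which assumes the indentation is 8 characters wide.
I am looking for a solution that detects the number of spaces to remove. That means reading the whole (piped) file to find the minimum number of spaces (tabs?) used for indentation, then removing that many characters from every line.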
help() {
    awk '
    /esac/{b=0}
    b
    /case "\$arg" in/{b=1}' \
    "$me" \
    | cut -c 9-
}

while [[ $# -ge 1 ]]
do
    arg="$1"
    shift
    case "$arg" in
        help|h|?|--help|-h|'-?')
            # Show this help
            help;;
    esac
done
$ ./run.sh --help
help|h|?|--help|-h|'-?')
    # Show this help
    help;;
Note: echo $' 4\n 2\n 3' | python3 -c 'import sys; import textwrap as tw; print(tw.dedent(sys.stdin.read()), end="")'
works, but I expect there is a better way (I mean, one that relies only on software more common than Python). Maybe awk? I would not mind seeing a perl solution either.
Note 2: echo $' 4\n 2\n 3' | python -c 'import sys; import textwrap as tw; print tw.dedent(sys.stdin.read()),'
also works (Python 2.7.15rc1).
Answer 0 (score: 3)
The following is pure bash, with no external tools or command substitutions:
#!/usr/bin/env bash

all_lines=( )
min_spaces=9999 # start with something arbitrarily high
while IFS= read -r line; do
  all_lines+=( "$line" )
  if [[ ${line:0:$min_spaces} =~ ^[[:space:]]*$ ]]; then
    continue  # this line has at least as much whitespace as those preceding it
  fi
  # this line has *less* whitespace than those preceding it; we need to know how much.
  [[ $line =~ ^([[:space:]]*) ]]
  line_whitespace=${BASH_REMATCH[1]}
  min_spaces=${#line_whitespace}
done

for line in "${all_lines[@]}"; do
  printf '%s\n' "${line:$min_spaces}"
done
Its output is:
4
2
3
Answer 1 (score: 3)
Suppose you have:
$ echo $' 4\n 2\n 3\n\ttab'
4
2
3
tab
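You can use the Unix expand utility to expand tabs to spaces, then pipe the result through awk to find the minimum number of leading spaces on any line: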
$ echo $' 4\n 2\n 3\n\ttab' |
expand |
awk 'BEGIN{min_indent=9999999}
     {lines[++cnt]=$0
      match($0, /^[ ]*/)
      if(RLENGTH<min_indent) min_indent=RLENGTH
     }
     END{for (i=1;i<=cnt;i++)
             print substr(lines[i], min_indent+1)}'
4
2
3
tab
Answer 2 (score: 1)
Here is the (semi-)obvious temporary-file solution.
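#!/bin/sh
t=$(mktemp -t dedent.XXXXXXXXXX) || exit
# ERR is not a valid trap condition in plain sh, so trap EXIT only
trap 'rm -f "$t"' EXIT
# Copy stdin to the temp file while tracking the smallest 1-based column
# at which a non-space character appears (0 if a line has none).
awk '{ n = match($0, /[^ ]/); if (NR == 1 || n<min) min = n }1
    END { exit (min < 1 ? 1 : min) }' >"$t"
# match() already returns a 1-based position, which is what cut expects
cut -c "$?"- "$t"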
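This will obviously fail if all lines have more than 255 leading whitespace characters, because the result will not fit in Awk's exit code.
The advantage is that we are not limited by available memory; instead, we are limited by available disk space. The drawback is that disk may be slower, but IMHO the benefit of not reading a large file into memory outweighs that.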
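The 255 limit comes only from passing the column through the exit status. As a minimal sketch of one way around it, the column can be printed on awk's stdout while awk itself spools the lines to the temporary file. The function name dedent_tmp is hypothetical, and unlike the script above this variant ignores lines without any non-space character when measuring, which is an assumption about the desired behaviour.

```shell
#!/bin/sh
# dedent_tmp: hypothetical name. Same temp-file idea, but the column travels
# through awk's stdout instead of its exit status, so it is not capped at 255.
dedent_tmp() {
    t=$(mktemp) || return
    col=$(awk -v out="$t" '
        { print > out }                   # spool every line to the temp file
        /[^ ]/ {                          # measure only lines with visible text
            n = match($0, /[^ ]/)
            if (min == 0 || n < min) min = n
        }
        END { print (min ? min : 1) }')   # 1-based column to keep from
    cut -c "$col"- "$t"
    rm -f "$t"
}
```

It is used as a filter, for example printf '   4\n  2\n   3\n' | dedent_tmp. Memory use stays constant, as in the original script; the temporary file is removed at the end of the function rather than by a trap, so an interrupted run may leave it behind.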
Answer 3 (score: 0)
echo $' 4\n 2\n 3\n \n more spaces in the line\n ...' | \
(text="$(cat)"; echo "$text" \
| cut -c "$(echo "$text" | sed 's/[^ ].*$//' | awk 'NR == 1 {a = length} length < a {a = length} END {print a + 1}')-"\
)
Explanation:
echo $' 4\n 2\n 3\n \n more spaces in the line\n ...' | \
(
    text="$(cat)" # Obtain the input in a variable
    echo "$text" | cut -c "$(
        # `cut` removes the n-1 first characters of each line of the input, where n is:
        echo "$text" | \
        sed 's/[^ ].*$//' | \
        awk 'NR == 1 || length < a {a = length} END {print a + 1}'
        # sed: keep only the initial spaces, remove the rest
        # awk:
        # At the first line `NR == 1`, get the length of the line `a = length`.
        # For any shorter line `length < a`, update the length `a = length`.
        # At the end of the piped input, print the shortest length + 1.
        # ... we add 1 because in `cut`, characters of the line are indexed at 1.
    )-"
)
Update:
Spawning sed can be avoided. As per tripleee's comment, sed's s/// can be replaced with awk's sub(). Here is an even shorter option, using n = match() as in tripleee's answer.
echo $' 4\n 2\n 3\n \n more spaces in the line\n ...' | \
(
    text="$(cat)" # Obtain the input in a variable
    echo "$text" | cut -c "$(
        # `cut` removes the a-1 first characters of each line of the input, where a is:
        echo "$text" | \
        awk '
            {n = match($0, /[^ ]/)}
            NR == 1 || n < a {a = n}
            a <= 1 {exit}
            END {print (a < 1 ? 1 : a)}'
        # awk:
        # At every line, get the position `n` of the first non-space character (0 if there is none).
        # At the first line `NR == 1`, copy that position to `a`.
        # For any line with fewer leading spaces than `a` (`n < a`) update `a` (`a = n`).
        # Once `a` is 1 or less there is nothing left to strip, so `exit` early; END still runs after `exit`.
        # (END cannot be combined with another pattern, hence the separate `a <= 1` rule.)
        # At the end of the piped input, print a (or 1 if a is 0).
        # a - 1 is then the minimum number of common leading spaces found in all lines.
        # ... no + 1 is needed here because match positions, like `cut` columns, are indexed at 1.
        #
        # I'm not sure whether the early `exit` optimisation could let the "$text" be written to the script stdout (which is not desirable at all). Gotta test that when I get the time.
    )-"
)
Apparently, in Perl 6 one can also use the function my &f = *.indent(*);.
Answer 4 (score: 0)
Another awk solution, based on dawg's answer. The main differences include:
awk '
  {
    lines[++count] = $0
    if (NF == 0) next
    match($0, /[^ ]/)
    if (length(min) == 0 || RSTART < min) min = RSTART
  }
  END {
    for (i = 1; i <= count; i++) print substr(lines[i], min)
  }
' <<< $' 4\n 2\n 3'
Or all on one line:
awk '{ lines[++count] = $0; if (NF == 0) next; match($0, /[^ ]/); if (length(min) == 0 || RSTART < min) min = RSTART; } END { for (i = 1; i <= count; i++) print substr(lines[i], min) }' <<< $' 4\n 2\n 3'
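Explanation:
Add the current line to the array and increment the count variable.
{
  lines[++count] = $0
If the line is empty, skip to the next iteration.
  if (NF == 0) next
Set RSTART to the index of the first non-space character.
  match($0, /[^ ]/)
If min is unset or is greater than RSTART, set the former to the latter.
  if (length(min) == 0 || RSTART < min) min = RSTART
}
Run after all input has been read.
END {
Loop over the array and, for each line, print only the substring from the index stored in min to the end of the line.
  for (i = 1; i <= count; i++) print substr(lines[i], min)
}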
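To tie this back to the question, here is a minimal sketch that wraps an awk program like the one above in a shell function, so that it could take the place of cut -c 9- in help(). The function name dedent is an assumption and not part of any answer; running expand first (as in dawg's answer) is a design choice so that tabs are measured as spaces.

```shell
#!/bin/sh
# dedent: hypothetical helper name, not taken from any answer above.
# Reads stdin, strips the smallest common leading-space indent, writes stdout.
dedent() {
    expand | awk '
        { lines[++count] = $0 }
        NF > 0 {                      # ignore blank lines when measuring
            match($0, /[^ ]/)
            if (min == 0 || RSTART < min) min = RSTART
        }
        END {
            if (min == 0) min = 1     # no non-blank line seen: print unchanged
            for (i = 1; i <= count; i++) print substr(lines[i], min)
        }'
}

# Example: a case body like the one in the question, indented by 8 and 12 spaces.
printf '        help)\n            help;;\n' | dedent
```

In the question's help() function, the last stage of the pipeline would then read | dedent instead of | cut -c 9-, provided dedent is defined earlier in the same script. Like the awk answers, this keeps all input in memory.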