Question

Suppose I have a file temp.txt with 100 lines. I would like to split it into 10 parts. I use the following command

split -a 1 -l 10 -d temp.txt temp_

But I got temp_0, temp_1, temp_2, ..., temp_9. I want output like temp_1, temp_2, ..., temp_10.

From man split I got

-d, --numeric-suffixes
              use numeric suffixes instead of alphabetic

I tried using split -l 10 --suffix-length=1 --numeric-suffixes=1 Temp.txt temp_

It says split: option '--numeric-suffixes' doesn't allow an argument

Then I tried split -l 10 --suffix-length=1 --numeric-suffixes 1 Temp.txt temp_

It says split: extra operand `temp_'

The output of split --version is

split (GNU coreutils) 8.4
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Torbjörn Granlund and Richard M. Stallman.

Answer 1

I tried using split -a 1 -l 10 -d 1 Temp.txt temp_. But it shows the error split: extra operand `temp_'

-d doesn't take an argument. It should be written as in your original attempt;

split -a 1 -l 10 -d Temp.txt temp_

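But, forgetting the syntax variations for a moment;

You are asking it to split a 100-line file into 10 parts, with a suffix length of 1, starting at 1.

^ - That scenario is contradictory: it asks the command to process 100 lines while giving it fixed parameters that restrict it to 90 lines. A one-digit suffix starting at 1 only allows temp_1 through temp_9, which is nine files of 10 lines each.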
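The arithmetic behind that limit can be checked with a small shell sketch. The variable names are made up for illustration and are not options of split.

```shell
# Illustrative arithmetic only; these variable names are hypothetical.
total_lines=100
lines_per_file=10
suffix_len=1
start=1

# Number of output files needed, rounding up.
needed=$(( (total_lines + lines_per_file - 1) / lines_per_file ))

# Numeric suffixes of this length run from $start up to 10^suffix_len - 1.
limit=1
i=0
while [ "$i" -lt "$suffix_len" ]; do
    limit=$(( limit * 10 ))
    i=$(( i + 1 ))
done
available=$(( limit - start ))

echo "files needed: $needed, suffixes available: $available"
echo "lines that fit: $(( available * lines_per_file ))"
```

With the values from the question this reports 10 files needed but only 9 suffixes available, so only 90 lines fit.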

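If you're willing to extend the allowable suffix length to 2, then you'll at least get uniform two-digit temp files starting at 01;

split -a 2 -l 10 --numeric-suffixes=1 -d Temp.txt temp_ will create: temp_01 through temp_10

You can actually omit the -a and -d arguments altogether;

split -l 10 --numeric-suffixes=1 Temp.txt temp_ also creates: temp_01 through temp_10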
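Whether the =FROM form is accepted depends on the installed split; the coreutils 8.4 build shown in the question rejects it. One rough way to probe for support before relying on it is sketched below. The supports_from variable and the probe_ prefix are names invented for this sketch.

```shell
# Probe whether this split accepts --numeric-suffixes=FROM.
# `supports_from` and the probe_ prefix are hypothetical names.
probe_dir=$(mktemp -d)
if printf 'x\n' | split --numeric-suffixes=1 - "$probe_dir/probe_" 2>/dev/null; then
    supports_from=yes
else
    supports_from=no
fi
rm -rf "$probe_dir"
echo "numeric-suffixes=FROM supported: $supports_from"
```

If the probe reports no, the commands in this answer that pass --numeric-suffixes=1 will fail in the same way as in the question.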

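If for some reason this is a fixed, absolute requirement or a permanent solution (i.e. integrating with something else you have no control over), and the input is always going to be a file of exactly 100 lines, then you could always do it in two passes;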

cat Temp.txt | head -n90 | split -a 1 -l 10 --numeric-suffixes=1 - temp_
cat Temp.txt | tail -n10 | split -a 2 -l 10 --numeric-suffixes=10 - temp_

Then you would have temp_1 through temp_10
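On a split that only supports plain -d, as in the question, a possible alternative is to split with the default numbering and then rename the pieces. This is a sketch rather than a tested production recipe: it assumes at most ten pieces, and the scratch directory and generated input exist only for the demonstration.

```shell
# Sketch: work in a scratch directory with a generated 100-line input.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 100 > Temp.txt

# Plain -d numbers the pieces temp_0 .. temp_9.
split -a 1 -l 10 -d Temp.txt temp_

# Shift every suffix up by one, highest first so nothing is overwritten.
i=9
while [ "$i" -ge 0 ]; do
    if [ -e "temp_$i" ]; then
        mv "temp_$i" "temp_$(( i + 1 ))"
    fi
    i=$(( i - 1 ))
done

ls temp_*
```

Renaming from the highest suffix downward matters: moving temp_0 to temp_1 first would clobber the existing temp_1.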

Answer 2

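Just to throw out a possible alternative, you can accomplish this task manually by running a couple of loops. The outer loop iterates over the file chunks, and the inner loop iterates over the lines within each chunk.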

{
    suf=1;
    read -r; rc=$?;
    while [[ $rc -eq 0 || -n "$REPLY" ]]; do
        line=0;
        while [[ ($rc -eq 0 || -n "$REPLY") && line -lt 10 ]]; do
            printf '%s\n' "$REPLY";
            read -r; rc=$?;
            let ++line;
        done >temp_$suf;
        let ++suf;
    done;
} <temp.txt;

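Notes:

The test $rc -eq 0 || -n "$REPLY" is needed so that processing continues if either we have not yet reached end-of-file (in which case $rc -eq 0 is true) or we have reached end-of-file but the input file had a non-empty final line (in which case -n "$REPLY" is true). It's good to support the case of a non-empty final line with no end-of-line delimiter, which sometimes happens. In that case read returns a failing status code but still correctly sets $REPLY to the content of the final line. I've tested the split utility and it handles this case correctly as well.
By calling read once before the outer loop and then once after each print, we ensure that we always test whether the read succeeded before printing the resulting line. A more naive design might read and print in immediate succession with no check in between, which would be incorrect.

I've used the -r option of read to prevent backslash interpolation, which you probably don't want; I assume you want to preserve the contents of temp.txt verbatim.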
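The end-of-file behavior described in the notes can be observed directly. The sketch below reads into a named variable, line, instead of the default $REPLY so that it also runs under plain sh; the input is generated inline and has no trailing newline.

```shell
# Two lines of input; the second has no trailing newline.
demo=$(printf 'first\nlast' | {
    IFS= read -r line; printf 'status=%s line=%s\n' "$?" "$line"
    IFS= read -r line; printf 'status=%s line=%s\n' "$?" "$line"
})
printf '%s\n' "$demo"
# prints:
#   status=0 line=first
#   status=1 line=last
```

The second read fails, yet the variable still holds the final line, which is why the loop condition checks both the status and the content.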

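Obviously there are tradeoffs in this solution. On the one hand, it demands a fair amount of complexity and code verbosity (13 lines the way I've written it). On the other hand, the advantage is complete control over the behavior of the split operation; you can customize the script to your liking, for example by dynamically changing the suffix based on the line number, by using a prefix or infix or a combination of both, or even by taking into account the contents of the individual lines in temp.txt.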
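As one illustration of that flexibility, the loop can be wrapped in a function that takes the prefix, chunk size, and starting suffix as parameters. This is a sketch under assumptions, not the answer's own code: split_numbered and its arguments are hypothetical names, and it appends to each output file line by line, which is simple but slower than keeping the file open.

```shell
# Hypothetical helper, not part of the answer above:
#   split_numbered FILE PREFIX LINES_PER_FILE [START]
# Variables are global because plain sh has no `local`.
split_numbered() {
    file=$1 prefix=$2 per=$3 suf=${4:-1}
    count=0
    # `|| [ -n "$line" ]` keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # Truncate the target when a new chunk begins.
        if [ "$count" -eq 0 ]; then : > "$prefix$suf"; fi
        printf '%s\n' "$line" >> "$prefix$suf"
        count=$(( count + 1 ))
        if [ "$count" -eq "$per" ]; then
            count=0
            suf=$(( suf + 1 ))
        fi
    done < "$file"
}

# Demonstration on generated input in a scratch directory.
demo_dir=$(mktemp -d)
seq 1 25 > "$demo_dir/input.txt"
split_numbered "$demo_dir/input.txt" "$demo_dir/part_" 10
ls "$demo_dir"
```

With 25 input lines and a chunk size of 10, this produces part_1 and part_2 with ten lines each and part_3 with the remaining five.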

Split a file into output files with numeric suffixes but no leading zero
