Question

I have a file containing lines like these:

1291126929200 started 88 videolist15.txt 4 Good 4
1291126929250 59.875 29.0 29.580243595150186 43.016096916037604
1291126929296 59.921 29.0 29.52749417740926 42.78632483544682
1291126929359 59.984 29.0 29.479540161281143 42.56031951027556
1291126929437 60.046 50.0 31.345036510255586 42.682281485516945
1291126932859 started 88 videolist15.txt 5 Good 4

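I want to split the file at every line that contains started (or videolist; it doesn't matter which).

The following command produces only 2 output files: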

$ csplit -k input.txt /started/

But I expect many more, as this shows:

$ grep -i started input.txt | wc -l
146

What is the correct csplit command?

Thanks

Answer 1

Add {*} at the end:

$ csplit -k input.txt /started/ {*}

The man page says:

{*}    repeat the previous pattern as many times as possible.

Demo:

$ cat file
1
foo
2
foo
3
foo
$ csplit -k file /foo/ {*}
2
6
6
4
$ ls -tr xx*             
xx03  xx02  xx01  xx00
$ csplit --version
csplit (GNU coreutils) 7.4
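
For the input in the question, where the very first line already matches, the first piece comes out empty. Here is a minimal sketch of one way to handle that, assuming GNU csplit (the -z, -s and -f options are GNU features). The sample lines and the part_ prefix are made up for illustration:

```shell
# Work in a scratch directory so the generated pieces stay out of the way.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample: three records, each beginning with a "started" line.
printf '%s\n' \
  '100 started a' \
  '101 1.0' \
  '102 started b' \
  '103 2.0' \
  '104 started c' > input.txt

# -z removes empty output files (here, the piece before the first match),
# -s suppresses the byte counts, -f sets the output file prefix.
# '{*}' is quoted so that no shell tries to interpret the braces.
csplit -s -z -f part_ input.txt '/started/' '{*}'

ls part_*
```

Each resulting file then starts with a started line, which is usually what you want when the matching line is a record header.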

Answer 2

According to the Open Group specification, the csplit command accepts basic regular expressions.

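Basic regular expressions (BREs) are a limited subset of a full regular expression implementation. They support literal characters, the asterisk (*), the dot (.), character classes ([0-9]), and anchors (^, $). They do not support one-or-more (+) or alternation (a|b).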
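
As a rough illustration of working within those limits, the sketch below uses grep, which also defaults to BREs, so the patterns can be checked before handing them to csplit. The sample lines are shortened from the question and the variable name sample is an assumption:

```shell
# Hypothetical two-line sample: one header line, one data line.
sample=$(mktemp)
printf '%s\n' \
  '1291126929200 started 88 videolist15.txt 4 Good 4' \
  '1291126929250 59.875 29.0' > "$sample"

# "One or more digits" without +: write the class once, then the class with *.
grep -c '^[0-9][0-9]* started ' "$sample"

# "started or videolist" without |: grep accepts several -e patterns.
grep -c -e 'started' -e 'videolist' "$sample"
```

The first pattern can be passed to csplit unchanged, as '/^[0-9][0-9]* started /'. The -e trick is specific to grep; for the file in the question it is not needed, since every header line contains both words.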
