Question

I have data in the following format:

 _id :  ANC,Name : TEST,actn : Testing,date : 2018-0208   |    _id :  ANC,Name : TEST,actn : Testing,date : 2018-0208
                                                          >    _id :  ANC,Name : TEST,actn : Testing,date : 2018-0209
 _id :  ANC,Name : TEST,actn : Testing,date : 2018-0210   <

I want to split the data into separate files according to the following conditions:

Anything before | should go in file 1, and anything after | should go in file 2.
Anything after > should go in file 2.
Anything before < should go in file 1.

In the end the files should look like this:

File1:
_id : ANC,Name : TEST,actn : Testing,date : 2018-0208
_id : ANC,Name : TEST,actn : Testing,date : 2018-0210

File2:
_id : ANC,Name : TEST,actn : Testing,date : 2018-0208
_id : ANC,Name : TEST,actn : Testing,date : 2018-0209

I tried sed with sed 's/|.*//' test.txt, but unfortunately I could not work all of the conditions into it, so the data ends up jumbled.

Regards.

Answer 1

One way to do it with awk, since you basically have two columns (assuming there are no other |, < or > characters in the data):

awk -F' *[<>|] *' '{if ( $1 != "" ) { print $1 > "file1"; }; if ( $2 != "") { print $2 > "file2" } }' inputfile

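-F sets the field separator to any one of the three special characters, together with any number of spaces before and after it.
If the first column is not empty, print it to file1.
If the second column is not empty, print it to file2.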
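
To see what that separator does to each of the three sample lines, here is a small sketch. It recreates the question's data in a file called inputfile (the name used in the command above) and prints both fields in brackets instead of writing them to files. The bracketed output format is only for illustration.

```shell
# Recreate the three sample lines from the question.
cat > inputfile <<'EOF'
 _id :  ANC,Name : TEST,actn : Testing,date : 2018-0208   |    _id :  ANC,Name : TEST,actn : Testing,date : 2018-0208
                                                          >    _id :  ANC,Name : TEST,actn : Testing,date : 2018-0209
 _id :  ANC,Name : TEST,actn : Testing,date : 2018-0210   <
EOF

# Same -F regex as the command above; show $1 and $2 in brackets for every line.
awk -F' *[<>|] *' '{ printf "line %d: [%s] [%s]\n", NR, $1, $2 }' inputfile
```

The line that starts with > should show an empty first field, and the line that ends with < an empty second field, which is why the two "not empty" checks are needed. The separator only removes spaces next to the special character, so a space at the very start of a line stays in the first field.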

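If you don't mind taking a few steps, you can first replace the different separators with a single one (note that -i edits the file in place):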

sed -i 's/[<>]/|/' input

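After that, cut -d'|' -f1 input > file1 is all you need, and the same with -f2 for file 2, although you will end up with empty lines. You could also use a bash loop and iterate line by line, splitting each line on one of the separators, but I think awk is a good fit here.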
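
For completeness, the loop-based alternative could look roughly like the sketch below. It is only one possible reading of that idea: the trim helper is hypothetical, the file names input, file1 and file2 follow the commands above, and it assumes each line contains at most one separator and is padded with plain spaces.

```shell
#!/usr/bin/env bash
# Sample data from the question, written to "input" so the sketch is self-contained.
cat > input <<'EOF'
 _id :  ANC,Name : TEST,actn : Testing,date : 2018-0208   |    _id :  ANC,Name : TEST,actn : Testing,date : 2018-0208
                                                          >    _id :  ANC,Name : TEST,actn : Testing,date : 2018-0209
 _id :  ANC,Name : TEST,actn : Testing,date : 2018-0210   <
EOF

# trim (hypothetical helper): strip leading and trailing spaces from its argument.
trim() {
    s=$1
    s=${s#"${s%%[! ]*}"}    # drop leading spaces
    s=${s%"${s##*[! ]}"}    # drop trailing spaces
    printf '%s' "$s"
}

: > file1    # start with empty output files
: > file2

# With IFS set to the three separators, read puts the text before the first
# separator into "left" and the text after it into "right".
while IFS='|<>' read -r left right || [ -n "$left$right" ]; do
    left=$(trim "$left")
    right=$(trim "$right")
    if [ -n "$left" ];  then printf '%s\n' "$left"  >> file1; fi
    if [ -n "$right" ]; then printf '%s\n' "$right" >> file2; fi
done < input
```

Unlike the cut approach, this skips empty fields, so no blank lines end up in either file. It starts a subshell for every trim call, though, so for large inputs the awk one-liner is likely to be much faster.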

Answer 2

The following simple awk may also help you.

awk -F'[|><]' '{for (i=1; i<=2; i++) gsub(/^ +| +$/, "", $i)} $1!=""{print $1 > "file1"} $2!=""{print $2 > "file2"}' Input_file

Answer 3

Awk solution:

awk '{ for (i=1; i<=2; i++) if ($i != "") print $i > ("file" i) }' \
       FS='[[:space:]][[:space:]]+[|<>][[:space:]][[:space:]]+' file

Viewing the results:

$ head file[12]
==> file1 <==
_id :  ANC,Name : TEST,actn : Testing,date : 2018-0208
 _id :  ANC,Name : TEST,actn : Testing,date : 2018-0210

==> file2 <==
_id :  ANC,Name : TEST,actn : Testing,date : 2018-0208
_id :  ANC,Name : TEST,actn : Testing,date : 2018-0209
