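Bash: split a multi-line string into an array on a multi-character delimiter

Time: 2019-01-25 06:12:47

Tags: bash shell awk

I searched here for similar topics, but most of the questions deal with single-character delimiters.

I have the following sample text: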

Some text here,
continuing on next lineDELIMITERSecond chunk of text
which may as well continue on next lineDELIMITERFinal chunk

The desired output is a list (extracted=()) containing:

  1. Some text here, continuing on next line
  2. Second chunk of text which may as well continue on next line
  3. Final chunk

As the sample shows, "DELIMITER" is the delimiter to split on.

I have tried many examples, including awk, substitution, and so on.

6 answers:

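Answer 0 (score: 2)

If you do not want to change the default RS value, you can try the following. Note that it turns every DELIMITER into a newline, so in the output a chunk that itself spans several lines can no longer be told apart from its neighbours.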

awk '{gsub("DELIMITER",ORS)} 1' Input_file
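The same substitution idea can be kept inside bash when the result has to end up in an array. This is only a sketch under assumptions: the text is already in a variable (here called text), and the ASCII unit separator (0x1f) never occurs in it. Every DELIMITER is replaced by that single character, and read then splits on it.

```shell
#!/usr/bin/env bash
text='Some text here,
continuing on next lineDELIMITERSecond chunk of text
which may as well continue on next lineDELIMITERFinal chunk'

sep=$'\x1f'   # ASCII unit separator; assumed not to occur in the text

# Replace every DELIMITER with the single separator character, then let
# read split on it. The NUL written by printf marks the end of input
# for read -d ''.
IFS=$sep read -r -d '' -a extracted < <(printf '%s\0' "${text//DELIMITER/$sep}")

declare -p extracted
```

Newlines inside a chunk are preserved with this approach, because IFS contains only the separator character.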

Answer 1 (score: 1)

You can try using an array.

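#!/bin/bash
str="continuing on next lineDELIMITERSecond chunk of text
which may as well continue on next lineDELIMITERFinal chunk"

delimiter=DELIMITER
s=$str$delimiter   # append a trailing delimiter so the last chunk is handled like the others

array=()
while [[ $s ]]; do
  array+=( "${s%%"$delimiter"*}" )   # everything before the first delimiter
  s=${s#*"$delimiter"}               # drop that chunk together with its delimiter
done
declare -p array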

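This splits the text on the delimiter and stores the resulting chunks in an array.

array=([0]="continuing on next line" [1]=$'Second chunk of text\nwhich may as well continue on next line' [2]="Final chunk")

You can access each element by its array index, or print all of them with printf '%s\n' "${array[@]}"

The result will be

continuing on next line
Second chunk of text
which may as well continue on next line
Final chunk

This solution also lets you handle large amounts of text.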
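The question's desired output joins wrapped lines with a space, which the loop above does not do. A minimal sketch of one way to get there, assuming a non-empty delimiter; the function name split_on_delimiter and the global array extracted are hypothetical names chosen for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: split "$1" on the multi-character delimiter "$2"
# and store the pieces in the global array `extracted`.
split_on_delimiter() {
  local s=$1 delimiter=$2 chunk nl=$'\n'
  [[ -n $delimiter ]] || return 1    # an empty delimiter would loop forever
  extracted=()
  s+=$delimiter                      # trailing delimiter so the last chunk is captured
  while [[ -n $s ]]; do
    chunk=${s%%"$delimiter"*}        # text before the first delimiter
    extracted+=( "${chunk//"$nl"/ }" )   # join wrapped lines with a space
    s=${s#*"$delimiter"}             # remove the chunk and its delimiter
  done
}

text='Some text here,
continuing on next lineDELIMITERSecond chunk of text
which may as well continue on next lineDELIMITERFinal chunk'

split_on_delimiter "$text" DELIMITER
printf '%s\n' "${extracted[@]}"
```

Because each newline is replaced by a space, every element prints on a single line, matching the three-item list in the question.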

Answer 2 (score: 0)

You can try the following (a multi-character RS needs an awk that supports it, such as GNU awk):

awk 'BEGIN {RS="DELIMITER";} {print}' input_file

Then assign the result to a variable, and so on...

Answer 3 (score: 0)

With AWK, try the following:

awk -v RS='^$' -v FS='DELIMITER' '{
    n = split($0, extracted)
    for (i=1; i<=n; i++) {
        print i". "extracted[i]
    }
}' sample.txt

which produces:

1. Some text here,
continuing on next line
2. Second chunk of text
which may as well continue on next line
3. Final chunk

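If you need to transfer the awk array into a bash array, further steps are needed, depending on what you want to do with the array afterwards.

Answer 4 (score: 0)

I think the biggest challenge in this question is handling whitespace, newlines and DELIMITER correctly and then putting everything into an array. If it were only about splitting a file, it would be too easy. How about this template: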
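One possible way to do that transfer is sketched below. It assumes the flattened form from the question is wanted (newlines inside a chunk become spaces), so each chunk fits on one output line and a plain read loop can fill the bash array. The temporary file is only a stand-in for sample.txt, and the awk program avoids a multi-character RS so it does not depend on a particular awk implementation.

```shell
#!/usr/bin/env bash
sample=$(mktemp)                 # stand-in for sample.txt
trap 'rm -f "$sample"' EXIT
printf '%s\n' 'Some text here,' \
  'continuing on next lineDELIMITERSecond chunk of text' \
  'which may as well continue on next lineDELIMITERFinal chunk' > "$sample"

extracted=()
while IFS= read -r line; do
  extracted+=( "$line" )
done < <(
  awk '
    { buf = (NR > 1 ? buf "\n" : "") $0 }   # rebuild the whole file in buf
    END {
      n = split(buf, parts, "DELIMITER")
      for (i = 1; i <= n; i++) {
        gsub(/\n/, " ", parts[i])           # one chunk per output line
        print parts[i]
      }
    }' "$sample"
)

printf '%s\n' "${extracted[@]}"
```

If the embedded newlines must survive, a newline-delimited hand-off like this is not enough and a different record terminator would be needed between awk and bash.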

#!/bin/bash
gencode(){
  echo -e "extracted=(); read -r -d '' item <<-DELIMITER"
  sed 's:DELIMITER:\n&\nextracted+=("$item"); read -r -d "" item <<-&\n:' Input_file;
  echo -e "DELIMITER\n"'extracted+=("$item")'
}
gencode|cat -n                                 # for explanation purposes only
eval "`gencode`"                               # do not remove "eval"
for (( i=0; i < ${#extracted[@]}; i++ )); do   # print results
  echo "$i: ${extracted[i]}"
done

Output

     1  extracted=(); read -r -d '' item <<-DELIMITER
     2  Some text here,
     3  continuing on next line
     4  DELIMITER
     5  extracted+=("$item"); read -r -d "" item <<-DELIMITER
     6  Second chunk of text
     7  which may as well continue on next line
     8  DELIMITER
     9  extracted+=("$item"); read -r -d "" item <<-DELIMITER
    10  Final chunk
    11  DELIMITER
    12  extracted+=("$item")
0: Some text here,
continuing on next line
1: Second chunk of text
which may as well continue on next line
2: Final chunk

Answer 5 (score: 0)

You can try Perl. With the -0777 option, perl slurps the whole file into the $_ variable. You can then split the contents on DELIMITER. Check this out.

$ perl -0777 -ne '@x=split("DELIMITER");print join("\n\n",@x) ' hubbs.txt
Some text here,
continuing on next line

Second chunk of text
which may as well continue on next line

Final chunk

$

To add the array position when printing:

$ perl -0777 -ne '@x=split("DELIMITER"); for(@x) { print ++$i,". $_\n"  } ' hubbs.txt
1. Some text here,
continuing on next line
2. Second chunk of text
which may as well continue on next line
3. Final chunk


$