I have a string that looks like this:
807001S:S6S11ABB23668732CC1DD1496851208.807262EE7482
I need output like this:
S:S6S11,07001,23668732,1,1496851208,807262,7482
I need to split the string into columns like this:
S:S6 + the next 3 characters;
in this case S:S6S11.
This works:
echo 807001S:S6S11ABB23668732CC1DD1496851208.807262EE7482 |
grep -P -o 'S:S6.{1,3}'
Output:
S:S6S11
This gets me close, extracting only the numbers:
echo 807001S:S6S11ABB23668732CC1DD1496851208.807262EE7482 |
grep -o '[0-9]\+' | tr '\n' ','
Output:
807001,6,11,23668732,1,1496851208,807262,7482,
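How can I get S:S6S11 at the start of the output and avoid the 6,11 after it?
If this can be done better with sed or awk, I don't mind.
For the rest of the string, I only need the numbers, but they must line up with the letters that separate them.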
Answer 0 (score: 3)
awk to the rescue!
$ echo "807001S:S6S11ABB23668732CC1DD1496851208.807262EE7482" |
awk '{pre=gensub(".*(S:S6...).*","\\1","g"); ## extract prefix (gensub is GNU awk only)
sub(/S:S6.../,""); ## drop the prefix so its digits are not split out
sub(/./,","); ## replace first char with comma
gsub(/[^0-9]+/,","); ## replace runs of non-digits with a comma
print pre $0}' ## print prefix and replaced line
S:S6S11,07001,23668732,1,1496851208,807262,7482
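gensub() is a GNU awk extension. As a rough sketch for other awks, the prefix can be located with the POSIX match() function and its RSTART/RLENGTH variables instead. The helper name extract_fields is made up for illustration. It cuts the prefix out of the line before replacing non-digits, so the 6 and 11 inside it do not appear as separate fields:

```shell
# Sketch using only POSIX awk features.
# extract_fields is a hypothetical helper name, not part of the original answer.
extract_fields() {
  printf '%s\n' "$1" | awk '{
    if (!match($0, /S:S6.../)) next        # skip lines without the prefix
    pre = substr($0, RSTART, RLENGTH)      # "S:S6" plus the next 3 characters
    # cut the prefix out of the line so its digits are not picked up later
    $0 = substr($0, 1, RSTART - 1) substr($0, RSTART + RLENGTH)
    sub(/./, ",")                          # replace first char with comma
    gsub(/[^0-9]+/, ",")                   # runs of non-digits become commas
    print pre $0                           # prefix followed by the numbers
  }'
}

extract_fields "807001S:S6S11ABB23668732CC1DD1496851208.807262EE7482"
```

For the sample line this should print S:S6S11,07001,23668732,1,1496851208,807262,7482.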
Answer 1 (score: 1)
...or sed:
$ echo "807001S:S6S11ABB23668732CC1DD1496851208.807262EE7482" | sed -re 's/^.([0-9]+)(S:S6...)ABB([0-9]+)CC([0-9]+)DD([0-9]+)\.([0-9]+)EE([0-9]*)$/\2,\1,\3,\4,\5,\6,\7/'
S:S6S11,07001,23668732,1,1496851208,807262,7482
That is, if your line format is fixed.
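If some lines might not follow the fixed format, a plain s/// passes them through unchanged. A possible variation, assuming you would rather drop such lines, is to combine -n with the p flag so that only lines where the substitution succeeded are printed. The helper name reformat_fixed is hypothetical, and -E is used here in place of -r:

```shell
# Sketch: print only lines that match the fixed layout.
# reformat_fixed is a hypothetical helper name used for illustration.
reformat_fixed() {
  printf '%s\n' "$1" |
    sed -nE 's/^.([0-9]+)(S:S6...)ABB([0-9]+)CC([0-9]+)DD([0-9]+)\.([0-9]+)EE([0-9]*)$/\2,\1,\3,\4,\5,\6,\7/p'
}

reformat_fixed "807001S:S6S11ABB23668732CC1DD1496851208.807262EE7482"
reformat_fixed "not a matching line"   # prints nothing
```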
Answer 2 (score: 0)
If you are using GNU awk, you can simplify the task by setting RS to the desired pattern, e.g.:
parse.awk
BEGIN { RS = "S:S6...|\n" }
# Start of the string
RT != "\n" {
sub(".", ",") # Replace first char by a comma
pst = $0 # Remember the rest of the string
pre = RT # Remember the S:S6 pattern
}
# End of string
RT == "\n" {
gsub("[A-Z.]+", ",") # Replace letters and dots by commas
print pre pst $0 # Print the final result
}
Run it like this, for example:
s=807001S:S6S11ABB23668732CC1DD1496851208.807262EE7482
gawk -f parse.awk <<<"$s"
Output:
S:S6S11,07001,23668732,1,1496851208,807262,7482
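To see why the script has two rules, it may help to look at how this RS splits the input. The sketch below (show_records is a made-up helper name) prints each record together with RT, the gawk-specific variable holding the text that actually matched RS:

```shell
# Sketch: show the records and terminators gawk sees with this RS.
# Requires GNU awk, because RT is a gawk extension.
# show_records is a hypothetical helper name used for illustration.
show_records() {
  printf '%s\n' "$1" |
    gawk 'BEGIN { RS = "S:S6...|\n" }
          { printf "record=[%s] RT=[%s]\n", $0, (RT == "\n" ? "\\n" : RT) }'
}

show_records "807001S:S6S11ABB23668732CC1DD1496851208.807262EE7482"
```

With the sample line, the first record should be 807001 with RT set to S:S6S11, and the second should be the remainder of the line with RT set to a newline. That is what the RT != "\n" and RT == "\n" rules in parse.awk distinguish.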
Answer 3 (score: 0)
Here is a way to do it with sed:
parse.sed
h # Duplicate string to hold space
s/.*(S:S6...).*/\1/ # Extract the desired pattern
x # Swap hold and pattern space
s/S:S6...// # Remove pattern (still in hold space)
s/[A-Z.]+/,/g # Replace letters and dots with commas
s/./,/ # Replace first char with comma
G # Append hold space content
s/([^\n]+)\n(.*)/\2\1/ # Rearrange to match desired output
Run it like this:
s=807001S:S6S11ABB23668732CC1DD1496851208.807262EE7482
sed -Ef parse.sed <<<"$s"
Output:
S:S6S11,07001,23668732,1,1496851208,807262,7482
Answer 4 (score: -1)
It sounds like this might be what you are really trying to do:
$ awk -F'[A-Z]{2,}|[.]' -v OFS=',' '{$1=substr($1,7) OFS substr($1,2,5)}1' file
S:S6S11,07001,23668732,1,1496851208,807262,7482
But your requirements for how to match, and what to match, are very unclear, and a single sample input line doesn't help much.
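The one-liner in this answer relies on the field separator matching runs of two or more capital letters, or a dot, so the single S characters inside S:S6S11 do not split the first field. A small sketch to inspect the resulting fields follows. The helper name show_fields is hypothetical, and the separator is written as [A-Z][A-Z]+, which is equivalent to [A-Z]{2,} but also works in awks without interval expressions:

```shell
# Sketch: list the fields awk produces with this field separator.
# show_fields is a hypothetical helper name used for illustration.
show_fields() {
  printf '%s\n' "$1" |
    awk -F'[A-Z][A-Z]+|[.]' '{ for (i = 1; i <= NF; i++) printf "$%d=%s\n", i, $i }'
}

show_fields "807001S:S6S11ABB23668732CC1DD1496851208.807262EE7482"
```

The first field should come out as 807001S:S6S11, which is why the answer then rebuilds $1 from substr($1,7) and substr($1,2,5).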