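I have a pipe-delimited text file in which one of the fields itself contains pipes. I already have a sed script that converts the file to tab-delimited, but it is very slow. It replaces the first occurrence of a pipe 8 times, then replaces the last occurrence of a pipe 4 times. I'm hoping there is a faster way to do what I need.
Any ideas would be appreciated. Here is my current sed script: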
sed 's/|\(.*\)/\t/;s/|\(.*\)/\t/;s/|\(.*\)/\t/;s/|\(.*\)/\t/;s/|\(.*\)/\t/;s/|\(.*\)/\t/;s/|\(.*\)/\t/;s/|\(.*\)/\t/;s/|\(.*\)/\t/;s/\(.*\)|/\t/;s/\(.*\)|/\t/;s/\(.*\)|/\t/;s/\(.*\)|/\t/' $1 > $1.tab
Thanks,
-Dan
Answer 0: (score: 2)
sed 's/\([^|]\+\)|\([^|]\+\)|\([^|]\+\)|\([^|]\+\)|\([^|]\+\)|\([^|]\+\)|\([^|]\+\)|\([^|]\+\)|/\1\t\2\t\3\t\4\t\5\t\6\t\7\t\8\t/;s/|\([^|]\+\)|\([^|]\+\)|\([^|]\+\)|\([^|]\+\)$/\t\1\t\2\t\3\t\4/'
HTH
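Answer 1: (score: 1)
This is somewhat more extensible, but it's still an eyesore. You can change the "8" and the "4" to choose which range of pipes gets replaced, or change the pipes or tabs to other characters.
As a one-liner: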
sed 's/|/\n/8; h; s/.*\n//; x; s/\n.*/\t/; s/|/\t/g; G; s/\n//; s/\(\(|[^|]*\)\{4\}\)$/\n\1/; h; s/.*\n//; s/|/\t/g; x; s/\n.*//; G; s/\n//'
Here it is broken out. I've over-commented it so it's easy to follow.
sed '
s/|/\n/8 # split
h # dup
s/.*\n//
# this is now the field which will retain the pipes
# plus the fields at the end of the record
x # swap
s/\n.*/\t/ # replace
s/|/\t/g
# this is now all the tab-delimited fields at the beginning of the record
G # append
s/\n//
# this is now the full record with the first part completed
# the rest of the steps are similar to the steps above
s/\(\(|[^|]*\)\{4\}\)$/\n\1/ # split
h # dup
s/.*\n//
s/|/\t/g # replace
# this is now the last four fields that have been tab delimited
x # swap
s/\n.*//
# this is the first eight fields plus the field with the retained pipes
G # append
s/\n//
# now print the full record with everything done
'
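To see what the script above does to a single record, here is a small sketch that wraps the one-liner in a shell function and feeds it one made-up line. The function name `pipes_to_tabs` and the sample record are assumptions for illustration only, and the script relies on GNU sed, since it uses `\n` and `\t` escapes.

```shell
# Hypothetical wrapper around the one-liner above (assumes GNU sed).
pipes_to_tabs() {
  sed 's/|/\n/8; h; s/.*\n//; x; s/\n.*/\t/; s/|/\t/g; G; s/\n//; s/\(\(|[^|]*\)\{4\}\)$/\n\1/; h; s/.*\n//; s/|/\t/g; x; s/\n.*//; G; s/\n//'
}

# Made-up record: 8 leading fields, a 9th field "x|y|z" that contains
# pipes, then 4 trailing fields.
printf '%s\n' 'a|b|c|d|e|f|g|h|x|y|z|w1|w2|w3|w4' | pipes_to_tabs
```

The first eight and last four pipes should come out as tabs, while the two pipes inside `x|y|z` are left alone.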
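Answer 2: (score: 1)
I was working with Dan when he needed this, and realized (like ghostdog74) that AWK is a better tool for the job. Here is my answer, which may well be inefficient.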
awk -F"|" 'BEGIN{OFS="\t"}{for (i=10; i < NF-3; i++) $9=$9 "|" $i; print $1,$2,$3,$4,$5,$6,$7,$8,$9,$(NF-3),$(NF-2),$(NF-1),$(NF)}' $file > $file.tab
What do you all think?
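One possible variation on the same idea, offered only as a sketch: instead of rebuilding field 9, walk the fields once and choose the separator for each position. The function name and the `head` and `tail` variables are hypothetical names, not part of the original answer; they stand for the 8 leading and 4 trailing separators.

```shell
# Hypothetical helper: "head" and "tail" are illustrative variable names
# for the number of leading and trailing pipes to convert.
pipes_to_tabs_awk() {
  awk -F'|' -v head=8 -v tail=4 '{
    out = $1
    for (i = 2; i <= NF; i++) {
      # The separator before field i is separator number i-1: use a tab
      # for the first "head" and last "tail" separators, a pipe otherwise.
      sep = (i <= head + 1 || i > NF - tail) ? "\t" : "|"
      out = out sep $i
    }
    print out
  }'
}

# Made-up record with pipes inside the 9th field.
printf '%s\n' 'a|b|c|d|e|f|g|h|x|y|z|w1|w2|w3|w4' | pipes_to_tabs_awk
```

Note that with this sketch a record that has 13 fields or fewer ends up with every pipe turned into a tab, which may or may not be what you want for malformed lines.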
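Answer 3: (score: 0)
Dennis is right: you should use a quantifier to specify how many occurrences of the pattern you want the operation to apply to.
Please have a look at the link below, under "Basic substitution", since it is more readable on the site than it would be here: http://www.readylines.com/sed-one-liners-examples
Hope that helps.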
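As a rough illustration of the two counting mechanisms used in the answers above, here is a minimal sketch on a made-up four-field line. The function names are hypothetical. A number after the final slash of `s///` picks the Nth match on the line, and the interval `\{N\}` repeats a group exactly N times.

```shell
# A literal tab, so the replacement does not depend on sed's \t escape.
tab=$(printf '\t')

# Numeric flag: act on the 2nd pipe of each line only.
nth_pipe_to_tab() {
  sed "s/|/${tab}/2"
}

# Interval \{2\} anchored with $: match exactly the last two "|field"
# groups and wrap them in brackets to make the match visible.
mark_last_two() {
  sed 's/\(|[^|]*\)\{2\}$/[&]/'
}

printf '%s\n' 'a|b|c|d' | nth_pipe_to_tab   # second pipe becomes a tab
printf '%s\n' 'a|b|c|d' | mark_last_two     # prints a|b[|c|d]
```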