Question

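I have a pipe-delimited text file in which one of the fields itself contains pipes. I already have a sed script that converts it to tab-delimited, but the problem is that it is terribly slow. It replaces the first occurrence of a pipe 8 times, then replaces the last occurrence of a pipe 4 times. I'm hoping there is a faster way to do what I need.

Any ideas would be appreciated. Here is my current sed script: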

sed 's/|\(.*\)/\t\1/;s/|\(.*\)/\t\1/;s/|\(.*\)/\t\1/;s/|\(.*\)/\t\1/;s/|\(.*\)/\t\1/;s/|\(.*\)/\t\1/;s/|\(.*\)/\t\1/;s/|\(.*\)/\t\1/;s/|\(.*\)/\t\1/;s/\(.*\)|/\1\t/;s/\(.*\)|/\1\t/;s/\(.*\)|/\1\t/;s/\(.*\)|/\1\t/' $1 > $1.tab

Thanks,

-Dan

Answer 1

 sed 's/\([^|]\+\)|\([^|]\+\)|\([^|]\+\)|\([^|]\+\)|\([^|]\+\)|\([^|]\+\)|\([^|]\+\)|\([^|]\+\)|/\1\t\2\t\3\t\4\t\5\t\6\t\7\t\8\t/;s/|\([^|]\+\)|\([^|]\+\)|\([^|]\+\)|\([^|]\+\)$/\t\1\t\2\t\3\t\4/'

HTH

Answer 2

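This is somewhat scalable, but it's still an eyeful. You can change the "8" and the "4" to choose the range of pipes to replace, or change the pipes or tabs to other characters.

As a one-liner: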

sed 's/|/\n/8; h; s/.*\n//; x; s/\n.*/\t/; s/|/\t/g; G; s/\n//; s/\(\(|[^|]*\)\{4\}\)$/\n\1/; h; s/.*\n//; s/|/\t/g; x; s/\n.*//; G; s/\n//'

Here it is broken out. I over-commented it so that it's easy to follow.

sed '
s/|/\n/8     # split
h            # dup
s/.*\n//
# this is now the field which will retain the pipes 
# plus the fields at the end of the record
x            # swap
s/\n.*/\t/   # replace
s/|/\t/g
# this is now all the tab-delimited fields at the beginning of the record
G            # append
s/\n//
# this is now the full record with the first part completed
# the rest of the steps are similar to the steps above
s/\(\(|[^|]*\)\{4\}\)$/\n\1/    # split
h            # dup
s/.*\n//
s/|/\t/g     #replace
# this is now the last four fields that have been tab delimited
x            # swap
s/\n.*//
# this is the first eight fields plus the field with the retained pipes
G            # append
s/\n//
# now print the full record with everything done
'
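
Since the "8" and the "4" are the only parts that change, the one-liner can be wrapped in a small shell function that takes them as arguments. This is only a sketch: the function name pipes_to_tabs is made up for illustration, it assumes GNU sed (for \n and \t), and it assumes every line has at least as many pipes as the two counts added together.

```shell
#!/bin/sh
# Hypothetical wrapper around the one-liner above (not part of the answer).
# Usage: pipes_to_tabs HEAD TAIL < input > output
# Converts the first HEAD and the last TAIL pipes on each line to tabs and
# leaves any pipes in between untouched.
# Assumptions: GNU sed; HEAD and TAIL are positive integers; each line
# contains at least HEAD + TAIL pipes (shorter lines are not handled).
pipes_to_tabs() {
    n_head=$1
    n_tail=$2
    # Same commands as the one-liner, with the two counts substituted in.
    # Inside double quotes the shell keeps \n, \t, \( and \1 as they are;
    # only the end-of-line anchor has to be written as \$.
    sed "s/|/\n/$n_head; h; s/.*\n//; x; s/\n.*/\t/; s/|/\t/g; G; s/\n//; s/\(\(|[^|]*\)\{$n_tail\}\)\$/\n\1/; h; s/.*\n//; s/|/\t/g; x; s/\n.*//; G; s/\n//"
}

# Example record: field 9 is "x|y|z" and must keep its pipes.
printf '%s\n' 'f1|f2|f3|f4|f5|f6|f7|f8|x|y|z|l1|l2|l3|l4' | pipes_to_tabs 8 4
```

For the file in the question this would be called as pipes_to_tabs 8 4 < "$1" > "$1.tab".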

Answer 3

I was working with Dan when he needed this, and realized (like ghostdog74) that AWK is a better tool for the job. Here is my possibly inefficient answer.

awk -F"|" 'BEGIN{OFS="\t"}{for (i=10; i < NF-3; i++) $9=$9 "|" $i; print $1,$2,$3,$4,$5,$6,$7,$8,$9,$(NF-3),$(NF-2),$(NF-1),$(NF)}' $file > $file.tab
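
To see what the loop does, here is the same awk program spread over several lines and run on a made-up record. The sample line and the function name merge_field9 are only for illustration; the field counts (8 leading fields, 4 trailing fields) are the ones from the question.

```shell
#!/bin/sh
# Sample record (invented): 13 logical fields, where field 9 is "x|y|z".
# Splitting on "|" therefore yields 15 awk fields.
line='f1|f2|f3|f4|f5|f6|f7|f8|x|y|z|l1|l2|l3|l4'

# Hypothetical helper: reads pipe-delimited records on stdin and writes
# tab-delimited records on stdout.
merge_field9() {
    awk -F'|' 'BEGIN { OFS = "\t" }
    {
        # Fields 10 .. NF-4 are really pieces of field 9, so glue them
        # back on with the pipes that the field splitting removed.
        for (i = 10; i < NF - 3; i++) $9 = $9 "|" $i
        # Print the 8 leading fields, the rebuilt field 9, and the last 4.
        print $1, $2, $3, $4, $5, $6, $7, $8, $9, $(NF-3), $(NF-2), $(NF-1), $NF
    }'
}

printf '%s\n' "$line" | merge_field9
```

On this record the loop runs for i = 10 and i = 11, so field 9 becomes x|y|z before the line is printed with tabs between the 13 fields.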

What do you all think?

Answer 4

Dennis is right: you should use a quantifier to specify which occurrences of the pattern you want the operation applied to.
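
As a quick illustration of what a number on the s command does (a sketch with a made-up input line, and ":" as the replacement so the result is easy to read):

```shell
#!/bin/sh
# A number after the final slash selects which match on the line is replaced.
printf '%s\n' 'a|b|c|d' | sed 's/|/:/2'     # prints a|b:c|d

# Assumption: GNU sed. There, a number combined with g means
# "replace that match and every later one on the line".
printf '%s\n' 'a|b|c|d' | sed 's/|/:/2g'    # prints a|b:c:d
```

This is the same mechanism that the s/|/\n/8 step in Dennis's answer relies on.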

Please look under "Basic Substitution" at the link below, since it is more readable on the site than here: http://www.readylines.com/sed-one-liners-examples

Hope that helps.

Using sed to replace the first 8 and last 4 pipes on every line in a file
