My data looks like this:
> sq1
foofoofoobar
foofoofoo
> sq2
quxquxquxbar
quxquxquxbar
quxx
> sq3
foofoofoobar
foofoofoo
> sq4
foofoofoobar
foofoo
I want to join the lines, using each "> sqi" header as the cut-off line, i.e. yielding:
foofoofoobarfoofoofoo
quxquxquxbarquxquxquxbarquxx
foofoofoobarfoofoofoo
foofoofoobarfoofoo
I tried this sed command, but it failed:
sed '/^S/d;N;s/\n/\t/'
What is the right way to do this?
Answer 0 (score: 3)
#!/bin/sed -f
# If this is a header line, empty it...
s/^>.*//
# ... and then jump to the 'end' label.
t end
# Otherwise, append this data line to the hold space.
H
# If this is not the last line, continue to the next line.
$!d
# Otherwise, this is the end of the file or the start of a header.
: end
# Call up the data lines we last saw (putting the empty line in the hold).
x
# If we haven't seen any data lines recently, continue to the next line.
/^$/d
# Otherwise, strip the newlines and print.
s/\n//g
# The one-line version:
# sed -e 's/^>.*//;te' -e 'H;$!d;:e' -e 'x;/^$/d;s/\n//g'
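As a quick check, the one-line version can be run against the sample data from the question. This is a minimal sketch that assumes GNU sed; the temporary file and the variable names `input` and `joined` are only illustrative and not part of the answer.

```shell
# Write the sample data from the question to a temporary file (illustrative name).
input=$(mktemp)
cat > "$input" <<'EOF'
> sq1
foofoofoobar
foofoofoo
> sq2
quxquxquxbar
quxquxquxbar
quxx
> sq3
foofoofoobar
foofoofoo
> sq4
foofoofoobar
foofoo
EOF

# Run the one-line version of the script and capture what it prints.
joined=$(sed -e 's/^>.*//;te' -e 'H;$!d;:e' -e 'x;/^$/d;s/\n//g' "$input")
printf '%s\n' "$joined"
rm -f "$input"
```

With GNU sed this should print the four joined lines the question asks for, one per header.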
Answer 1 (score: 1)
You are testing for a capital "S" at the start of the line. You should test for the greater-than character instead:
sed '/^>/d;N;s/\n/\t/'
or
sed '/^> sq/d;N;s/\n/\t/'
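Note that N only pairs each data line with the one after it, so these commands give the intended result only when every header is followed by exactly two lines. The sketch below assumes GNU sed (for `\t` in the replacement) and uses the illustrative variable names `sample` and `paired` to show what happens with the three-line sq2 record:

```shell
# Sample with a three-line record (sq2) followed by another header.
sample=$(printf '%s\n' '> sq2' quxquxquxbar quxquxquxbar quxx '> sq3' foofoofoobar foofoofoo)

# Delete headers, then join each remaining line with the line after it.
paired=$(printf '%s\n' "$sample" | sed '/^>/d;N;s/\n/\t/')
printf '%s\n' "$paired"
```

The third line of the record ends up joined to the following header (quxx, a tab, then > sq3), because that header was pulled in by N and never tested against /^>/.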
Edit: I missed the fact that there is a variable number of lines between headers. This is what I have so far:
sed -n '/^>/{x; p; d}; /^>/!H; x; s/\n/\t/; h; $p'
Unfortunately, this leaves the headers in:
> sq1 foofoofoobar foofoofoo
> sq2 quxquxquxbar quxquxquxbar quxx
> sq3 foofoofoobar foofoofoo
> sq4 foofoofoobar foofoo
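If you run this from a Bash prompt, you may have to run set +H first so that history expansion does not interfere because of the exclamation mark.
Edit 2: My revised version gets rid of the headers: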
sed -n '1{x;d};/^>/{x; p; d}; H; x; s/\n/\t/; s/^>.*\t//; h; $p'
Answer 2 (score: 1)
A bash solution to the original question (i.e. without the "headers"):
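#!/bin/bash
# Read every line of the input file ($1) into an array.
text=()
exec <"$1"
while IFS= read -r line
do
    text+=("$line")
done
j=0
len=0
# A line shorter than the previous one is treated as the end of a record.
while [ "$j" -lt "${#text[@]}" ]
do
    string=${text[$j]}
    if [ "$len" -le "${#string}" ] ; then
        # Not shorter than the previous line: keep joining.
        printf '%s' "$string"
    else
        # Shorter than the previous line: finish the output line.
        printf '%s\n' "$string"
    fi
    len=${#string}
    let "j += 1"
done
printf '\n'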
Answer 3 (score: 1)
I can't find a simple way to do this in sed. Anyway, with gawk/mawk you only need to change the RS variable and strip the newlines:
awk -v RS='> sq[0-9]' 'NR>1{gsub(/\n/,"");print}' file
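Treating RS as a regular expression is an extension (hence gawk/mawk), so a variant that keeps the default record separator may be useful with other awk implementations. The following is only a sketch: it assumes every header starts with ">" and wraps the awk program in a hypothetical shell function named `join_records`, which is not part of the answer above.

```shell
# join_records (hypothetical helper): print the data lines under each
# ">" header joined into one line. Reads the named files, or stdin.
join_records() {
    awk '
        /^>/ { if (buf != "") print buf; buf = ""; next }  # header: flush the previous record
             { buf = buf $0 }                              # data line: append without a newline
        END  { if (buf != "") print buf }                  # flush the last record
    ' "$@"
}

# Example with two records of different lengths.
printf '%s\n' '> sq1' foofoofoobar foofoofoo '> sq2' quxquxquxbar quxquxquxbar quxx | join_records
```

Because the header test is just /^>/, this version does not depend on how many digits follow "sq".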