Question

I am using a sed script to split a file into many XML files, using a keyword as the delimiter:

The script:

#!/bin/sh
File=/home/spark/PktLog
count=0
line=(`sed -n '/?xml version="1.0" encoding/=' $File`)
num=${#line[@]}
for n in ${line[*]}
do
   [ $count -eq 0 ] && startLine=$n && continue
   let count+=1
   let endLine=n-1
   if [ $count -eq $num ]; then
      startLine=$n
      sed -n "${startLine},$ p" $File >result_${count}.txt
   else
      sed -n "${startLine},${endLine} p;q" $File >result_${count}.txt
      startLine=$n
   fi
done

But it does not split the file into many files. I ran the shell script in debug mode:

spark@ubuntu:~$ sh -x split.sh 
+ File=/home/spark/PktLog
+ count=0
+ line=(`sed -n '/?xml version="1.0" encoding/=' $File`)
++ sed -n '/?xml version="1.0" encoding/=' /home/spark/PktLog
+ num=333
+ for n in '${line[*]}'
+ '[' 0 -eq 0 ']'
+ startLine=1
+ continue
+ for n in '${line[*]}'
+ '[' 0 -eq 0 ']'
+ startLine=137
+ continue
+ for n in '${line[*]}'
+ '[' 0 -eq 0 ']'
+ startLine=244
+ continue
+ for n in '${line[*]}'
+ '[' 0 -eq 0 ']'
+ startLine=415
+ continue
+ for n in '${line[*]}'
+ '[' 0 -eq 0 ']'
+ startLine=522
+ continue
+ for n in '${line[*]}'
+ '[' 0 -eq 0 ']'
+ startLine=674
+ continue
+ for n in '${line[*]}'
+ '[' 0 -eq 0 ']'
+ startLine=780
+ continue
+ for n in '${line[*]}'
+ '[' 0 -eq 0 ']'
+ startLine=932
+ continue
+ for n in '${line[*]}'
+ '[' 0 -eq 0 ']'
+ startLine=1038
+ continue
+ for n in '${line[*]}'
+ '[' 0 -eq 0 ']'

How can I fix this error? Thanks!

Answer 1

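Collecting the line numbers of something so that you can then loop over the file again and again is an antipattern. sed can do this with a little help, but it makes more sense to switch to a higher-level tool for this job.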

awk '/\?xml version="1.0" encoding/ {
    if (f) close(f);
    f = "result_" ++i }
  { print >f }' "$File"
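
As written, the awk command above assumes the first line of the file is an XML declaration. If anything precedes it, `f` is still empty when `print >f` runs and awk reports an error. A minimal sketch that skips such leading lines, wrapped in a hypothetical shell function `split_xml` (the function name and the `result_N.xml` file naming are assumptions for illustration, not part of the original answer):

```sh
#!/bin/sh
# split_xml FILE: write each XML document found in FILE to
# result_1.xml, result_2.xml, ... in the current directory.
# Hypothetical helper; lines before the first XML declaration are skipped.
split_xml() {
    awk '/\?xml version="1.0" encoding/ {
        if (f) close(f)              # close the previous output file before opening the next
        f = "result_" (++i) ".xml"   # start a new output file at every declaration
    }
    f { print > f }                  # only print once an output file has been chosen
    ' "$1"
}
```

It would be called as `split_xml /home/spark/PktLog`. The file is read once, however many documents it contains.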

Answer 2

#!/bin/bash
# bash is required: the script uses arrays, which plain POSIX sh does not provide
File=/home/spark/PktLog
count=0
startLine=(`sed -n -e '/?xml version="1.0" encoding/=' $File`)
fileEnd=`sed -n '$=' $File`
endLine=(`echo ${startLine[*]} | awk -v a=$fileEnd '{for(i=2;i<=NF;i++) printf("%d ",$i-1);print a}'`)

let maxIndex=${#startLine[@]}-1

for n in `seq 0 $maxIndex`

do
    sed -n "${startLine[$n]},${endLine[$n]}p" $File >result_${n}.xml
done

echo "${startLine[@]}"   # debug output: the start line of every XML document
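
The trace in the question shows why the original loop never writes a file: `count` is still 0 on every iteration because `continue` runs before `let count+=1`, so the loop only ever reassigns `startLine`. In addition, `p;q` makes sed quit after the first input line instead of at the end of the range. A minimal sketch of the original line-number approach with those two problems fixed, avoiding arrays so that it also runs under a plain POSIX sh (the function name `split_by_lines` is hypothetical):

```sh
#!/bin/sh
# split_by_lines FILE: write each XML document in FILE to result_1.txt, result_2.txt, ...
split_by_lines() {
    file=$1
    count=0
    start=
    # sed's "=" command prints the line number of every XML declaration
    for n in $(sed -n '/?xml version="1.0" encoding/=' "$file"); do
        if [ -n "$start" ]; then
            count=$((count + 1))
            end=$((n - 1))
            # print start..end, then quit at the last line of the range
            sed -n "${start},${end}p;${end}q" "$file" > "result_${count}.txt"
        fi
        start=$n
    done
    # the last document runs to the end of the file
    if [ -n "$start" ]; then
        count=$((count + 1))
        sed -n "${start},\$p" "$file" > "result_${count}.txt"
    fi
}
```

This still rereads the file once per document, so for a log with hundreds of documents the single-pass awk approach is likely to be faster.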

sed: a question about splitting a file with a sed script