How to merge two different lines using awk, sed, or a shell script

Time: 2018-10-06 07:40:40

Tags: linux shell awk sed

I have a file, file.txt, in which I need to merge two different lines into one.

file.txt

                  linux-
02-10-2018 11:50  is-a-opensource  user    file
02-10-2018 11:46  linux-userfile   user    file1
                                   user-1
02-10-2018 11:40  linux-userfile   user    file2
                  linux-           user-2
02-10-2018 11:30  linux-userfile   user    file3

Expected output

 02-10-2018 11:50  linux-is-a-opensource  user    file
 02-10-2018 11:46  linux-userfile         user    file1
 02-10-2018 11:40  linux-userfile         user1user    file2
 02-10-2018 11:30  linux-linux-userfile         user-2user    file3

Any suggestions would be greatly appreciated.

I tried the following command, but it did not work.

  $ awk ' /^ +/{ gsub(/^ +/," ");a=a $0; next }{ $2=$2a;a=""}1' file.txt 

This is the incorrect output I get:

  02-10-2018 11:50 linux- is-a-opensource user file
  02-10-2018 11:46 linux-userfile user file1
  02-10-2018 11:40 user-1 linux-userfile user file2
  02-10-2018 11:30 linux-           user-2 linux-userfile user file3

I also tried the following questions as references, but still got the same incorrect output:

How to Merge 2 diffrent lines in linux by using awk

How to merge two rows in a same row from a text file in linux shell script

1 Answer:

Answer 0: (score: 0)

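Since it is hard to determine which column a string belongs to, I make the following assumption:

  • The columns are perfectly aligned and separated by spaces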

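The following script therefore assumes:

  • A line that does not start with a date is merged into the next line
  • The column widths are determined by the column widths of that next line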
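To illustrate the second assumption, here is a minimal sketch that prints the start column of every field of one dated line from the question. The names `line`, `starts`, `p` and `s` are only illustrative. Each field is searched for after the end of the previous one, so a repeated substring (such as "user" inside "linux-userfile") does not give a wrong position.

```shell
# One dated line taken from the question's file.txt
line='02-10-2018 11:40  linux-userfile   user    file2'

starts=$(printf '%s\n' "$line" | awk '{
    p = 1                                      # column where the next search begins
    for (i = 1; i <= NF; ++i) {
        s = p - 1 + index(substr($0, p), $i)   # start column of field i
        printf "%s%d", (i > 1 ? " " : ""), s
        p = s + length($i)                     # continue after the end of field i
    }
    print ""
}')

echo "$starts"   # prints: 1 12 19 36 44
```

Text of a fragment line that sits between two consecutive start columns is then treated as belonging to the column that starts at the first of them.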

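Note: if your file is aligned with whitespace (a mix of tabs and spaces), we cannot use the field separator "\t" to tell the fields apart, because the number of tabs depends on the field width.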
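As a small sketch of why tabs are awkward here: a tab advances to the next tab stop, so the number of blanks it stands for depends on what precedes it. Assuming tab stops every 8 columns, the POSIX `expand` utility converts each tab into the matching number of spaces. The sample line is made up for illustration.

```shell
# A made-up line in which a tab follows a 3-character field
sample=$(printf 'abc\tdef')

# expand pads up to the next tab stop (every 8 columns here),
# so this tab becomes 5 spaces rather than 8
expanded=$(printf '%s\n' "$sample" | expand -t 8)

printf '[%s]\n' "$expanded"   # prints: [abc     def]
```

If the input really uses tab stops, running `expand -t 8 file.txt` and feeding the result to awk may keep the columns aligned where a fixed replacement of every tab by 8 spaces would shift them.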

Here is the tested script:

# If you have a tab-aligned file, replace all tabs by the
# correct sequence of spaces. In this example, we assume a single
# tab is equivalent to 8 spaces. Adapt if needed
{ gsub(/\t/,"        ",$0) }

# If the line does not start with a combination of numbers and hyphens
# it is a line that needs to be merged into the next line.
# store it and move to the next line
($1 !~ /^[-0-9]+$/) { tomerge=$0; next }

# If we picked up a tomerge line, try to figure out the fields
# by looking into the current line and determining the field widths.
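(tomerge != "")  {
      # extract fields
      n=1                          # start column of the current field
      p=index($0,$1)+length($1)    # first column after field 1
      for(i=1;i<NF;++i) {
         # look for field i+1 only after the end of field i, so that a value
         # such as "user" is not found inside "linux-userfile"
         m=p-1+index(substr($0,p),$(i+1))
         field[i]=substr(tomerge,n,m-n)
         sub(/^[[:blank:]]*/,"",field[i])  # remove leading blanks
         sub(/[[:blank:]]*$/,"",field[i])  # remove trailing blanks
         n=m
         p=m+length($(i+1))
      }
      field[NF]=substr(tomerge,n)
      sub(/[[:blank:]]*$/,"",field[NF])    # remove trailing blanks
      # perform merging
      for(i=1;i<=NF;++i) $i= field[i] $i
      # reset the tomerge value
      tomerge=""
}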
# print the line
{ $1=$1;print $0 }

Output:

$ awk -f script.awk file.txt
02-10-2018 11:50 linux-is-a-opensource user file
02-10-2018 11:46 linux-userfile user file1
02-10-2018 11:40 linux-userfile user-1user file2
02-10-2018 11:30 linux-linux-userfile user-2user file3

If you want the columns aligned, you can pipe the output through column -t:

$ awk -f script.awk file.txt | column -t
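On a system where column is not installed, a similar alignment can be approximated in awk itself. The following is only a sketch under that assumption: it stores all rows, records the widest value per column, and pads every field except the last to that width plus two blanks. The names `merged`, `aligned`, `cell`, `width` and `nf` are illustrative, and the two input rows are copied from the output above.

```shell
# Two rows of the merged output shown above
merged='02-10-2018 11:50 linux-is-a-opensource user file
02-10-2018 11:46 linux-userfile user file1'

aligned=$(printf '%s\n' "$merged" | awk '
{
    # remember every field and the widest value seen in each column
    for (i = 1; i <= NF; ++i) {
        cell[NR, i] = $i
        if (length($i) > width[i]) width[i] = length($i)
    }
    nf[NR] = NF
}
END {
    for (r = 1; r <= NR; ++r) {
        line = ""
        for (i = 1; i <= nf[r]; ++i) {
            if (i < nf[r]) {
                fmt = "%-" (width[i] + 2) "s"   # left-justify, two blanks of padding
                line = line sprintf(fmt, cell[r, i])
            } else {
                line = line cell[r, i]          # last field is not padded
            }
        }
        print line
    }
}')

printf '%s\n' "$aligned"
```

Because all rows are held in memory until the END block, this suits small files; for large inputs column -t or a two-pass approach is likely the better choice.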