Using awk to combine lines and write a CSV file

Time: 2015-01-24 02:28:54

Tags: bash awk

If my file looks like

Title: Title1
Author: Author1
Body: Body1.1
      Body1.2
      Body1.3

Title: Title2
Author: Author2
Body: Body2.1
      Body2.2
      Body2.3

and so on.

I want to output

"Title1", "Author1", 
"Body1.1
 Body1.2
 Body1.3"

"Title2", "Author2",
"Body2.1
 Body2.2
 Body2.3" 

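as a CSV file. How should I do this?

Also note that an author's name may sometimes contain ',', so we want to make sure every field is written as a quoted string.

I am currently trying to get awk to do this with a while loop, but I am sure there must be a simpler way.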

3 answers:

Answer 0: (score: 0)

You can use this awk command:

awk -F' *: *' '$1=="Title"{t=$2;if (b) print b;b="";next}
  $1=="Author"{printf "\"%s\", \"%s\"\n", t, $2;next}
  NF==1||$1=="Body"{sub(/^ +/, "", $1); b=(!b)? $2: b ORS $1;next}
  END{print b}' file
"Title1", "Author1"
Body1.1
Body1.2
Body1.3
"Title2", "Author2"
Body2.1
Body2.2
Body2.3

Answer 1: (score: 0)

This produces the output you want. Hopefully the comments make it clear what is going on.

$ cat script.awk    
BEGIN { FS="[:[:space:]]+" } # set field separator to one or more colons or space chars
/Title/ { t=$2 } # save title
/Author/{ printf "\"%s\", \"%s\",\n", t, $2 } # print title and author
/Body:/{ f=1; printf "\"%s", $2; next } # set f to true and print 1st body
!NF{ f=0; print "\"\n" } # empty line, set f to false
f{ printf "\n %s", $2 } # print body
END{ print "\"" } # print final quote
$ awk -f script.awk file
"Title1", "Author1",
"Body1.1
 Body1.2
 Body1.3"

"Title2", "Author2",
"Body2.1
 Body2.2
 Body2.3"
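
The script above splits on whitespace, so a title or author made of several words would be cut to its first word. A minimal sketch of a line-by-line variant that strips the label with sub() instead, assuming every record has exactly the Title, Author, Body order shown in the question. The file names books.txt and books.csv, the sample values, and the helper end_body are hypothetical, and embedded double quotes are not escaped here.

```shell
# Hypothetical sample input with a multi-word title and a comma in the author
cat > books.txt <<'EOF'
Title: A Long Title
Author: Smith, John
Body: First line
      second line

Title: Title2
Author: Author2
Body: Body2.1
EOF

awk '
function end_body() {                  # close the open body field, if any
    if (inbody) { printf "\"\n\n"; inbody = 0 }
}
/^Title:/  { end_body(); sub(/^Title:[ \t]*/, ""); title = $0; next }   # save title
/^Author:/ { sub(/^Author:[ \t]*/, ""); printf "\"%s\", \"%s\",\n", title, $0; next }
/^Body:/   { sub(/^Body:[ \t]*/, ""); printf "\"%s", $0; inbody = 1; next }
inbody && NF { sub(/^[ \t]+/, ""); printf "\n %s", $0 }   # body continuation line
END { end_body() }                     # close the last body
' books.txt > books.csv

cat books.csv
```

Because the closing quote is written when the next Title line or the end of input is reached, a trailing blank line in the input should not produce a stray quote.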

Answer 2: (score: 0)

Depending on the input data, this gnu awk (gnu because of the RS) may work:

awk -vRS= '{print "\""$2"\", \""$4"\",\n\""$6"\n "$7"\n "$8"\"\n"}' t
"Title1", "Author1",
"Body1.1
 Body1.2
 Body1.3"

"Title2", "Author2",
"Body2.1
 Body2.2
 Body2.3"

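Setting the record separator (RS) to nothing makes awk treat each block of data as a single record; after that we only need to pick out the fields we want by number.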
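
The one-liner relies on each block having exactly three single-word body lines. A minimal sketch of the same paragraph-mode idea with FS set to a newline, so that each input line becomes one field and the body can have any number of lines. It assumes the Title, Author, Body order from the question. The file names sample.txt and out.csv, the sample values, and the helper q are hypothetical.

```shell
# Hypothetical sample input: a comma in an author name and quotes in a title
cat > sample.txt <<'EOF'
Title: Title1
Author: Doe, Jane
Body: Body1.1
      Body1.2
      Body1.3

Title: Say "Hi"
Author: Author2
Body: Body2.1
      Body2.2
EOF

awk '
BEGIN { RS = ""; FS = "\n" }           # one record per block, one field per line
function q(s) {                        # double embedded quotes, then wrap in quotes
    gsub(/"/, "\"\"", s)
    return "\"" s "\""
}
{
    title  = $1; sub(/^Title: */, "", title)
    author = $2; sub(/^Author: */, "", author)
    body   = $3; sub(/^Body: */, "", body)
    for (i = 4; i <= NF; i++) {        # remaining lines continue the body
        line = $i
        sub(/^[ \t]+/, "", line)       # drop the alignment indent
        body = body "\n " line
    }
    printf "%s, %s,\n%s\n\n", q(title), q(author), q(body)
}' sample.txt > out.csv

cat out.csv
```

The layout follows the output requested in the question, including the space after each comma; a strict CSV parser may not accept that space before an opening quote, so dropping it from the printf format may be safer if the file is going to be imported elsewhere.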