Suppose my file looks like this:
Title: Title1
Author: Author1
Body: Body1.1
Body1.2
Body1.3
Title: Title2
Author: Author2
Body: Body2.1
Body2.2
Body2.3
and so on.
I want to output
"Title1", "Author1",
"Body1.1
Body1.2
Body1.3"
"Title2", "Author2",
"Body2.1
Body2.2
Body2.3"
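as a CSV file. How can I do this?
Also note that an author's name may sometimes contain ',', so we want to make sure every value is quoted as a string.
I am currently trying to get awk to do this for me with a while loop, but I am sure there must be a simpler way.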
Answer 0: (score: 0)
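You can use this awk command:
awk -F' *: *' '$1=="Title"{if (b!="") printf "\"%s\"\n", b; b=""; t=$2; next}  # new record: flush the previous body
$1=="Author"{printf "\"%s\", \"%s\",\n", t, $2; next}                          # print title and author
$1=="Body"{b=$2; next}                                                         # first body line
NF{sub(/^ +/, ""); b=b ORS $0}                                                 # continuation line of the body
END{if (b!="") printf "\"%s\"\n", b}' file
"Title1", "Author1",
"Body1.1
Body1.2
Body1.3"
"Title2", "Author2",
"Body2.1
Body2.2
Body2.3"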
Answer 1: (score: 0)
This produces the output you want. Hopefully the comments make it clear what is going on.
$ cat script.awk
BEGIN      { FS="[:[:space:]]+" }                          # split fields on runs of colons and/or whitespace
/^Title:/  { if (f) print "\""; f=0; t=$2; next }          # close any open body quote, then save the title
/^Author:/ { printf "\"%s\", \"%s\",\n", t, $2; next }     # print title and author
/^Body:/   { f=1; printf "\"%s", $2; next }                # set f to true and print the first body line
!NF        { if (f) print "\""; f=0; next }                # empty line: close the quote, set f to false
f          { sub(/^[[:space:]]+/, ""); printf "\n%s", $0 } # print a continuation body line
END        { if (f) print "\"" }                           # print the final quote
$ awk -f script.awk file
"Title1", "Author1",
"Body1.1
Body1.2
Body1.3"
"Title2", "Author2",
"Body2.1
Body2.2
Body2.3"
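The commands above read each value from `$2`, so a value that contains the field separator (a colon, or whitespace when the separator includes it) is cut short. A double quote inside a value is also printed unescaped. The following is only a sketch of one way around both, assuming the common CSV convention of doubling embedded quotes and keeping the `", "` layout requested in the question. The names `to_csv`, `csv_quote`, `value` and `emit` are made up for illustration.

```shell
# to_csv is a hypothetical wrapper name; it reads Title/Author/Body text from files or stdin.
to_csv() {
  awk '
    # csv_quote (illustrative helper): double any embedded quotes, then wrap the value in quotes
    function csv_quote(s) { gsub(/"/, "\"\"", s); return "\"" s "\"" }
    # value: drop the leading "Key:" label, keeping any later colons or spaces
    function value(line) { sub(/^[^:]*:[ \t]*/, "", line); return line }
    # emit: print the record collected so far, if there is one
    function emit() {
      if (have) print csv_quote(t) ", " csv_quote(a) ",\n" csv_quote(b)
      have = 0
    }
    /^Title:/    { emit(); t = value($0); a = ""; b = ""; have = 1; inbody = 0; next }
    /^Author:/   { a = value($0); next }
    /^Body:/     { b = value($0); inbody = 1; next }
    inbody && NF { sub(/^[ \t]+/, ""); b = b "\n" $0 }   # continuation line of the body
    END          { emit() }
  ' "$@"
}
```

It could be called as `to_csv file > out.csv`. Because each value is taken from the whole line rather than from `$2`, an author such as `Doe, Jane` should come through intact.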
Answer 2: (score: 0)
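Depending on the input data, this gnu awk command (it relies on an empty RS, i.e. paragraph mode, so the records must be separated by blank lines) may work: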
awk -v RS= '{print "\""$2"\", \""$4"\",\n\""$6"\n"$7"\n"$8"\""}' t
"Title1", "Author1",
"Body1.1
Body1.2
Body1.3"
"Title2", "Author2",
"Body2.1
Body2.2
Body2.3"
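Setting the record separator (RS) to the empty string makes awk treat each blank-line-separated block of data as a single record; after that we just print the fields we need.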
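The same paragraph-mode idea can be sketched without hard-coding the number of body lines. This is a minimal sketch, assuming blank-line-separated records whose first three lines are always Title, Author and Body in that order. The function name `para_to_csv` is hypothetical, and embedded double quotes are not escaped here.

```shell
# para_to_csv is a hypothetical name; it expects blank-line-separated records.
para_to_csv() {
  awk '
    BEGIN { RS = ""; FS = "\n" }             # paragraph mode: one record per block, one field per line
    {
      title  = $1; sub(/^Title:[ \t]*/,  "", title)
      author = $2; sub(/^Author:[ \t]*/, "", author)
      body   = $3; sub(/^Body:[ \t]*/,   "", body)
      for (i = 4; i <= NF; i++) {            # every remaining line belongs to the body
        line = $i; sub(/^[ \t]+/, "", line)
        body = body "\n" line
      }
      printf "\"%s\", \"%s\",\n\"%s\"\n", title, author, body
    }
  ' "$@"
}
```

With `FS` set to a newline, each line of a block becomes one field, so values may contain spaces, and a body of any length is collected by the loop rather than by naming `$6`, `$7` and `$8` explicitly.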